Python script for an ETL job with PySpark. Both the source and destination databases are MySQL (the Spark job is embedded into Flask for the sake of deployment).

Reef-world/ETL-with-pyspark

This project implements an ETL job using PySpark.

The ETL job starts once a connection to the source database (MySQL) has been established.

The ETL has three stages:

1- Extract the data from the tables in the source database into DataFrames.

2- Apply transformation rules to these DataFrames and check that each rule is valid.

3- Connect to the destination database (also MySQL, with its structure already set up manually) and load the transformed data into it.
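
The three stages above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual script: the function names, the `id` key column, the deduplication rule, and the connection values are all hypothetical, and it assumes the MySQL Connector/J JAR is available on Spark's classpath.

```python
from typing import Dict


def jdbc_options(host: str, port: int, database: str, table: str,
                 user: str, password: str) -> Dict[str, str]:
    """Build the option map Spark's JDBC data source expects for MySQL."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class shipped with MySQL Connector/J 8.x
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def extract(spark, options: Dict[str, str]):
    """Stage 1: read one source table into a DataFrame."""
    return spark.read.format("jdbc").options(**options).load()


def transform(df, key_column: str = "id"):
    """Stage 2: apply a rule, then check that the rule holds.

    Hypothetical rule: keep one row per non-null key.
    """
    from pyspark.sql import functions as F  # imported lazily, needs pyspark

    transformed = (
        df.filter(F.col(key_column).isNotNull())
          .dropDuplicates([key_column])
    )
    # Validity check: every remaining key must be unique.
    total = transformed.count()
    distinct_keys = transformed.select(key_column).distinct().count()
    if total != distinct_keys:
        raise ValueError(f"duplicate values remain in column {key_column!r}")
    return transformed


def load(df, options: Dict[str, str]) -> None:
    """Stage 3: append the rows to an existing destination table."""
    df.write.format("jdbc").options(**options).mode("append").save()


def run_etl(source: Dict[str, str], destination: Dict[str, str],
            app_name: str = "etl-sketch") -> None:
    """Run extract, transform, and load in order with one Spark session."""
    from pyspark.sql import SparkSession  # imported lazily, needs pyspark

    spark = SparkSession.builder.appName(app_name).getOrCreate()
    try:
        load(transform(extract(spark, source)), destination)
    finally:
        spark.stop()
```

A call might look like `run_etl(jdbc_options("localhost", 3306, "source_db", "customers", "etl_user", "secret"), jdbc_options("localhost", 3306, "dest_db", "customers", "etl_user", "secret"))`, where every value is a placeholder. The `append` write mode is chosen here because the destination tables already exist; a different mode may suit other setups.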
