Extract, Transform, Load (ETL) is the process of extracting data from source systems, transforming it, and loading it into a target data store. ETL is one of the required skills in data science, used for pre-processing and/or post-processing. This workshop is designed for anyone who wants to improve their ETL skills.
The workshop focuses on the following data sources:
- RDBMS databases
- NoSQL databases
We start with basic file and directory I/O, including copying and deleting files and directories. Next, we explore how to access various file types such as text, CSV, JSON, and XML. In addition, we access remote data sources over the web and through the server-based S3 protocol.
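As a rough illustration of the file topics above, here is a minimal sketch using only the Python standard library. The file names, sample data, and the helper names `read_csv_rows`, `read_json`, `read_xml_names`, and `demo` are assumptions made up for this example, not material from the workshop.

```python
import csv
import json
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_csv_rows(path):
    """Return the CSV rows as a list of dicts keyed by the header line."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def read_json(path):
    """Parse a JSON file into Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def read_xml_names(path):
    """Collect the `name` attribute of every <person> element."""
    root = ET.parse(path).getroot()
    return [person.get("name") for person in root.iter("person")]


def demo():
    # Work in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data"
        data.mkdir()

        # Hypothetical sample files in three formats.
        (data / "people.csv").write_text(
            "name,age\nAda,36\nLinus,28\n", encoding="utf-8")
        (data / "people.json").write_text(
            json.dumps({"count": 2}), encoding="utf-8")
        (data / "people.xml").write_text(
            '<people><person name="Ada"/><person name="Linus"/></people>',
            encoding="utf-8")

        # Copy a single file, then delete the copy.
        copy_path = data / "people_copy.csv"
        shutil.copy(data / "people.csv", copy_path)
        copy_path.unlink()

        # Copy a whole directory, then delete the copied tree.
        backup = Path(tmp) / "backup"
        shutil.copytree(data, backup)
        copied_files = sorted(p.name for p in backup.iterdir())
        shutil.rmtree(backup)

        return {
            "csv": read_csv_rows(data / "people.csv"),
            "json": read_json(data / "people.json"),
            "xml": read_xml_names(data / "people.xml"),
            "copied": copied_files,
            "backup_exists": backup.exists(),
        }


if __name__ == "__main__":
    print(demo())
```

Note that `csv.DictReader` returns every field as a string, so numeric columns such as `age` still need an explicit conversion in the transform step.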
We learn how to work with RDBMS databases in Python, using RDBMS engines such as SQLite, MySQL, SQL Server, and PostgreSQL. We perform CRUD (Create, Read, Update, Delete) operations. We also read database tables into Pandas DataFrames and write Pandas DataFrames back to database tables.
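The CRUD cycle can be sketched with SQLite, which ships with Python as the `sqlite3` module. The `products` table, its columns, and the function name `crud_demo` are hypothetical, chosen only for this illustration.

```python
import sqlite3


def crud_demo():
    # An in-memory database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
    )

    # Create: insert rows with parameter placeholders, not string formatting.
    conn.executemany(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        [("keyboard", 49.0), ("mouse", 19.5)],
    )

    # Read: fetch all rows in insertion order.
    before = conn.execute(
        "SELECT name, price FROM products ORDER BY id"
    ).fetchall()

    # Update: change the price of one product.
    conn.execute(
        "UPDATE products SET price = ? WHERE name = ?", (17.0, "mouse")
    )

    # Delete: remove one product.
    conn.execute("DELETE FROM products WHERE name = ?", ("keyboard",))
    conn.commit()

    after = conn.execute(
        "SELECT name, price FROM products ORDER BY id"
    ).fetchall()
    conn.close()
    return before, after


if __name__ == "__main__":
    print(crud_demo())
```

With Pandas installed, `pandas.read_sql_query(sql, conn)` and `DataFrame.to_sql(name, conn)` accept the same kind of `sqlite3` connection for the DataFrame round trip. Other engines such as MySQL or PostgreSQL need their own driver packages and connection settings.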
We can also apply ETL to NoSQL database engines. We will work with MongoDB, Redis, and Apache Cassandra, and perform CRUD (Create, Read, Update, Delete) operations on each of them. We also read NoSQL data into Pandas DataFrames and write Pandas DataFrames back to NoSQL databases.
Finally, we implement ETL programs in Python. Three case studies show how ETL works with Python.
This workshop requires basic Python programming knowledge to follow the hands-on labs. Internet access is needed to install additional Python libraries.
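To make the three ETL stages concrete, here is a minimal sketch of a pipeline that extracts rows from CSV text, cleans them, and loads them into SQLite. It is not one of the workshop's case studies. The sample data, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all assumptions for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: stray whitespace, mixed case, one missing amount.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, normalize names, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(amount))
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    load(transform(extract(text)), conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(RAW_CSV, conn)
    print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
    conn.close()
```

Keeping each stage in its own function means the source or the target can be swapped, for example a JSON file in or a different database out, without touching the other stages.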
Who this course is for:
- Students and professional developers
- Any developer who wants to learn Python and databases
- Any developer who wants to learn ETL with databases
Requirements:
- Basic knowledge of Python programming
- A computer with internet access
Last Updated 4/2021