Content

7 chapters ・ 39 steps ・ 15h 44m

5 steps・ 1h 21m

In this chapter, we look more closely at the definition of data engineering and what it involves in practice. You will learn the tasks and core competencies of data engineers, and we will introduce a few key concepts such as ETL.

This chapter focuses on the tools you can use with Python to manage large datasets. You will learn how to use Jupyter notebooks, NumPy, and pandas.

In this chapter, we dive deep into data extraction using various techniques. We will use Python and SQL in a variety of environments and use cases.

24m

Exploratory Data Analysis is an approach to analyzing datasets that involves summarizing their main characteristics, often using visual methods. It’s a critical first step in data analysis, allowing analysts to discover patterns and more

In this chapter, you will learn how to clean data before working with it. This usually means removing columns, records, and other data that are not needed.
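As a rough illustration of the kind of cleanup described here, the sketch below uses only the Python standard library. The sample records, the `internal_note` column, and the `clean` helper are hypothetical, and treating missing values and exact duplicates as "not needed" is an assumption made for the example, not something the chapter prescribes.

```python
# Hypothetical sample data: one unneeded column, one missing value, one duplicate.
records = [
    {"id": 1, "name": "Alice", "age": 34, "internal_note": "vip"},
    {"id": 2, "name": "Bob", "age": None, "internal_note": ""},
    {"id": 1, "name": "Alice", "age": 34, "internal_note": "vip"},
]

# Columns assumed to be irrelevant for the analysis.
UNNEEDED_COLUMNS = {"internal_note"}


def clean(rows, drop_columns):
    """Drop unneeded columns, rows with missing values, and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Keep only the columns we actually need.
        kept = {k: v for k, v in row.items() if k not in drop_columns}
        # Skip rows that still contain a missing value.
        if any(v is None for v in kept.values()):
            continue
        # Skip rows already seen (exact duplicates after column removal).
        key = tuple(sorted(kept.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(kept)
    return cleaned


cleaned_records = clean(records, UNNEEDED_COLUMNS)
print(cleaned_records)
```

In practice the same steps are usually done with a dataframe library, but the logic is the same: decide what is not needed, then remove it before analysis.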

Now that you have cleaned your data, it’s time to visualize it. This chapter is all about learning how to use a few Python libraries to visualize data and build a wide range of charts.

5 steps・ 5h 24m

A data pipeline automates the flow of data from sources to storage and analysis. It brings together everything we have seen in this path: extraction, transformation, and loading (ETL) processes that prepare and move data efficiently.
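To make the extraction, transformation, and loading stages concrete, here is a minimal sketch using only the Python standard library. The CSV content, the `orders` table, and the function names are all hypothetical choices for illustration; a real pipeline would read from actual sources and may use different tools.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: a small CSV with one incomplete row.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(records, conn):
    """Load: write the transformed records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Chain the three stages so the whole flow runs in one call."""
    load(transform(extract(text)), conn)


conn = sqlite3.connect(":memory:")  # in-memory database for the example
run_pipeline(RAW_CSV, conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
```

Keeping each stage as its own function makes it easier to test or replace one stage without touching the others.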

Test your Data Engineering skills with our AI Simulation

About this path

Embark on the Python Data Engineering skill path, tailored for software engineers aiming to master data management. This course guides you from the fundamentals of Python to advanced techniques in data manipulation and management. You'll learn to efficiently process and analyze large datasets using pandas and NumPy, and visualize your insights with Matplotlib. The curriculum covers all phases of data treatment, from initial data wrangling to the final presentation of your findings. It is ideal for software engineers seeking to enhance their data engineering skills and work effectively with data at scale.
