These are a few notes I made while solving some exercises. At the moment there are four of them: EDA, ETL, Pandas and Visualization. A brief description of each is given below.
I complemented each of them with material from several sources and saved them as ‘cheatsheets’ of sorts. Here’s what you will find in each folder:
- EDA - A basic EDA exercise which uses an old Kaggle car dataset. It explains and follows the ‘standard’ exploratory process.
- ETL - The notes focus on how to connect Python to MySQL using the pymysql module. I also briefly explain how I did the cleaning process. I used aegorenkov’s ‘drinks’ dataset.
- Pandas - This notebook contains the most used pandas commands. It started out following the BI lectures (from the 365 Careers course) and was later complemented with many other sources. It uses the absenteeism dataset.
- Visualization - Basic notes from a Matplotlib and Seaborn tutorial.
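
As a rough illustration of what a ‘standard’ exploratory pass usually covers, here is a minimal pandas sketch. The data frame and its column names (`make`, `year`, `price`) are made up for illustration and are not the Kaggle car dataset, and the steps are the common ones rather than necessarily the exact ones followed in the EDA notebook.

```python
import pandas as pd

# Hypothetical stand-in data; the EDA notebook itself uses a Kaggle car dataset.
cars = pd.DataFrame(
    {
        "make": ["Ford", "Toyota", "Ford", "BMW", "Toyota"],
        "year": [2010, 2012, 2010, 2015, None],
        "price": [5500.0, 8200.0, 5500.0, 21000.0, 9100.0],
    }
)

# 1. Structure: number of rows/columns and the type of each column.
n_rows, n_cols = cars.shape
dtypes = cars.dtypes

# 2. Data quality: missing values per column and fully duplicated rows.
missing = cars.isna().sum()
n_duplicates = int(cars.duplicated().sum())

# 3. Distributions: summary statistics for numeric columns, counts for categories.
summary = cars.describe()
make_counts = cars["make"].value_counts()

# 4. Relationships: mean price per make.
mean_price = cars.groupby("make")["price"].mean()

print(f"{n_rows} rows, {n_cols} columns")
print(dtypes)
print(missing)
print(f"duplicated rows: {n_duplicates}")
print(summary)
print(make_counts)
print(mean_price)
```

On a real dataset the same calls apply after loading it with `pd.read_csv`; plots would typically follow once the missing values and duplicates have been dealt with.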