Duration 2 days – 14 hrs
Overview
This course introduces learners to the fundamental skills of working with datasets, cleaning and transforming data, and performing exploratory data analysis (EDA). Participants will gain hands-on experience in preparing raw data for AI and machine learning models, so that the models are trained on clean, consistent input.
Objectives
- Understand the structure and types of datasets.
- Apply techniques for data cleaning and transformation.
- Visualize data to uncover patterns and insights.
- Perform basic exploratory data analysis (EDA) to inform decision-making.
- Prepare datasets for machine learning model training.
Audience
- Learners with basic knowledge of Python programming (variables, data types, loops).
- Learners who have completed a basic Python course (recommended but not required).
Prerequisites
- Basic algebra (addition, multiplication, simple equations).
- No advanced mathematics or programming knowledge required.
Course Content
Day 1: Understanding and Preparing Data
- Introduction to datasets: structure, formats (CSV, JSON, Excel)
- Loading and inspecting datasets using Python (Pandas)
- Data cleaning fundamentals:
  - Handling missing values
  - Removing duplicates
  - Data type conversions
- Data transformation basics:
  - Normalization and standardization
  - Encoding categorical variables
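
The Day 1 topics can be sketched in a few lines of Pandas. This is a minimal illustration, not the course's official exercise: the sample data, the column names, the helper name `clean_and_transform`, and the choice to fill missing numbers with the column median are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample data, invented purely for illustration.
RAW_CSV = """name,age,city,income
Ana,34,Lisbon,52000
Ben,,Porto,48000
Ana,34,Lisbon,52000
Caro,29,Lisbon,61000
Dev,41,Faro,
"""


def clean_and_transform(csv_text: str) -> pd.DataFrame:
    """Load, clean, and transform a small CSV (hypothetical helper)."""
    # Load the data; a file path could be passed to read_csv instead.
    df = pd.read_csv(io.StringIO(csv_text))

    # Remove exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)

    # Handle missing values: fill numeric gaps with the column median
    # (one common choice among several; an assumption for this sketch).
    for col in ["age", "income"]:
        df[col] = df[col].fillna(df[col].median())

    # Data type conversion: age was read as float because of the gap.
    df["age"] = df["age"].astype(int)

    # Normalization (min-max scaling to the range 0..1).
    income = df["income"]
    df["income_norm"] = (income - income.min()) / (income.max() - income.min())

    # Standardization (z-score: zero mean, unit variance).
    df["age_std"] = (df["age"] - df["age"].mean()) / df["age"].std(ddof=0)

    # Encode the categorical column as one-hot indicator columns.
    return pd.get_dummies(df, columns=["city"], prefix="city")


cleaned = clean_and_transform(RAW_CSV)
print(cleaned.dtypes)
print(cleaned)
```

Calling `cleaned.info()` or `cleaned.head()` afterwards is a quick way to inspect the result, in the same way the raw dataset would be inspected after loading.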
Day 2: Visualization and Exploratory Data Analysis (EDA)
- Introduction to data visualization (Matplotlib, Seaborn)
- Creating basic charts: histograms, bar plots, scatter plots
- Identifying patterns, correlations, and outliers
- Introduction to summary statistics (mean, median, mode, variance)
- Basic EDA workflow:
  - Formulating questions
  - Visual storytelling with data
- Preparing datasets for machine learning
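
The summary statistics and outlier topics from Day 2 can be sketched as follows. This example uses only Python's standard `statistics` module so it runs without extra installs. The sample values, the helper names `summarize` and `iqr_outliers`, and the 1.5 × IQR outlier rule are assumptions chosen for illustration, not necessarily the method used in the course.

```python
import statistics

# Hypothetical measurements, invented for illustration; 95 is an
# intentionally extreme value.
values = [12, 15, 15, 18, 20, 22, 95]


def summarize(data):
    """Return the basic summary statistics listed in the outline."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "variance": statistics.variance(data),  # sample variance (n - 1)
    }


def iqr_outliers(data, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


summary = summarize(values)
outliers = iqr_outliers(values)
print(summary)
print("Outliers:", outliers)
```

Comparing the mean with the median is a simple first check during EDA: a large gap between the two, as in this sample, suggests that a few extreme values are pulling the mean away from the typical value.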
Final Hands-On Activity:
- Mini project: Clean, transform, and perform EDA on a sample real-world dataset.



