Duration: 3 days (21 hours)
Overview
This course provides a practical, hands-on approach to building and evaluating machine learning models using real-world datasets. Participants will dive deeper into supervised learning techniques, including Decision Trees, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Naive Bayes classifiers. The course also emphasizes model evaluation, optimization strategies, cross-validation, and techniques for handling overfitting and underfitting.
Objectives
- Implement and compare key supervised ML algorithms: Decision Trees, SVM, KNN, and Naive Bayes
- Evaluate model performance using accuracy, precision, recall, F1-score, and confusion matrices
- Apply cross-validation and tuning techniques for better generalization
- Identify and address overfitting and underfitting in machine learning models
- Use Python (scikit-learn) to build optimized ML pipelines on structured datasets
Audience
- Data analysts, junior data scientists, and AI enthusiasts with basic ML experience
- Professionals who completed an introductory ML course and want to advance their skills
- Developers and technical leads applying ML models in real-world projects
- Students and researchers seeking practical modeling and evaluation experience
Prerequisites
- Completion of an introductory machine learning course
- Proficiency in Python and foundational libraries (NumPy, Pandas, scikit-learn)
- Familiarity with basic ML concepts: regression, classification, and model training
Course Content
Day 1: Supervised ML Algorithms in Action
- Decision Trees: concept, splitting criteria, advantages & limitations
- SVM (Support Vector Machines): linear vs. nonlinear classification, kernels
- K-Nearest Neighbors (KNN): distance metrics, selecting K
- Naive Bayes: conditional probability, the independence assumption, and a text classification use case
- Hands-on: Implementing and comparing classifiers on a sample dataset
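
The Day 1 hands-on could look roughly like this: a minimal sketch that trains and compares all four classifiers, using scikit-learn's built-in Iris dataset as a stand-in for the course's sample dataset (the actual dataset, splits, and hyperparameters are assumptions for illustration).

```python
# Train the four Day 1 classifiers on the same split and compare test accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                   # fit on the training split
    results[name] = clf.score(X_test, y_test)   # accuracy on held-out data
    print(f"{name}: {results[name]:.3f}")
```

Comparing all models on an identical train/test split keeps the accuracy numbers directly comparable.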
Day 2: Model Evaluation Techniques
- Evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC
- Confusion matrix interpretation and reporting
- Cross-validation: k-fold, stratified sampling, holdout method
- Hands-on: Evaluate and visualize model performance using scikit-learn
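
A sketch of the Day 2 hands-on, assuming the breast-cancer dataset and a decision tree as the illustrative binary classifier: it computes the metrics listed above and then runs stratified 5-fold cross-validation for a more stable estimate than a single holdout split.

```python
# Compute holdout metrics, then cross-validate the same model.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
)

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)
rec = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)   # rows: true class, cols: predicted class

print(f"Accuracy={acc:.3f}  Precision={prec:.3f}  Recall={rec:.3f}  F1={f1:.3f}")
print("Confusion matrix:\n", cm)

# Stratified k-fold keeps the class balance in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")
print(f"5-fold CV F1: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Reporting the cross-validated mean and spread, rather than a single holdout number, is what "better generalization" looks like in practice.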
Day 3: Optimization, Overfitting & Practical ML Pipeline
- Underfitting vs. Overfitting: causes, detection, and solutions
- Hyperparameter tuning with GridSearchCV and RandomizedSearchCV
- Building a modular ML pipeline in scikit-learn
- Final mini project: Select an algorithm, apply cross-validation, tune parameters, and report performance on a real dataset
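
The Day 3 pipeline-plus-tuning workflow can be sketched as follows; the dataset, scaler choice, and grid values are illustrative assumptions, not prescribed by the course.

```python
# A modular Pipeline (scaling + SVM) tuned with GridSearchCV.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

pipe = Pipeline([
    ("scale", StandardScaler()),  # feature scaling matters for SVMs
    ("svm", SVC()),
])

# Grid keys use scikit-learn's "<step>__<param>" convention.
param_grid = {
    "svm__C": [0.1, 1, 10],
    "svm__kernel": ["linear", "rbf"],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)   # scaling is refit inside each CV fold

test_acc = search.best_estimator_.score(X_test, y_test)
print("Best params:", search.best_params_)
print(f"Test accuracy: {test_acc:.3f}")
```

Putting the scaler inside the pipeline ensures it is fit only on each fold's training portion, avoiding data leakage during cross-validation.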