Duration: 5 days (35 hours)
Overview
This intensive, hands-on course is tailored for data professionals who want to strengthen their expertise in advanced data science and machine learning (ML) techniques. Participants will gain in-depth knowledge of feature engineering, model selection, optimization strategies, ensemble methods, and deep learning applications relevant to real-world projects. By the end of the course, learners will have built, fine-tuned, and interpreted machine learning models and prepared them for production, using industry-standard tools such as Scikit-Learn, XGBoost, and TensorFlow/Keras, as well as interpretability libraries such as SHAP and LIME.
Objectives
- Apply advanced data preprocessing and feature engineering techniques
- Build and evaluate a variety of supervised and unsupervised ML models
- Implement ensemble learning methods including bagging, boosting, and stacking
- Design and train deep learning models using TensorFlow or Keras
- Perform hyperparameter tuning using GridSearchCV, RandomizedSearchCV, and Bayesian optimization
- Leverage interpretability tools to explain model behavior and predictions
- Prepare and deploy machine learning models into production environments
Audience
This course is intended for data professionals who have:
- Proficiency in Python programming
- Basic understanding of ML concepts (e.g., regression, classification)
- Familiarity with key Python libraries: Pandas, NumPy, Matplotlib, and Scikit-Learn
- A working knowledge of statistics, linear algebra, and probability
Prerequisites
- Familiarity with AI concepts (optional but helpful)
- Basic understanding of application development principles
- Fundamental programming experience (required)
Course Content
Day 1: Advanced Feature Engineering and ML Pipelines
- Data preprocessing techniques: scaling, encoding, handling missing values
- Feature engineering and selection methods
- Dimensionality reduction: PCA, t-SNE, and UMAP
- Building reusable ML pipelines with Scikit-Learn
- Managing imbalanced data: SMOTE, class weighting
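As an illustration of how several Day 1 topics fit together, the following is a minimal sketch of a reusable Scikit-Learn pipeline that imputes missing values, scales numeric columns, one-hot encodes a categorical column, and applies class weighting for an imbalanced target. The toy data, the column names (`age`, `income`, `segment`, `churned`), and the choice of logistic regression are assumptions made for this example, not course material.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 51, 46, 38, 29, 60],
    "income": [40.0, 52.5, 61.0, np.nan, 88.0, 45.0, 39.5, 95.0],
    "segment": ["a", "b", "a", "c", "b", "a", "c", "b"],
    "churned": [0, 0, 0, 1, 0, 0, 0, 1],  # imbalanced target (2 of 8 positive)
})
X = df.drop(columns="churned")
y = df["churned"]

# Numeric columns: fill gaps with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the mode, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric, ["age", "income"]),
    ("cat", categorical, ["segment"]),
])

# class_weight="balanced" reweights classes inversely to their frequency.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

model.fit(X, y)
predictions = model.predict(X)
```

Keeping preprocessing inside the pipeline means the same fitted transformations are applied at training and prediction time, which helps avoid leakage during cross-validation.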
Day 2: Core Supervised and Unsupervised Modeling
- Evaluation metrics, including ROC AUC and F1 score
- Advanced regression and classification algorithms
- Cross-validation strategies: k-fold, stratified, time-series
- Unsupervised learning: clustering with KMeans, DBSCAN
- Hands-on Challenge: Model selection using real-world datasets
Day 3: Ensemble Learning and Model Optimization
- Bagging techniques and Random Forests
- Boosting methods: AdaBoost, XGBoost, LightGBM, CatBoost
- Stacking and blending strategies
- Hyperparameter tuning with GridSearchCV and RandomizedSearchCV
- Introduction to Bayesian Optimization with Optuna
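To give a sense of the tuning workflow named above, here is a minimal sketch of a randomized hyperparameter search over a Random Forest with RandomizedSearchCV. The synthetic dataset, the parameter ranges, and the small `n_iter` budget are assumptions chosen to keep the example fast. They are not recommended settings.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Synthetic binary classification data (illustrative only).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Distributions to sample from; the ranges are arbitrary for this sketch.
param_distributions = {
    "n_estimators": randint(20, 100),
    "max_depth": randint(2, 10),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=8,                      # number of sampled configurations
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

GridSearchCV follows the same fit-and-inspect pattern but evaluates every combination in a fixed grid, so randomized search is often the cheaper starting point when the search space is large.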
Day 4: Deep Learning with TensorFlow/Keras
- Fundamentals of neural network architecture and training
- Understanding activation functions, loss functions, and optimizers
- Overview of CNNs and RNNs: concepts and use cases
- Model training and evaluation using TensorFlow/Keras
- Hands-on: Build a deep learning model for classification or regression
Day 5: Model Interpretability and Deployment
- Model explainability using SHAP and LIME
- Addressing fairness and bias in ML models
- Introduction to deployment: Flask, FastAPI, and Streamlit
- MLOps basics: version control, monitoring, and CI/CD workflows
- Capstone Project: Build, explain, and prepare a model for deployment