MACHINE LEARNING WITH PYTHON
With the growth of data and computational power, machine learning has proven to be an effective tool for prediction, analysis, and segmentation in nearly every domain. From high-end physics research to supermarkets, data is available everywhere, and we should be able to make use of it. That is why this skill set is vital and comes in handy in many aspects of our lives.
The main objective is to introduce people to the world of machine learning and familiarize them with its tools and techniques. The course also aims to make students familiar with the machine learning pipeline followed in industry, preparing them to take part in end-to-end machine learning projects in the real world.
It will prepare students to tackle real-world problems by using data, and thus to profit from it.
By the end of the course, students will be able to construct a complete machine learning pipeline that follows the life cycle of a machine learning project. They will be able to build models, evaluate them using metrics appropriate to the problem, perform data analysis to gain insights into the data, and tell a story to a client.
In addition, students will be able to perform supervised and unsupervised learning, dimensionality reduction, and hypothesis testing, and to compare models' performance in real-world simulations.
Many more technical and geeky aspects of this field are also covered; they are listed in the detailed syllabus of the course.
WHO CAN JOIN MACHINE LEARNING WITH PYTHON?
If you have some basic programming knowledge and a surface-level understanding of matrices and calculus, you are good to go. That said, if you have the will to learn, it will be easy to catch up with the concepts. The course has also been designed with a gentle learning curve, so everyone interested is welcome.
MODULE 1: FUNDAMENTALS OF PYTHON
1.1 Installation of Python and environment for the dependencies
1.2 Data types (integers, floats, strings, lists, tuples, dictionaries, multidimensional lists)
1.3 Data type operations (pop, append, insert, del)
1.4 Loops (for, while)
1.5 Conditional statements (and, or, if, equals to, not equals to)
1.6 Functions and return types
1.7 Built-in functions and pandas methods useful in ML (len, columns, isnull, value_counts)
1.8 Introduction to pandas and NumPy
1.9 Reading files from CSV and databases
1.10 Data operations with pandas such as merge, sort, concat, drop, copy
1.11 Subsetting and data extraction from pandas
MODULE 2: DATA EXPLORATION
2.1 Correlations and causality
2.2 Introduction to matplotlib and seaborn (sns)
2.3 Extracting meaningful insights from data through visualization
2.4 Practice exercises in data exploration using popular datasets such as Titanic
MODULE 3: DATA PREPROCESSING
3.1 Description of Data Frames
3.2 Missing Value imputation
3.3 Converting to categorical data
3.4 Normalization and scaling of data
3.5 Processing time series data
3.6 Practice exercises
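To give a flavour of item 3.4, here is a minimal sketch of two common scaling schemes, min-max normalization and standardization, written with the Python standard library only. The function names and the `ages` column are hypothetical and chosen purely for illustration; in the course itself a library such as scikit-learn may be used instead.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


ages = [18, 22, 30, 50]          # hypothetical feature column
scaled = min_max_scale(ages)     # [0.0, 0.125, 0.375, 1.0]
z_scores = standardize(ages)     # mean 0, standard deviation 1
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, whereas standardization is usually preferred when a model assumes roughly centred inputs.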
MODULE 4: INTRODUCTION TO MACHINE LEARNING
4.1 Linear Regression from scratch using the Ordinary Least Squares method
4.2 Introduction to Gradient Descent
4.3 Linear Regression from scratch using Gradient Descent Optimization
4.4 Introduction to scikit-learn (sklearn) and using it to implement Linear Regression
4.5 Logic behind Logistic Regression, Support Vector Machines, Decision Trees
4.6 Practice exercises using well-known datasets for prediction
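As a preview of items 4.1 and 4.3, the sketch below fits a one-variable linear regression in two ways: with the closed-form ordinary least squares formulas, and with batch gradient descent on the mean squared error. The function names, the toy data, and the default `learning_rate` and `steps` values are assumptions made for this illustration, not the course's reference implementation.

```python
def fit_ols(xs, ys):
    """Closed-form least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize mean squared error with plain batch gradient descent."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2 / n) * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 (no noise).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

ols_slope, ols_intercept = fit_ols(xs, ys)             # (2.0, 1.0)
gd_slope, gd_intercept = fit_gradient_descent(xs, ys)  # converges close to (2, 1)
```

The learning rate here was picked to suit this small, unscaled dataset; with features on a larger scale, a smaller rate or prior feature scaling would likely be needed for gradient descent to converge.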
MODULE 5: BAYESIAN METHODS
5.1 Bayes Rule
5.2 Multinomial and Gaussian Naive Bayes
5.3 Implementation in spam email detection
5.4 Practice exercises
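Items 5.1 and 5.3 can be illustrated with a small worked example of Bayes' rule applied to spam detection. The probabilities below are made-up numbers for illustration only, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) for a binary hypothesis H using Bayes' rule.

    prior                 -- P(H)
    likelihood            -- P(E | H)
    likelihood_given_not  -- P(E | not H)
    """
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Assumed figures: 20% of mail is spam; the word "offer" appears in
# 60% of spam messages and in 5% of legitimate messages.
p_spam_given_offer = posterior(prior=0.2, likelihood=0.6, likelihood_given_not=0.05)
# 0.12 / (0.12 + 0.04), i.e. roughly 0.75
```

A Naive Bayes classifier extends this idea to many words at once by assuming the words are conditionally independent given the class.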
MODULE 6: EVALUATION METRICS
6.1 Accuracy for Classification
6.2 Confusion Matrix, Sensitivity/Recall, Specificity, F1 Score, ROC, AUC
6.3 Evaluation for Imbalanced Datasets
6.4 Evaluation for Regression Problems: L1 loss, L2 loss
6.5 Practice exercises
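The metrics named in items 6.1 and 6.2 all derive from the four cells of a binary confusion matrix. The following is a minimal sketch, assuming labels are encoded as 0 and 1 with 1 as the positive class; the function names and sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive common metrics from the confusion counts (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


y_true = [1, 1, 1, 0, 0, 0, 0, 0]   # hypothetical ground truth
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]   # hypothetical model output
metrics = classification_metrics(y_true, y_pred)
```

On imbalanced data (item 6.3), accuracy alone can look high even when recall on the minority class is poor, which is why the other metrics are reported alongside it.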
MODULE 7: PROBABILITY DISTRIBUTIONS AND HYPOTHESIS TESTING
7.1 Kernel Density Estimation plots
7.2 Setting up and testing hypotheses
7.3 Importance of the p-value and critical threshold
7.4 Practice exercises
MODULE 8: UNSUPERVISED LEARNING, CLUSTERING AND MODEL ENSEMBLING
8.1 k-NN algorithm and its implementation
8.2 K-means algorithm and its implementation
8.3 DBSCAN algorithm and its implementation
8.4 Bagging and Boosting (XGBoost, AdaBoost)
8.5 Practice exercises
MODULE 9: BIAS/VARIANCE, RECALL/PRECISION TRADE-OFFS AND REGULARIZATION
9.1 Overfitting and underfitting a model
9.2 Regularization Techniques to prevent overfitting
9.3 Balance and requirements for Recall and Precision
9.4 L1 and L2 regularization
9.5 Practice exercises
MODULE 10: HYPERPARAMETER TUNING
10.1 What are hyperparameters?
10.2 Grid Search vs Random Search
10.3 K-fold cross-validation
10.4 Practice exercises
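The idea behind item 10.3 can be sketched without any library: split the sample indices into k folds, and let each fold serve once as the validation set while the rest is used for training. `k_fold_indices` is a hypothetical helper written for this illustration; it uses contiguous, unshuffled folds, whereas in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    # A model would be fitted on train_idx and scored on val_idx here.
    print(len(train_idx), "train /", len(val_idx), "validation")
```

Averaging the validation score over all folds gives a more stable estimate than a single train/validation split, which is what makes it useful for comparing hyperparameter settings.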
MODULE 11: FEATURE ENGINEERING
11.1 Extracting feature importance
11.2 Feature selection using built-in Python methods
11.3 Creation of new features and dropping irrelevant features
11.4 Practice exercises