Basic Concepts in ML
Statistics for ML
- Statistical Reasoning
- Curse of Dimensionality
- Frequentist vs. Bayesian Probability
- Probability Distributions
- Central Limit Theorem vs. Law of Large Numbers
- Assessing skewed data
- Categorical vs. continuous variables
- Statistical Tests
- Hypothesis Test
- p-value; significance level; statistical power; confidence interval
- t-test
- ANOVA
- Chi-square
ML Models
- Parametric/Non-parametric
- Supervised/Unsupervised
- Supervised
- Loss Function
- Regularization (what is it and why is it useful?)
- Underfitting/Overfitting
- Bias vs. Variance Tradeoff (f-hat and flexibility of model)
- Datasets – Train, test
- K-Fold Cross Validation
- Variable transformations
- Exploratory Data Analysis – checks and interpretation
- Histograms, scatterplots, correlation matrix, numerical summary
- Bagging / Boosting / Stacking
Simple / Multiple Regression
- Collinearity
- Model assumptions – ex. residuals
- Interpret Model Output / Model Performance Measures
- R^2, adjusted R^2
- p-value for each term
- Coefficients
- p-value for model
- Residuals
- Outliers
- F-statistic
- Error measures (MSE, RMSE)
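A minimal sketch of the regression performance measures listed above (MSE, RMSE, R^2, adjusted R^2), computed from observed and predicted values. The function name `regression_metrics`, its parameter `n_predictors`, and the sample values are illustrative assumptions, not part of the original outline.

```python
import math

def regression_metrics(y_true, y_pred, n_predictors=1):
    """Return MSE, RMSE, R^2 and adjusted R^2 for paired observations."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares and its mean (MSE).
    ss_res = sum(r * r for r in residuals)
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Total sum of squares around the mean of the observed values.
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)

    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes additional predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}

# Hypothetical observed vs. predicted values.
metrics = regression_metrics([3, 5, 7, 9], [2.5, 5.5, 7, 8])
print(metrics)
```

RMSE is in the same units as the response, which usually makes it easier to interpret than MSE.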
Classification
- Imbalanced classes
- Undersampling/Oversampling
- Decision trees
- Support vs. Confidence
- Model Performance Measures
- Contingency table
- Confusion Matrix – errors
- ROC Curve
- Precision vs. Recall
- TPR/FPR
- Lift
- AUC
- Accuracy = (TP+TN)/(TP+TN+FP+FN)
- Models - understand how they work and how output of each looks visually
- K-Nearest Neighbor
- Logistic Regression
- Random Forest
- Support Vector Machine/Classifier – kernel
- Naïve Bayes
- Neural Network
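A minimal sketch of the classification measures listed under Model Performance Measures: accuracy from the formula given there, plus precision, recall (TPR), and FPR, all derived from the four confusion-matrix counts. The function name `classification_metrics` and the example counts are illustrative assumptions.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common measures from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total,
        # Of the predicted positives, how many were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall, also called the true positive rate (TPR).
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # False positive rate: negatives wrongly flagged as positive.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Hypothetical counts for a binary classifier on 100 test cases.
print(classification_metrics(tp=40, tn=45, fp=5, fn=10))
```

With imbalanced classes, accuracy alone can look high while recall on the minority class is poor, which is why precision, recall, and the ROC curve (TPR against FPR) are listed alongside it.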
Model Features/Selection
- Wrapper vs. filter method
- Feature Selection
- Feature Creation
- Stepwise Regression (what is the difference between forward and backward?)
- Dimension Reduction
- PCA
Additional Topics
- One-hot encoding
- Mean squared error: MSE
- Cross-entropy function
- Loss Function
- Cost Function
- Regularization
- L1 Regularization (Lasso)
- L2 Regularization (Ridge)
- Normalization
- Scaling
- Autoencoder
- Gradient Descent
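A minimal sketch tying several of the items above together: batch gradient descent minimizing a mean-squared-error cost for a one-variable linear model, with an optional L2 (ridge) penalty on the slope. The function name `fit_line_gd`, the hyperparameters `lr`, `steps`, and `l2`, and the toy data are all illustrative assumptions.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000, l2=0.0):
    """Fit y ~ w*x + b by gradient descent on MSE + l2 * w**2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y          # prediction error for one point
            grad_w += 2 * err * x / n      # d(MSE)/dw
            grad_b += 2 * err / n          # d(MSE)/db
        grad_w += 2 * l2 * w               # gradient of the L2 penalty
        # Step against the gradient of the cost function.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (no noise).
xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]
print(fit_line_gd(xs, ys))            # unregularized fit
print(fit_line_gd(xs, ys, l2=1.0))    # ridge penalty shrinks the slope
```

The learning rate here was chosen small enough for this toy data to converge; on other data it would need tuning, and scaling the inputs first usually makes that easier.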