Mean has a linear relationship with the inputs; constant variance
❌
✅
Global
\(k+1\)
High
Low
Good when \(n \gg k\)
\(O(n_\text{train} p^2 + p^3)\)
\(O(n_\text{inf} p)\)
LogisticRegression
Supervised
Classification: Binary
✅
✅
Global
\(k+1\)
High
Low
Good when \(n \gg k\)
\(O(n_\text{train} p)\)
\(O(n_\text{inf} p)\)
Naive Bayes
Supervised
\(O(n_\text{train} p)\)
\(O(n_\text{inf} p)\)
OrderedRegression
Supervised
Classification: Ordinal
Piecewise Constant
Supervised
Regression / Classification
❌
❌
Local
Piecewise Polynomial
Supervised
Regression / Classification
❌
❌
Local
SVM
Supervised
Regression / Classification
Margin-Based
❌
✅
Computationally expensive
Gaussian Processes
Supervised
Regression / Classification
✅
✅
KNN (k-Nearest Neighbors)
Supervised
Regression / Classification
Good baseline model
Can use mean, median, or mode of the neighbors for regression; can use (distance-weighted) voting for classification
Underlying relationship between \(x\) and \(y\) is continuous
- Feature scaling - Dimensionality reduction
❌
❌
+ Lazy learner: no training time; new training data can be added without re-training
+ Can perform multi-class classification out-of-the-box
Lazy learner: high space complexity, high inference time
- Cannot use categorical features
- Does not work well for large \(n\) or \(p\): must compute the distance of the inference record to every training record
- Sensitive to noise
- Sensitive to missing data
- Sensitive to feature scale
- Cannot extrapolate
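A minimal sketch of the scale-sensitivity point above, assuming scikit-learn; the data and pipeline here are purely illustrative:

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# KNN is distance-based, so features on very different scales distort the
# neighborhoods; scaling before the distance computation is essential.
X = [[1.0, 100.0], [1.2, 110.0], [9.0, 105.0], [9.5, 95.0]]
y = [0, 0, 1, 1]

knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
knn.fit(X, y)
pred = knn.predict([[1.1, 108.0]])  # nearest scaled neighbors are class 0
```

Note that `fit` only stores the (scaled) training data (lazy learner); all the distance work happens at `predict` time, which is why inference time and space complexity are high.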
Decision Tree
Supervised
Regression / Classification
Automatic Piecewise Constant
Characteristics are essentially the opposite of OLS
❌
❌
Local
Low
High
- Highly interpretable
- Auto-detects non-linear relationships
- Auto-models variable interactions
- Fast inference: traversal only touches a subset of attributes
- Poor regression performance
- Unstable: the tree structure is sensitive to the training data; changing the training data changes the tree
- Requires a large number of splits even for simple relationships
- Cannot extrapolate
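The "many splits for simple relationships" and "cannot extrapolate" points can be seen in a few lines (a sketch assuming scikit-learn; the data is illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.arange(0, 10, 0.5).reshape(-1, 1)
y = 3 * X.ravel()  # a simple linear relationship

tree = DecisionTreeRegressor().fit(X, y)

# Inside the training range the tree approximates the line as a staircase of
# constant leaves; outside it, every query falls into the boundary leaf, so
# the prediction goes flat: the tree cannot extrapolate.
inside, outside = tree.predict([[9.5]]), tree.predict([[100.0]])
```

Here `outside` equals `inside`: any input beyond the training range returns the value of the last leaf, whereas a linear model would keep extrapolating the trend.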
Linear Tree
Supervised
Regression / Classification
Automatic Piecewise Polynomial
❌
❌
Local
Can extrapolate
Linear Forest
Supervised
Regression / Classification
Linear Boosting
Supervised
Regression / Classification
Random Forest
Supervised
Regression / Classification
Bagged Trees
❌
❌
Local
- Cannot extrapolate
XGBoost
Supervised
Regression / Classification
Boosted Trees
❌
❌
Local
- Cannot extrapolate
CatBoost
Supervised
Regression / Classification
Boosted Trees
❌
❌
Local
- Cannot extrapolate
LightGBM
Supervised
Regression / Classification
Boosted Trees
❌
❌
Local
- Cannot extrapolate
Random Forest Proximity
Supervised
Missing-value imputation / Anomaly detection / Clustering
K-Means / K-Medoids
Unsupervised
Clustering
Euclidean distance assumes spherical clusters
❌
❌
Gaussian Mixtures
Unsupervised
Clustering
Hierarchical Clustering
Unsupervised
Clustering
One-Many Clustering
Unsupervised
Clustering
Graph Clustering
Unsupervised
Clustering
KNN
Unsupervised
Anomaly Detection
Kernel Density Estimation
Unsupervised
Anomaly Detection
✅
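Density-based anomaly scoring with KDE can be sketched as follows (scikit-learn assumed; the bandwidth, cutoff percentile, and data are illustrative choices):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))  # inliers

kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# score_samples returns log-density; low density = anomalous.
# Flag anything below, say, the 1st percentile of the training scores.
log_dens = kde.score_samples(X)
threshold = np.percentile(log_dens, 1)
is_anomaly = kde.score_samples([[8.0, 8.0]])[0] < threshold
```

The threshold percentile plays the same role as a contamination fraction and has to be chosen by the user.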
Isolation Forest
Unsupervised
Anomaly Detection
Since the contamination fraction is usually unknown
Use the raw score without the contamination offset (normalize it for easier interpretation)
Visualization:
1. Rank by score
2. Plot a histogram of scores
3. Parallel-coordinates plot
❌
Requires a pre-specified contamination fraction
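In scikit-learn terms, the "score without contamination offset" above corresponds to `score_samples` (whereas `decision_function` subtracts the contamination-dependent `offset_`). A sketch with illustrative data:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), [[6.0, 6.0]]])  # last row: outlier

iso = IsolationForest(random_state=0).fit(X)

# score_samples: raw anomaly score, no contamination offset; lower = worse.
raw = iso.score_samples(X)

# Min-max normalize to [0, 1] for easier interpretation: 1 = most anomalous.
anomaly = (raw.max() - raw) / (raw.max() - raw.min())
```

The normalized scores can then be ranked, histogrammed, or fed into a parallel-coordinates plot without ever committing to a contamination fraction.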
One-Class SVM
Unsupervised
Anomaly Detection
Autoencoders
Unsupervised
Anomaly detection via reconstruction loss
Reconstruction loss will be high for anomalous events
Kernel PCA, UMAP
Unsupervised
Dimensionality reduction / Anomaly detection via reconstruction loss
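The reconstruction-loss idea can be sketched with plain linear PCA; the same logic carries over to Kernel PCA or an autoencoder (scikit-learn assumed, data illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(200, 3))  # near a line
x_new = np.array([[0.0, 3.0, 3.0]])  # off the line -> anomalous

pca = PCA(n_components=1).fit(X)

def recon_error(Z):
    # Distance between each record and its projection onto the learned subspace;
    # high for records that lie off the manifold the reducer captured.
    return np.linalg.norm(Z - pca.inverse_transform(pca.transform(Z)), axis=1)
```

For `x_new`, `recon_error` is far larger than for any training record, which is exactly the "reconstruction loss will be high for anomalous events" criterion.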
Cluster Analysis
Unsupervised
Anomaly Detection
IDK
Unsupervised
Ranking/Scoring
1. Robust Scaler for \(x_{ij}\)
2. Subtract the column minimum: \(x_{ij} - \min (x_{j}), \forall j\), to prevent unintended consequences in step 3
3. Convert each record into its Euclidean distance from the origin
4. Clustering
5. Score
   - Discrete score: larger distance of the cluster center from the origin
   - Continuous score: not sure yet
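The steps above can be sketched as follows (scikit-learn assumed; the cluster count and data are illustrative, and only the discrete score is implemented since the continuous variant is still open):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import RobustScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))

# 1. Robust-scale each feature (median / IQR, resistant to outliers)
Xs = RobustScaler().fit_transform(X)

# 2. Shift each column to start at 0 so every value is non-negative and the
#    distance in step 3 grows monotonically in each feature
Xs = Xs - Xs.min(axis=0)

# 3. Euclidean distance of each record from the origin
d = np.linalg.norm(Xs, axis=1)

# 4. Cluster the distances
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(d.reshape(-1, 1))

# 5. Discrete score: rank clusters by their center's distance from the origin
order = np.argsort(km.cluster_centers_.ravel())
rank_of = {c: r for r, c in enumerate(order)}
discrete_score = np.array([rank_of[c] for c in km.labels_])
```

Each record then gets an integer score (0 = closest cluster, 2 = farthest), which is the discrete ranking described in step 5.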