Hyper-Parameter Tuning¶
Sometimes better to do in log-space, rather than linear space
| Advantage | Disadvantage | ||
|---|---|---|---|
| Manual | Time-Consuming | ||
| Grid Search | Computationally-expensive | ||
| Random Search | Non-deterministic | ||
| Evolutionary | Randomization, Natural Selection, Mutation | ||
| Bayesian | Probabilistic model of relationship b/w cost function and hyper-parameters, using information gathered from trials | ||
| Gradient-Based | Treat hyper parameter tuning like parameter fitting | ||
| Early-Stopping | Focus resources on settings that look promising eg: Successive Halving |
Speed Up¶
- Parallelizing
- Caching
- Random sampling: Wonβt work with caching

Clustering¶
Elbow Method¶
Plot cost function as function of no of clusters

Visualization¶
| Visualization | More than 3 hyperparameters | Simple | |
|---|---|---|---|
| Contour | β | β | |
| Parallel Coordinates | ![]() | β | β |
