# Hyper-Parameter Tuning
Hyper-parameters that span orders of magnitude (e.g. learning rate, regularization strength) are often better searched in log-space rather than linear space.
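A minimal sketch of the difference, assuming `numpy` and an illustrative learning-rate range of 1e-5 to 1e-1:

```python
# Linear vs log-space sampling of a learning rate (range chosen for illustration).
import numpy as np

rng = np.random.default_rng(0)

linear = rng.uniform(1e-5, 1e-1, size=5)   # ~90% of draws land above 1e-2
log = 10 ** rng.uniform(-5, -1, size=5)    # draws spread evenly across 1e-5..1e-1

print(np.sort(linear))
print(np.sort(log))
```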
| Type | Notes | Disadvantage |
|---|---|---|
| Manual | ??? | Time-consuming |
| Grid Search | Exhaustively evaluates every combination in a fixed grid | Computationally expensive |
| Random Search | Samples candidate settings at random from given ranges/distributions | Non-deterministic |
| Latin Hypercube Sampling | Similar to random search, but ensures the same regions don't get explored more than once | |
| Evolutionary | Randomization, natural selection, mutation | |
| Bayesian | Builds a probabilistic model of the relationship between the cost function and the hyper-parameters, using information gathered from previous trials | |
| Gradient-Based | Treats hyper-parameter tuning like parameter fitting | |
| Early-Stopping | Focuses resources on settings that look promising, e.g. Successive Halving | |
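As a concrete illustration of the grid search and random search rows, here is a minimal scikit-learn sketch; the estimator, parameter ranges, and synthetic data are assumptions for the example:

```python
from scipy.stats import loguniform, randint
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=0)
model = GradientBoostingClassifier(random_state=0)

# Grid search: exhaustive, so cost multiplies with every added parameter/value.
grid = GridSearchCV(
    model,
    param_grid={"learning_rate": [0.01, 0.1, 0.3], "max_depth": [2, 3, 5]},
    cv=3,
)
grid.fit(X, y)

# Random search: a fixed budget of candidates drawn from distributions
# (log-uniform for scale-like parameters, integers for depth).
rand = RandomizedSearchCV(
    model,
    param_distributions={"learning_rate": loguniform(1e-3, 1.0),
                         "max_depth": randint(2, 8)},
    n_iter=9,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print(grid.best_params_, grid.best_score_)
print(rand.best_params_, rand.best_score_)
```

For the early-stopping row, scikit-learn also provides the experimental `HalvingRandomSearchCV`, which implements successive halving.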
## Speed Up
- Parallelizing
- Caching
- Random sampling: caching won't help here, since randomly sampled parameter values are rarely repeated (see the sketch after this list)
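A minimal sketch of both speed-ups, assuming a scikit-learn pipeline: `n_jobs=-1` parallelizes candidate evaluation, and `Pipeline(memory=...)` caches the fitted pre-processing step (PCA here, with fixed parameters) so it isn't refitted for every candidate:

```python
import tempfile

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=30, random_state=0)

cache_dir = tempfile.mkdtemp()  # on-disk cache for fitted transformers
pipe = Pipeline(
    [("pca", PCA(n_components=10)), ("svc", SVC())],
    memory=cache_dir,
)

# Only the SVC parameters vary, so the cached PCA fit is reused across candidates.
search = GridSearchCV(
    pipe,
    param_grid={"svc__C": [0.1, 1, 10], "svc__gamma": [1e-3, 1e-2, 1e-1]},
    cv=3,
    n_jobs=-1,  # evaluate candidates in parallel
)
search.fit(X, y)
print(search.best_params_)
```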

## Clustering
### Elbow Method
Plot the cost function (e.g. within-cluster sum of squares) as a function of the number of clusters; the "elbow" where the improvement levels off suggests a good number of clusters.
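A minimal sketch of the elbow method, assuming k-means inertia (within-cluster sum of squares) as the cost function and synthetic blob data:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

ks = range(1, 11)
costs = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(ks, costs, marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("Cost (inertia)")
plt.title("Elbow method")
plt.show()
```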

## Visualization
| Visualization | More than 3 hyper-parameters | Simple |
|---|---|---|
| Contour | ❌ | ✅ |
| Parallel Coordinates | ✅ | ❌ |
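A minimal parallel-coordinates sketch using `pandas.plotting.parallel_coordinates`; the trial results below are synthetic stand-ins for real search results:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates

rng = np.random.default_rng(0)

# One row per trial, one column per hyper-parameter, plus the resulting score.
trials = pd.DataFrame({
    "learning_rate": 10 ** rng.uniform(-4, -1, 50),
    "max_depth": rng.integers(2, 10, 50),
    "subsample": rng.uniform(0.5, 1.0, 50),
    "score": rng.uniform(0.6, 0.9, 50),
})

params = ["learning_rate", "max_depth", "subsample"]
# Normalize each axis to [0, 1] so differently scaled parameters share one y-axis.
norm = (trials[params] - trials[params].min()) / (trials[params].max() - trials[params].min())
# parallel_coordinates colours lines by a class column, so bin the score into bands.
norm["score band"] = pd.qcut(trials["score"], 3, labels=["low", "mid", "high"])

parallel_coordinates(norm, "score band", colormap="viridis")
plt.title("Hyper-parameter trials")
plt.show()
```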
