Image Classification
Image → Class
Why is it hard?
- Semantic gap between input and output
- Viewpoint variation
    - Translational
    - Rotational
- Illumination variation
- Deformation of object
- Occlusion: object partially hidden
- Background clutter
- Intraclass variation
- Textural variation
Models
| Model | Approach | Disadvantage | Robust to variance |
|---|---|---|---|
| \(k\) nearest neighbors | L1 or L2 distance between pixels | Inference time proportional to training set size | ❌ |
| Linear | | | ❌ |
| FNN | | | ❌ |
| CNNs | | | ✅ |
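
To make the \(k\) nearest neighbors row concrete, here is a minimal NumPy sketch of pixel-distance classification. The function name `knn_predict`, its parameters, and the toy 2×2 "images" are illustrative assumptions, not part of the original notes. Every prediction compares the query against all stored training images, which is why inference time grows with the training set size.

```python
import numpy as np


def knn_predict(train_x, train_y, query, k=3, metric="l2"):
    """Label one flattened image by majority vote among its k nearest training images.

    train_x: (N, D) float array of flattened training images (hypothetical layout)
    train_y: (N,) integer class labels
    query:   (D,) float array, one flattened image
    """
    # Broadcasting compares the query against every training row: O(N * D) per query.
    diff = train_x - query
    if metric == "l1":
        dists = np.abs(diff).sum(axis=1)
    elif metric == "l2":
        dists = np.sqrt((diff ** 2).sum(axis=1))
    else:
        raise ValueError("metric must be 'l1' or 'l2'")

    nearest = np.argsort(dists)[:k]  # indices of the k smallest distances
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]  # most frequent label among the neighbors


# Toy data: four flattened 2x2 images, two dark (class 0) and two bright (class 1).
# Floats are used so that subtraction cannot wrap around as it would with uint8.
train_x = np.array(
    [[0, 0, 0, 0], [1, 1, 0, 0], [9, 9, 9, 9], [8, 9, 9, 8]], dtype=np.float64
)
train_y = np.array([0, 0, 1, 1])

dark_query = np.array([1.0, 0.0, 0.0, 1.0])
bright_query = np.array([9.0, 8.0, 9.0, 9.0])
print(knn_predict(train_x, train_y, dark_query, k=3, metric="l2"))
print(knn_predict(train_x, train_y, bright_query, k=3, metric="l1"))
```

Because the distance is computed pixel by pixel, shifting or rotating an object changes many pixel values at once, which is consistent with the ❌ in the last column.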
Pre-Processing
- Resize images to the same size
- Does greyscale work better?
    - Greyscale worsens a linear classifier because it can no longer use color information; a linear classifier cannot extract textures well in any case
- Normalize
    - Subtract the mean image, or
    - Subtract the per-channel mean
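
The two normalization options can be sketched in NumPy as follows. The function names, the `(N, H, W, C)` batch layout, and the random toy batch are assumptions made for illustration. In practice the mean is typically computed on the training set and reused unchanged at test time.

```python
import numpy as np


def subtract_mean_image(images):
    """Subtract the mean image: one mean value per pixel position and channel."""
    mean_image = images.mean(axis=0)  # shape (H, W, C)
    return images - mean_image, mean_image


def subtract_channel_mean(images):
    """Subtract the per-channel mean: one scalar per color channel."""
    channel_mean = images.mean(axis=(0, 1, 2))  # shape (C,)
    return images - channel_mean, channel_mean


# Toy batch: 8 RGB images of size 4x4, already resized to a common shape.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 4, 4, 3)).astype(np.float64)

centered_by_image, mean_image = subtract_mean_image(batch)
centered_by_channel, channel_mean = subtract_channel_mean(batch)

print(mean_image.shape)    # one mean per pixel and channel
print(channel_mean.shape)  # one mean per channel
```

The mean image stores `H * W * C` values and is tied to a fixed input size, while the per-channel mean stores only `C` values and can be applied to images of any size.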