Image Captioning

Rather than letting the RNN sample text unconditionally, we condition the sampling on the output of the CNN.
Forward Pass
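
A minimal sketch of how such a conditioned forward pass could look, written in plain Python. Everything here is an assumption made for illustration: the parameter names (`W_ih`, `W_xh`, `W_hh`, `W_hy`, `embed`), the tiny sizes, the random weights, and the choice to inject the CNN feature vector through the initial hidden state. It also picks the most likely word at each step (greedy decoding) instead of sampling, so the result is deterministic.

```python
import math
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def caption_forward(image_feature, params, start_id, end_id, max_len=10):
    """Greedy decoding of word ids, conditioned on a CNN feature vector."""
    # The image enters the RNN through the initial hidden state (an assumption).
    h = [math.tanh(a) for a in matvec(params["W_ih"], image_feature)]
    token, caption = start_id, []
    for _ in range(max_len):
        x = params["embed"][token]  # embedding of the previous word
        pre = [a + b for a, b in zip(matvec(params["W_xh"], x),
                                     matvec(params["W_hh"], h))]
        h = [math.tanh(p) for p in pre]
        probs = softmax(matvec(params["W_hy"], h))  # distribution over the vocabulary
        token = max(range(len(probs)), key=probs.__getitem__)
        if token == end_id:
            break
        caption.append(token)
    return caption

# Hypothetical sizes; a real CNN feature vector would be far larger.
rng = random.Random(0)
feat_dim, hidden, embed_dim, vocab = 8, 6, 4, 5
params = {
    "W_ih": rand_matrix(hidden, feat_dim, rng),
    "W_xh": rand_matrix(hidden, embed_dim, rng),
    "W_hh": rand_matrix(hidden, hidden, rng),
    "W_hy": rand_matrix(vocab, hidden, rng),
    "embed": rand_matrix(vocab, embed_dim, rng),
}
image_feature = [rng.uniform(0.0, 1.0) for _ in range(feat_dim)]  # stand-in for the CNN output
caption = caption_forward(image_feature, params, start_id=0, end_id=1)
```

Replacing the `max(...)` line with a draw from `probs` would turn this into actual sampling; the conditioning on the image stays the same.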



Backward Pass
- If you start with a pre-trained CNN, backpropagate only through the RNN and leave the CNN weights fixed.
- Otherwise, backpropagate through both the RNN and the CNN.
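
The two cases can be illustrated with a toy model whose gradients are worked out by hand. This is only a sketch under stated assumptions: a single scalar weight `w_cnn` stands in for the whole CNN, a single scalar weight `w_rnn` stands in for the RNN, and the loss is a squared error. The function `sgd_step` and the flag `freeze_cnn` are hypothetical names.

```python
import math

def sgd_step(params, x, y, freeze_cnn, lr=0.1):
    """One SGD step on a toy model: feature = tanh(w_cnn * x), pred = w_rnn * feature."""
    w_cnn, w_rnn = params["w_cnn"], params["w_rnn"]

    # Forward pass through both stages.
    feature = math.tanh(w_cnn * x)  # stand-in for the CNN
    pred = w_rnn * feature          # stand-in for the RNN
    loss = 0.5 * (pred - y) ** 2

    # Backward pass: the gradient always reaches the RNN weight.
    d_pred = pred - y
    grad_rnn = d_pred * feature
    new_params = {"w_cnn": w_cnn, "w_rnn": w_rnn - lr * grad_rnn}

    # Continue into the CNN only when it is being trained as well.
    if not freeze_cnn:
        d_feature = d_pred * w_rnn
        grad_cnn = d_feature * (1.0 - feature ** 2) * x  # derivative of tanh
        new_params["w_cnn"] = w_cnn - lr * grad_cnn

    return new_params, loss

params = {"w_cnn": 0.5, "w_rnn": 1.0}
frozen, _ = sgd_step(params, x=1.0, y=2.0, freeze_cnn=True)   # pre-trained CNN case
joint, _ = sgd_step(params, x=1.0, y=2.0, freeze_cnn=False)   # end-to-end case
```

With `freeze_cnn=True` only `w_rnn` changes; with `freeze_cnn=False` both weights move, since the gradient is propagated one stage further.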