Created Monday 09 December 2013
Super-Condensed Summary
Pros:
- good at recognizing invariant characteristics
- reduce the number of free parameters significantly while using a biology-inspired structure to solve complex problems
Cons:
- may need a large number of training examples (?)
Overview
Convolutional networks or CNNs combine three ideas:
- local receptive fields (inspired by the feline visual system!)
- shared weights
- spatial or temporal subsampling
The result is shift and distortion invariance.
- Local receptive fields exploit the topology of the input to extract elementary visual features such as oriented edges, end-points, etc.
- Higher layers then combine these features.
- Subsampling makes the network more robust to slight distortions and translations.
- Weight sharing lets a feature detector learned at one location be applied across the rest of the image. It also reduces the number of free parameters, which improves the network's ability to generalize. (A minimal sketch of these three ideas follows this list.)
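
Rough NumPy sketch of the three ideas (my own toy example, not code from the paper; the 8x8 image, 3x3 kernel, and 2x2 pooling sizes are arbitrary):

    import numpy as np

    def conv2d_valid(image, kernel):
        """Slide one shared kernel over the image (local receptive fields)."""
        kh, kw = kernel.shape
        out_h = image.shape[0] - kh + 1
        out_w = image.shape[1] - kw + 1
        out = np.empty((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                # every output unit reuses the SAME kernel weights (weight sharing)
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    def subsample2x2(feature_map):
        """2x2 average pooling: small shifts/distortions change the output little."""
        h, w = feature_map.shape
        fm = feature_map[:h - h % 2, :w - w % 2]
        return fm.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    image = np.random.rand(8, 8)             # toy "pixel" input
    kernel = np.random.randn(3, 3) * 0.1     # one shared feature detector
    feature_map = np.tanh(conv2d_valid(image, kernel))   # 6x6 feature map
    pooled = subsample2x2(feature_map)                    # 3x3 after subsampling
    print(feature_map.shape, pooled.shape)
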
Reasons for Using a CNN
- invariance for translations or distortions in inputs
- a fully connected neural network requires on the order of 100,000s of weights, while a CNN needs far fewer (rough arithmetic sketched after this list)
- a fully connected neural network ignores the topology of the input, even though nearby pixels are highly correlated; a CNN forces the extraction of local features by restricting each unit to a local receptive field
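
Back-of-the-envelope parameter count (the 32x32 image, 100 hidden units, 6 feature maps, and 5x5 receptive field are my assumed sizes, not the paper's exact figures):

    # single fully connected hidden layer: one weight per pixel per unit
    img_h = img_w = 32            # assumed input image size
    hidden_units = 100            # assumed hidden layer size
    fc_weights = img_h * img_w * hidden_units          # 102,400 free parameters

    # convolutional layer: each feature map shares one small kernel + bias
    n_feature_maps = 6            # assumed number of feature maps
    kernel = 5                    # assumed 5x5 local receptive field
    cnn_weights = n_feature_maps * (kernel * kernel + 1)   # 156 free parameters

    print(fc_weights, cnn_weights)
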
Implementation
See +Implementation
Learning With Backprop
The standard algorithm must be slightly modified to account for the weight sharing. An easy way to implement it is to first compute the partial derivatives of the loss function with respect to each connection, as if the network were a conventional multi-layer network. Then the partial derivatives of all the connections that share the same parameter are summed to form the derivative with respect to that shared parameter.
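
Toy NumPy sketch of this accumulate-then-update scheme for a single shared 3x3 kernel (my own illustration, assuming a squared-error loss and a linear output):

    import numpy as np

    def conv2d_valid(image, kernel):
        kh, kw = kernel.shape
        out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    image = np.random.rand(8, 8)
    kernel = np.random.randn(3, 3) * 0.1
    target = np.zeros((6, 6))

    output = conv2d_valid(image, kernel)
    d_output = output - target      # dLoss/dOutput for a squared-error loss

    # each output position (i, j) is a separate "connection" that used the
    # same kernel; its partial derivative is d_output[i, j] * input patch,
    # and the derivatives of all connections sharing the kernel are summed
    d_kernel = np.zeros_like(kernel)
    for i in range(d_output.shape[0]):
        for j in range(d_output.shape[1]):
            d_kernel += d_output[i, j] * image[i:i+3, j:j+3]

    kernel -= 0.01 * d_kernel       # one gradient-descent step on the shared weights
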