Created Monday 14 July 2014
For each weight `w_(kj)^((l))`, calculate the cumulative learning term:
`\ \ Delta w_(kj)^((l)) = eta sum_i Delta w_(kj)^((l))[i] = eta sum_i delta_k^((l))[i] y_j^((l-1))[i]`
Use this cumulative sum to adjust the weight for each iteration of the algorithm.
Note:
- Do NOT adjust the weight until you calculated the total sum `sum_i Delta w_(kj)^((l))[i]` from ALL the training examples.
- Its now clear why we needed to store the values of the impulse functions and the gradients---calculating these values dynamically significantly reduces the performance of the algorithm.