Created Monday 14 July 2014
- `L` = the output layer.
- `(bb x, bb t)` = training example
- `N_j^((l))` = neuron `j` in layer `l`
- `t_j` = the expected output of `N_j^((L))` when the network input is `bb x`
- `o_j^((l))` = the actual output of `N_j^((l))` = impulse function value of `N_j^((l))` when the network input is `bb x`
- `v_j^((l))` = the induced local field of `N_j^((l))` when the network input is `bb x`
- `gamma_j^((l))` = the derivative of `N_j^((l))`'s activation function `phi`.
- In general, let superscript `l` denote layer `l`.
- Let weight `w_(kj)^((l+1))` connect `N_k^((l+1))` to `N_j^((l))`
```
Layer l               Layer (l+1)
neuron 1              neuron 1
...                   ...
...                   ...
neuron j ----w_kj---> neuron k
...                   ...
```
- Let `w_(k0)^((l))` be the bias of `N_k^((l))`
Note 1: Neuron `N_j^((l))` isn't necessarily fed input `bb x`. The network is fed input `bb x` and that in turn completely determines the inputs to neuron `N_j^((l))`.
Note 2: `t_j` is part of the expected network output, that is, the expected output of neuron `j` in the output layer.
Note 3: Recall that there are two outputs associated with a neuron at input `bb y` --- the induced local field `bb w * bb y` and the value of its impulse function `phi(bb w * bb y)`.
Note 4: `gamma_j` is a function; `v_j` and `o_j` are real numbers (a short sketch follows these notes).
Note 5: since `w_(k0)^((l))` is the bias, it's not really connected to a neuron in the previous layer (but see below).
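To make Notes 3 and 4 concrete, here's a minimal Python sketch of a single neuron. The logistic `phi` and the specific numbers are illustrative assumptions, not anything fixed by the notation:

```python
import numpy as np

def phi(v):
    """Logistic activation: one common choice of impulse function."""
    return 1.0 / (1.0 + np.exp(-v))

def gamma(v):
    """Derivative of the logistic phi, evaluated at v."""
    return phi(v) * (1.0 - phi(v))

# One neuron with weight vector w, fed input y (Note 3):
w = np.array([0.5, -0.3, 0.8])   # hypothetical weights; w[0] is the bias
y = np.array([1.0, 0.2, -0.4])   # y[0] = +1 pairs with the bias
v = w @ y                        # induced local field, bb w * bb y (a real number)
o = phi(v)                       # impulse function value, phi(bb w * bb y)
# gamma is a function; v and o are plain real numbers (Note 4).
```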
All Layers Have an Input Layer
Given input `bb x`, define `bb y^((l))` as the input that's fed to the next layer `(l+1)`:
- For `l = 0` (input layer), let `y_j^((0)) = x_j` (the "output" of the input layer is just the input).
- For `l > 0`, let `y_j^((l)) = o_j^((l))` (the output of neuron `N_j` in layer `l`).
- For all `l`, let `y_0^((l)) = +1` (map `y_0` to the bias).
Note: Since `w_(j0)^((l+1))` is always the bias, we can write the output of neuron `N_j^((l+1))` in terms of `bb y^((l))`:
`o_j^((l+1)) = phi(bb w * bb y^((l))) = phi(sum_k y_k^((l)) w_(jk)^((l+1)))`
where `bb w` is the weight vector of `N_j^((l+1))` and the sum `sum_k` runs over the neurons in the previous layer, including the `k = 0` bias term.
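A minimal Python sketch of this forward pass may help. The function names, the 2-3-1 network shape, the random weights, and the logistic `phi` are all illustrative assumptions:

```python
import numpy as np

def phi(v):
    """Logistic activation (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-v))

def forward(x, weights):
    """Feed input bb x through the network, layer by layer.

    weights[l][j, k] plays the role of w_(jk)^((l+1)): row j selects a
    neuron in layer l+1, column k a neuron in layer l (k = 0 is the bias).
    """
    y = np.concatenate(([1.0], x))      # y^((0)): y_0 = +1, y_j = x_j
    for W in weights:
        o = phi(W @ y)                  # o_j^((l+1)) = phi(sum_k y_k w_jk)
        y = np.concatenate(([1.0], o))  # becomes y^((l+1)) for the next layer
    return o                            # outputs of the output layer L

# Hypothetical 2-3-1 network: each W has shape (size of l+1) x (1 + size of l).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 3)), rng.normal(size=(1, 4))]
print(forward(np.array([0.5, -1.0]), weights))
```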
More Than One Training Example
Up to this point, the notation doesn't take into account multiple training examples. To associate a term with the `i`th training example `(bb x[i], bb t[i])`, use square brackets (a short sketch follows the list). For example,
- `o_j^((l))[i]`= the output (value of the impulse function) of neuron `N_j^((l))` when the network is fed input `bb x[i]`.
- `delta_j^((l))[i]` = the local gradient of neuron `N_j^((l))` when the network is fed input `bb x[i]`.
- `y_j^((l))[i]` = the input fed from layer `l` to layer `l+1` (as defined above) when the network is fed input `bb x[i]`.
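As a sketch of how the bracket index threads through code (reusing the assumed 2-3-1 shape and logistic `phi` from above; the data and names are hypothetical):

```python
import numpy as np

def phi(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward_all(x, weights):
    """Return the list [o^((1)), ..., o^((L))] for one network input bb x."""
    y = np.concatenate(([1.0], x))
    per_layer = []
    for W in weights:
        o = phi(W @ y)
        per_layer.append(o)
        y = np.concatenate(([1.0], o))
    return per_layer

# Two hypothetical training examples (bb x[i], bb t[i]):
examples = [(np.array([0.5, -1.0]), np.array([1.0])),
            (np.array([0.1,  0.3]), np.array([0.0]))]

rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 3)), rng.normal(size=(1, 4))]

# outputs[i] holds [o^((1))[i], ..., o^((L))[i]]: the bracketed example
# index simply rides along with every per-example quantity.
outputs = [forward_all(x_i, weights) for x_i, t_i in examples]
```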