# Backpropagation in Practice

## Implementation Note: Unrolling Parameters

• With neural networks, we are working with sets of matrices:
``````
Θ^(1), Θ^(2), Θ^(3), ...
D^(1), D^(2), D^(3), ...
``````

### Octave Example

• Theta1
``````
Theta1 = ones(10, 11)
Theta1 =

1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
``````
• Theta2
``````
Theta2 = 2*ones(10, 11)
Theta2 =

2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
``````
• Theta3
``````
Theta3 = 3*ones(1, 11)
Theta3 =

3   3   3   3   3   3   3   3   3   3   3
``````
• thetaVec
``````
thetaVec = [ Theta1(:); Theta2(:); Theta3(:) ]
thetaVec =
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
.
.
.
.
.
2
2
2
2
2
2
.
.
.
.
3
3
3
3

>> size(thetaVec)
ans =
231  1
``````
• reshape
``````
% Recover Theta1 from elements 1 to 110
reshape(thetaVec(1:110), 10, 11)
ans =

1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1
1   1   1   1   1   1   1   1   1   1   1

% Recover Theta2 from elements 111 to 220

reshape(thetaVec(111:220), 10, 11)
ans =

2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2
2   2   2   2   2   2   2   2   2   2   2

% Recover Theta3 from elements 221 to 231

reshape(thetaVec(221:231), 1, 11)
ans =

3   3   3   3   3   3   3   3   3   3   3

``````
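• Summary: unroll the matrices into a single vector with (:), then use reshape with the original dimensions to get the matrices back.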
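The unroll and reshape steps above can be checked as one round trip. The following is a minimal sketch that reuses the matrix shapes from this section. The helper names `n1`, `n2`, `n3`, `R1`, `R2`, `R3`, and `roundTripOk` are introduced here for illustration only and are not part of the original notes.

```octave
% Example matrices with the same shapes as in the session above.
Theta1 = ones(10, 11);
Theta2 = 2 * ones(10, 11);
Theta3 = 3 * ones(1, 11);

% Unroll: (:) stacks each matrix column by column into one long vector.
thetaVec = [Theta1(:); Theta2(:); Theta3(:)];

% The slice boundaries follow from the element counts of each matrix.
n1 = numel(Theta1);   % 110
n2 = numel(Theta2);   % 110
n3 = numel(Theta3);   % 11

% Recover each matrix by reshaping its slice to the original size.
R1 = reshape(thetaVec(1:n1), size(Theta1));
R2 = reshape(thetaVec(n1+1:n1+n2), size(Theta2));
R3 = reshape(thetaVec(n1+n2+1:n1+n2+n3), size(Theta3));

% True when all three matrices survive the round trip unchanged.
roundTripOk = isequal(R1, Theta1) && isequal(R2, Theta2) && isequal(R3, Theta3);
```

Computing the boundaries from `numel` instead of hard-coding 110 and 220 keeps the slices correct if the layer sizes change.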

### Octave Example: Gradient Checking

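``````
% Two-sided numerical estimate of the gradient of J, one component at a time.
for i = 1:n,
  thetaPlus = theta;
  thetaPlus(i) = thetaPlus(i) + EPSILON;    % nudge component i up by ϵ
  thetaMinus = theta;
  thetaMinus(i) = thetaMinus(i) - EPSILON;  % nudge component i down by ϵ
  gradApprox(i) = (J(thetaPlus) - J(thetaMinus)) / (2*EPSILON);
end;
``````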
• We previously saw how to calculate deltaVector with backpropagation. Once we have computed the gradApprox vector, we can check that gradApprox ≈ deltaVector.
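To make the comparison concrete, here is a small sketch that runs the loop on a toy cost whose gradient is known exactly. The cost `J`, the vector `theta`, the value `EPSILON = 1e-4`, and the names `analyticGrad` and `relDiff` are assumptions chosen for illustration. `analyticGrad` stands in for the deltaVector that backpropagation would produce.

```octave
% Hypothetical cost for illustration: J(theta) = sum(theta .^ 2),
% whose exact gradient is 2 * theta.
J = @(t) sum(t .^ 2);
theta = [1; -2; 0.5];
analyticGrad = 2 * theta;   % plays the role of deltaVector here

EPSILON = 1e-4;             % assumed step size
n = numel(theta);
gradApprox = zeros(n, 1);   % preallocate the result
for i = 1:n
  thetaPlus = theta;
  thetaPlus(i) = thetaPlus(i) + EPSILON;
  thetaMinus = theta;
  thetaMinus(i) = thetaMinus(i) - EPSILON;
  gradApprox(i) = (J(thetaPlus) - J(thetaMinus)) / (2 * EPSILON);
end

% Relative difference between the two gradients; a very small value
% suggests they agree.
relDiff = norm(gradApprox - analyticGrad) / norm(gradApprox + analyticGrad);
```

The loop evaluates `J` twice per parameter, so a check like this is usually run once to validate the backpropagation code and then switched off before training.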

## Random Initialization

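• Zero initialization: does not work. If every weight starts at zero, all hidden units in a layer compute the same value and receive the same update, so they never become different from each other.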

• Random initialization: breaks the symmetry between units, so it works well.
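One common way to do this is to draw every weight uniformly from a small interval around zero. The sketch below assumes the matrix shapes used earlier in these notes. The name `INIT_EPSILON` and its value 0.12 are assumptions for illustration, not values given in the notes.

```octave
% INIT_EPSILON is an assumed bound chosen for illustration.
INIT_EPSILON = 0.12;

% rand returns values in (0, 1); scaling by 2 * INIT_EPSILON and shifting
% by -INIT_EPSILON maps them into (-INIT_EPSILON, INIT_EPSILON).
Theta1 = rand(10, 11) * (2 * INIT_EPSILON) - INIT_EPSILON;
Theta2 = rand(10, 11) * (2 * INIT_EPSILON) - INIT_EPSILON;
Theta3 = rand(1, 11) * (2 * INIT_EPSILON) - INIT_EPSILON;
```

Because each weight starts at a different value, the units in a layer no longer compute identical outputs.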

