# aiacademy: Deep Learning, Hand-Crafting a Neural Network!


### NN_BP_HANDCRAFT_2

• Derivative

• Let's start with a simple function:

``````
# Define our target function
def my_function(x):
    result = 0.05*x**2 + 0.8*x
    return result
``````
• Define the numerical derivative:

``````
# Define derivative (forward difference)
# I usually set a variable called epsilon to represent a small step size;
# the printed results further below correspond to a step of about 1e-6
epsilon = 1e-6

def derivative(f, x):
    # f here stands for our function, and x is our input
    h = epsilon
    result = (f(x + h) - f(x)) / h
    return result
``````
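As a quick sanity check (my own example, not from the course): analytically d/dx(0.05x^2 + 0.8x) = 0.1x + 0.8, so at x = 2 the numerical derivative should land very close to 1.0.

```python
# Sanity-check the forward-difference derivative (hypothetical example)
epsilon = 1e-6

def my_function(x):
    return 0.05*x**2 + 0.8*x

def derivative(f, x):
    h = epsilon
    return (f(x + h) - f(x)) / h

# Analytically, d/dx (0.05x^2 + 0.8x) = 0.1x + 0.8, which equals 1.0 at x = 2
print(derivative(my_function, 2.0))  # very close to 1.0
```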
• Partial Derivative

• Next, define another function:

``````
def my_function2(X):
    # For me, I tend to use uppercase X to represent a list or a matrix
    # and lowercase x to represent a single value
    # You can change the variable names to whatever feels natural;
    # just remember that the X here represents a list of two variables x and y,
    # in which X[0] represents x and X[1] represents y
    result = 2*X[0]**2 + 3*X[0]*X[1] + 5*X[1]**2
    return result
``````
• Define the partial derivative:

``````
import numpy as np

def partial_derivative(f, X, i):
    # f is our function, and i is simply the index which we are
    # executing our partial derivative on
    H = X.copy()
    h = epsilon
    H[i] = X[i] + h
    result = (f(H) - f(X)) / h
    return result

"""
print('The partial with respect to x at (2, 3) is', partial_derivative(my_function2, np.array([2., 3.]), 0))
print('The partial with respect to y at (2, 3) is', partial_derivative(my_function2, np.array([2., 3.]), 1))
The partial with respect to x at (2, 3) is 17.000002003442205
The partial with respect to y at (2, 3) is 36.000004996594726
"""
``````

### NN_BP_HANDCRAFT_3

• Gradient: loop the partial derivative over every dimension

``````
def gradient(f, X):
    # Stack the partial derivative along each dimension of X
    grads = np.zeros_like(X)
    for i in range(len(X)):
        grads[i] = partial_derivative(f, X, i)
    return grads
``````
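A quick usage sketch (my own, reusing the helpers above in self-contained form): the gradient of my_function2 at (2, 3) should be approximately (17, 36), matching the partial derivatives computed earlier.

```python
import numpy as np

epsilon = 1e-6

def my_function2(X):
    return 2*X[0]**2 + 3*X[0]*X[1] + 5*X[1]**2

def partial_derivative(f, X, i):
    # Forward difference along dimension i
    H = X.copy()
    H[i] = X[i] + epsilon
    return (f(H) - f(X)) / epsilon

def gradient(f, X):
    grads = np.zeros_like(X)
    for i in range(len(X)):
        grads[i] = partial_derivative(f, X, i)
    return grads

g = gradient(my_function2, np.array([2., 3.]))
print(g)  # approximately [17., 36.]
```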

### NN_BP_HANDCRAFT_4

• Loss

• The MSE function:

``````
def mse(actual, pred):
    MSE = 0
    for i in range(len(actual)):
        MSE += (actual[i]-pred[i])**2
    return MSE / len(actual)
``````
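The loop above can be collapsed into a single NumPy expression; here is a sketch (my addition, not the course code) that should agree with the loop version:

```python
import numpy as np

def mse(actual, pred):
    # Loop version, as in the notes
    MSE = 0
    for i in range(len(actual)):
        MSE += (actual[i] - pred[i])**2
    return MSE / len(actual)

def mse_vectorized(actual, pred):
    # np.mean averages the squared errors in one shot
    return np.mean((np.asarray(actual) - np.asarray(pred))**2)

a = np.array([1.0, 2.0, 3.0])
p = np.array([1.5, 2.0, 2.0])
print(mse(a, p), mse_vectorized(a, p))  # both 0.41666...
```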

### NN_BP_HANDCRAFT_5

• Model Building

• First, import the good old iris dataset:

``````
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
Y = iris.target
Y = Y.reshape(len(Y), 1)
names = iris.target_names

# Train / valid / test split (per class: 40 train, 5 valid, 5 test)
X_train = np.vstack([X[0:40], X[50:90], X[100:140]])
X_valid = np.vstack([X[40:45], X[90:95], X[140:145]])
X_test = np.vstack([X[45:50], X[95:100], X[145:150]])

Y_train = np.vstack([Y[0:40], Y[50:90], Y[100:140]])
Y_valid = np.vstack([Y[40:45], Y[90:95], Y[140:145]])
Y_test = np.vstack([Y[45:50], Y[95:100], Y[145:150]])
``````
• Define the functions

• • Here you can refer to my notes: the instructor introduces the cost function J(θ0, θ1, …). Doesn't it look familiar? Haha, it's exactly the same thing! The instructor also says, "in neural networks, in the neural network literature sometimes you might hear people talk about weights of a model and weights just means exactly the same thing as parameters of a model." In other words, W == θ.
``````
def function_(W, data, target):
    z = np.matmul(data, W)
    f = mse(target, z) / 2
    return f
``````

• • I think the essence is all in this function:
``````
def gradient_descent(X_train, Y_train, X_valid, Y_valid, W, alpha, num_iters):
    m = len(Y_train)
    train_loss = np.zeros((num_iters, 1))
    valid_loss = np.zeros((num_iters, 1))

    for i in range(num_iters):
        A = np.matmul(X_train, W)
        delta = np.sum((A - Y_train) * X_train / m, axis=0)
        W -= alpha * delta.reshape(len(delta), 1)

        train_loss[i] = mse(Y_train, np.matmul(X_train, W))
        valid_loss[i] = mse(Y_valid, np.matmul(X_valid, W))

        if i % 10 == 0:
            print("The training loss of the {} epoch is {}, and the validation loss of the {} epoch is {}"
                  .format(i+1, train_loss[i].round(4), i+1, valid_loss[i].round(4)))

    return W, train_loss, valid_loss
``````
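The delta line is the gradient of the halved MSE, dJ/dW = X^T (XW - Y) / m, written with broadcasting. A quick check (my own, with random data) that the broadcast form matches the matrix form:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))   # 8 samples, 4 features
Y = rng.normal(size=(8, 1))
W = rng.normal(size=(4, 1))
m = len(Y)

A = np.matmul(X, W)
# Broadcasting form used in gradient_descent
delta_sum = np.sum((A - Y) * X / m, axis=0)
# Equivalent matrix form: X^T (XW - Y) / m
delta_mat = (np.matmul(X.T, A - Y) / m).ravel()
print(np.allclose(delta_sum, delta_mat))  # True
```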
• Results:

``````
if __name__ == "__main__":
    # Initializing
    np.random.seed(37)
    W = np.random.random((4, 1))
    W, train_loss, valid_loss = gradient_descent(X_train, Y_train, X_valid, Y_valid, W, 0.03, 100)
    predict = np.matmul(X_test, W).round()
    print('The MSE score of our prediction is ', mse(Y_test, predict))

"""
The training loss of the 1 epoch is 42.1148, and the validation loss of the 1 epoch is 39.4721
The training loss of the 11 epoch is 2.715, and the validation loss of the 11 epoch is 2.4563
The training loss of the 21 epoch is 0.2498, and the validation loss of the 21 epoch is 0.1886
The training loss of the 31 epoch is 0.0764, and the validation loss of the 31 epoch is 0.0403
The training loss of the 41 epoch is 0.0593, and the validation loss of the 41 epoch is 0.027
The training loss of the 51 epoch is 0.0565, and the validation loss of the 51 epoch is 0.0244
The training loss of the 61 epoch is 0.0557, and the validation loss of the 61 epoch is 0.0234
The training loss of the 71 epoch is 0.0554, and the validation loss of the 71 epoch is 0.0229
The training loss of the 81 epoch is 0.0553, and the validation loss of the 81 epoch is 0.0227
The training loss of the 91 epoch is 0.0551, and the validation loss of the 91 epoch is 0.0225
The MSE score of our prediction is  0.0
"""
``````
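Since this model is just linear regression, gradient descent should end up near the closed-form least-squares solution. A sketch (my addition, on random data rather than iris) comparing the two:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
true_W = np.array([[1.0], [-2.0], [0.5], [3.0]])
Y = X @ true_W + 0.01 * rng.normal(size=(100, 1))

# Plain gradient descent on MSE/2, same update rule as above
W = rng.normal(size=(4, 1))
alpha, m = 0.05, len(Y)
for _ in range(2000):
    W -= alpha * (X.T @ (X @ W - Y)) / m

# Closed-form least-squares solution for comparison
W_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(W, W_lstsq, atol=1e-3))
```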

### NN_BP_HANDCRAFT_6 (BackPropagation)

• Sigmoid

• The sigmoid function:

``````
def sigmoid(X):
    output = 1 / (1 + np.exp(-X))
    return output
``````
``````
# Examples
X = np.arange(5)
sigmoid(X)
# array([0.5       , 0.73105858, 0.88079708, 0.95257413, 0.98201379])
``````

``````
def sigmoid_gradient(X):
    output = sigmoid(X) * (1 - sigmoid(X))
    return output
``````
``````
# Examples
X = np.arange(5)
sigmoid_gradient(X)
# array([0.25      , 0.19661193, 0.10499359, 0.04517666, 0.01766271])
``````
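We can sanity-check sigmoid_gradient against a finite difference, reusing the numerical-derivative idea from earlier (my own check):

```python
import numpy as np

def sigmoid(X):
    return 1 / (1 + np.exp(-X))

def sigmoid_gradient(X):
    # Analytic derivative: s(x) * (1 - s(x))
    return sigmoid(X) * (1 - sigmoid(X))

X = np.arange(5).astype(float)
h = 1e-6
numeric = (sigmoid(X + h) - sigmoid(X)) / h
print(np.allclose(sigmoid_gradient(X), numeric, atol=1e-5))  # True
```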
• Softmax:

• Softmax & Sigmoid

• sigmoid squashes a single value at a time

• softmax squashes a whole set of numbers together

• For example:

``````
Given the sequence 1, 3, 5,
softmax returns 0.016, 0.117, 0.867
``````
• The softmax function • Can't figure out what keepdims is doing?

``````
def softmax(X):
    # Something to keep in mind is that
    # np.sum can control the axis we are summing over,
    # which is why it is superior to the regular Python sum.
    # But when the shape of a matrix is (n,) instead of (n, 1),
    # broadcasting can act in unexpected ways;
    # you can use keepdims (an argument of np.sum) to make sure the shape is right
    return np.exp(X) / np.sum(np.exp(X), axis=1, keepdims=True)
``````
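Plugging in the 1, 3, 5 example from above (note this softmax expects a 2-D array because of axis=1):

```python
import numpy as np

def softmax(X):
    # Normalize each row of exponentials so every row sums to 1
    return np.exp(X) / np.sum(np.exp(X), axis=1, keepdims=True)

out = softmax(np.array([[1., 3., 5.]]))
print(out.round(3))  # [[0.016 0.117 0.867]]
print(out.sum())     # each row sums to 1
```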
• Cross entropy (Multiclass)

• Definition: cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label, so predicting a probability of .012 when the actual label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0.

``````
def cross_entropy(p, q):
    # As you probably noticed,
    # log(0) = negative infinity;
    # to avoid this we usually add an epsilon to the calculation,
    # and sklearn likes to use 1e-15,
    # so that's what we'll be using here
    epsilon = 1e-15
    H = 0
    for i in range(len(p)):
        H += -p[i] * np.log(q[i] + epsilon)
    output = H.sum() / p.shape[0]
    return output
``````
``````
# Example
p = np.array([[1, 0, 0]])
q = np.array([[0.7, 0.2, 0.1]])
cross_entropy(p, q)
# 0.356674943938731
``````
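The same loss in one vectorized line (my sketch, not the course code); it should match the loop version on the example above:

```python
import numpy as np

def cross_entropy(p, q):
    # Loop version, as in the notes
    epsilon = 1e-15
    H = 0
    for i in range(len(p)):
        H += -p[i] * np.log(q[i] + epsilon)
    return H.sum() / p.shape[0]

def cross_entropy_vec(p, q):
    # Sum over classes, average over samples
    return -np.sum(p * np.log(q + 1e-15)) / p.shape[0]

p = np.array([[1, 0, 0]])
q = np.array([[0.7, 0.2, 0.1]])
print(cross_entropy(p, q), cross_entropy_vec(p, q))  # both ~0.3567
```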
• One Hot Encoding

• Hand-crafted one hot encoding!!!

``````
def one_hot_encoding(array):
    # Collect the unique values in sorted order
    sorted_array = np.sort(array)
    count = 1
    unique = [sorted_array[0]]
    temp = sorted_array[0]
    for i in range(len(array)):
        if sorted_array[i] != temp:
            count += 1
            temp = sorted_array[i]
            unique.append(temp)
    len_ = len(unique)
    eye = np.zeros((len_, len_))
    print('len_:', len_)
    print('unique', unique)
    for i in range(len_):
        eye[i, i] = 1
    print('eye: ', eye)
    # Replace each value with its index among the unique values
    for i in range(len(array)):
        for j in range(len(unique)):
            if array[i] == unique[j]:
                array[i] = j
        print('changing array', array)
    result = eye[array]
    return result
``````

• Example:

``````
if __name__ == '__main__':
    array = ['John', 'Tim', 'Tom', 'John', 'Marry', 'Tom']
    one_hot = one_hot_encoding(array)
    print(one_hot)

# len_: 4
# unique ['John', 'Marry', 'Tim', 'Tom']
# eye:  [[1. 0. 0. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 0. 0. 1.]]
# changing array [0, 'Tim', 'Tom', 'John', 'Marry', 'Tom']
# changing array [0, 2, 'Tom', 'John', 'Marry', 'Tom']
# changing array [0, 2, 3, 'John', 'Marry', 'Tom']
# changing array [0, 2, 3, 0, 'Marry', 'Tom']
# changing array [0, 2, 3, 0, 1, 'Tom']
# changing array [0, 2, 3, 0, 1, 3]
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 0. 0. 1.]
#  [1. 0. 0. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 1.]]
``````
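For comparison, NumPy can do the same bookkeeping in two lines with np.unique(..., return_inverse=True) (my sketch, not the course code):

```python
import numpy as np

def one_hot_np(array):
    # return_inverse gives, for each element, its index into the sorted uniques
    uniques, inverse = np.unique(array, return_inverse=True)
    return np.eye(len(uniques))[inverse]

array = ['John', 'Tim', 'Tom', 'John', 'Marry', 'Tom']
print(one_hot_np(array))  # same 6x4 matrix as the hand-crafted version
```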

### NN_BP_HANDCRAFT_7 (BackPropagation)

• Neural Network • working flow chart

``````
def two_layer_net(X, Y, W1, W2):
    # Forward
    z1 = np.matmul(X, W1)
    a1 = sigmoid(z1)
    z2 = np.matmul(a1, W2)
    out = softmax(z2)
    J = cross_entropy(Y, out)
    # Backward
    d2 = out - Y
    dW2 = np.matmul(a1.T, d2)
    d1 = np.matmul(d2, W2.T) * sigmoid_gradient(z1)
    dW1 = np.matmul(X.T, d1)

    return J, dW1, dW2
``````
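A standard way to verify backprop is a finite-difference gradient check: perturb one weight, see how the loss moves, and compare with the analytic gradient. Here is a self-contained sketch (my addition; note that unlike the block above, d2 is divided by the batch size so the gradients match the sample-averaged cross-entropy):

```python
import numpy as np

def sigmoid(X):
    return 1 / (1 + np.exp(-X))

def sigmoid_gradient(X):
    return sigmoid(X) * (1 - sigmoid(X))

def softmax(X):
    return np.exp(X) / np.sum(np.exp(X), axis=1, keepdims=True)

def cross_entropy(p, q):
    return -np.sum(p * np.log(q + 1e-15)) / p.shape[0]

def two_layer_net(X, Y, W1, W2):
    z1 = X @ W1
    a1 = sigmoid(z1)
    out = softmax(a1 @ W2)
    J = cross_entropy(Y, out)
    d2 = (out - Y) / len(X)   # divide by batch size: J averages over samples
    dW2 = a1.T @ d2
    d1 = (d2 @ W2.T) * sigmoid_gradient(z1)
    dW1 = X.T @ d1
    return J, dW1, dW2

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = np.eye(3)[rng.integers(0, 3, size=5)]   # random one-hot labels
W1 = rng.normal(size=(4, 6))
W2 = rng.normal(size=(6, 3))

_, dW1, dW2 = two_layer_net(X, Y, W1, W2)

# Central-difference estimate of dJ/dW2[0, 0] and dJ/dW1[0, 0]
h = 1e-6
W2p = W2.copy(); W2p[0, 0] += h
W2m = W2.copy(); W2m[0, 0] -= h
num_dW2 = (two_layer_net(X, Y, W1, W2p)[0] - two_layer_net(X, Y, W1, W2m)[0]) / (2*h)

W1p = W1.copy(); W1p[0, 0] += h
W1m = W1.copy(); W1m[0, 0] -= h
num_dW1 = (two_layer_net(X, Y, W1p, W2)[0] - two_layer_net(X, Y, W1m, W2)[0]) / (2*h)

print(abs(num_dW2 - dW2[0, 0]) < 1e-5, abs(num_dW1 - dW1[0, 0]) < 1e-5)  # True True
```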
• example
``````
iris = datasets.load_iris()
X = iris.data
Y = iris.target
Y = one_hot_encoding(Y)
names = iris.target_names

X_train = np.vstack([X[0:40], X[50:90], X[100:140]])
X_valid = np.vstack([X[40:45], X[90:95], X[140:145]])
X_test = np.vstack([X[45:50], X[95:100], X[145:150]])

Y_train = np.vstack([Y[0:40], Y[50:90], Y[100:140]])
Y_valid = np.vstack([Y[40:45], Y[90:95], Y[140:145]])
Y_test = np.vstack([Y[45:50], Y[95:100], Y[145:150]])

# Neural Network
iteration = 1000
alpha = 0.01
history_train = np.zeros((iteration, 1))
history_valid = np.zeros((iteration, 1))

np.random.seed(37)
W1 = np.random.randn(4, 10)
W2 = np.random.randn(10, 3)

for i in range(iteration):
    J_train, dW1, dW2 = two_layer_net(X_train, Y_train, W1, W2)
    J_valid, _, _ = two_layer_net(X_valid, Y_valid, W1, W2)
    W1 -= alpha * dW1
    W2 -= alpha * dW2

    history_train[i] = J_train
    history_valid[i] = J_valid

    if (i+1) % 50 == 0:
        print("The training loss of the", i+1, "epoch is", history_train[i].round(4), ",", end="")
        print("The validation loss of the", i+1, "epoch is", history_valid[i].round(4))

print("\nThe loss of our testing set is", two_layer_net(X_test, Y_test, W1, W2)[0])

# The training loss of the 50 epoch is 0.5166 ,The validation loss of the 50 epoch is 0.5436
# The training loss of the 100 epoch is 0.4769 ,The validation loss of the 100 epoch is 0.5154
# The training loss of the 150 epoch is 0.4492 ,The validation loss of the 150 epoch is 0.492
# The training loss of the 200 epoch is 0.3166 ,The validation loss of the 200 epoch is 0.2497
# The training loss of the 250 epoch is 0.244 ,The validation loss of the 250 epoch is 0.206
# The training loss of the 300 epoch is 0.1796 ,The validation loss of the 300 epoch is 0.1811
# The training loss of the 350 epoch is 0.122 ,The validation loss of the 350 epoch is 0.1271
# The training loss of the 400 epoch is 0.1348 ,The validation loss of the 400 epoch is 0.0477
# The training loss of the 450 epoch is 0.1349 ,The validation loss of the 450 epoch is 0.0451
# The training loss of the 500 epoch is 0.1336 ,The validation loss of the 500 epoch is 0.0412
# The training loss of the 550 epoch is 0.1308 ,The validation loss of the 550 epoch is 0.0373
# The training loss of the 600 epoch is 0.1283 ,The validation loss of the 600 epoch is 0.0343
# The training loss of the 650 epoch is 0.1264 ,The validation loss of the 650 epoch is 0.0322
# The training loss of the 700 epoch is 0.1249 ,The validation loss of the 700 epoch is 0.0306
# The training loss of the 750 epoch is 0.1238 ,The validation loss of the 750 epoch is 0.0293
# The training loss of the 800 epoch is 0.1228 ,The validation loss of the 800 epoch is 0.0282
# The training loss of the 850 epoch is 0.1221 ,The validation loss of the 850 epoch is 0.0272
# The training loss of the 900 epoch is 0.1215 ,The validation loss of the 900 epoch is 0.0261
# The training loss of the 950 epoch is 0.1212 ,The validation loss of the 950 epoch is 0.0251
# The training loss of the 1000 epoch is 0.1213 ,The validation loss of the 1000 epoch is 0.0239
#
# The loss of our testing set is 0.024856599493450916
``````
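To turn the softmax outputs into class predictions, take the argmax per row and compare against the true labels. A minimal sketch (my own tiny example, not the iris run):

```python
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],    # predicted class 0
                  [0.1, 0.3, 0.6]])   # predicted class 2
preds = probs.argmax(axis=1)
labels = np.array([0, 1])
accuracy = (preds == labels).mean()
print(preds, accuracy)  # [0 2] 0.5
```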
