
MNIST

Fashion-MNIST is a dataset of Zalando’s article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
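
The 10 labels are just the integers 0–9; a quick way to make outputs readable is to map them to the standard Fashion-MNIST class names (a small sketch of my own, with the names taken from the dataset's documentation):

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']  # indexed by label 0-9
print(class_names[9])  # 'Ankle boot' -- the label of the first training image printed below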

A Computer Vision Example

import tensorflow as tf
print(tf.__version__)
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
import numpy as np
np.set_printoptions(linewidth=200)
import matplotlib.pyplot as plt
plt.imshow(training_images[0])
print(training_labels[0])
print(training_images[0])


9
[[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   1   0   0  13  73   0   0   1   4   0   0   0   0   1   1   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   3   0  36 136 127  62  54   0   0   0   1   3   4   0   0   3]
 [  0   0   0   0   0   0   0   0   0   0   0   0   6   0 102 204 176 134 144 123  23   0   0   0   0  12  10   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0 155 236 207 178 107 156 161 109  64  23  77 130  72  15]
 [  0   0   0   0   0   0   0   0   0   0   0   1   0  69 207 223 218 216 216 163 127 121 122 146 141  88 172  66]
 [  0   0   0   0   0   0   0   0   0   1   1   1   0 200 232 232 233 229 223 223 215 213 164 127 123 196 229   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0 183 225 216 223 228 235 227 224 222 224 221 223 245 173   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0 193 228 218 213 198 180 212 210 211 213 223 220 243 202   0]
 [  0   0   0   0   0   0   0   0   0   1   3   0  12 219 220 212 218 192 169 227 208 218 224 212 226 197 209  52]
 [  0   0   0   0   0   0   0   0   0   0   6   0  99 244 222 220 218 203 198 221 215 213 222 220 245 119 167  56]
 [  0   0   0   0   0   0   0   0   0   4   0   0  55 236 228 230 228 240 232 213 218 223 234 217 217 209  92   0]
 [  0   0   1   4   6   7   2   0   0   0   0   0 237 226 217 223 222 219 222 221 216 223 229 215 218 255  77   0]
 [  0   3   0   0   0   0   0   0   0  62 145 204 228 207 213 221 218 208 211 218 224 223 219 215 224 244 159   0]
 [  0   0   0   0  18  44  82 107 189 228 220 222 217 226 200 205 211 230 224 234 176 188 250 248 233 238 215   0]
 [  0  57 187 208 224 221 224 208 204 214 208 209 200 159 245 193 206 223 255 255 221 234 221 211 220 232 246   0]
 [  3 202 228 224 221 211 211 214 205 205 205 220 240  80 150 255 229 221 188 154 191 210 204 209 222 228 225   0]
 [ 98 233 198 210 222 229 229 234 249 220 194 215 217 241  65  73 106 117 168 219 221 215 217 223 223 224 229  29]
 [ 75 204 212 204 193 205 211 225 216 185 197 206 198 213 240 195 227 245 239 223 218 212 209 222 220 221 230  67]
 [ 48 203 183 194 213 197 185 190 194 192 202 214 219 221 220 236 225 216 199 206 186 181 177 172 181 205 206 115]
 [  0 122 219 193 179 171 183 196 204 210 213 207 211 210 200 196 194 191 195 191 198 192 176 156 167 177 210  92]
 [  0   0  74 189 212 191 175 172 175 181 185 188 189 188 193 198 204 209 210 210 211 188 188 194 192 216 170   0]
 [  2   0   0   0  66 200 222 237 239 242 246 243 244 221 220 193 191 179 182 182 181 176 166 168  99  58   0   0]
 [  0   0   0   0   0   0   0  40  61  44  72  41  35   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]]
  • normalizing
training_images  = training_images / 255.0
test_images = test_images / 255.0
  • model

    • Sequential: That defines a SEQUENCE of layers in the neural network

    • Flatten: Remember earlier where our images were a square, when you printed them out? Flatten just takes that square and turns it into a one-dimensional array.

    • Dense: Adds a layer of neurons

    • Each layer of neurons needs an activation function to tell it what to do. There are lots of options, but just use these for now.

      • Relu effectively means “If X>0 return X, else return 0” – so what it does is pass only values 0 or greater to the next layer in the network.

      • Softmax takes a set of values, and effectively picks the biggest one, so, for example, if the output of the last layer looks like [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05], it saves you from fishing through it looking for the biggest value, and turns it into [0,0,0,0,1,0,0,0,0] – the goal is to save a lot of coding! (There’s a small numpy sketch of both activations below.)
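
A tiny numpy sketch of those two activations (my own illustration, not part of the course code):

import numpy as np

def relu(x):
    return np.maximum(x, 0)                  # keep positives, zero out negatives

def softmax(x):
    e = np.exp(x - np.max(x))                # shift by the max for numerical stability
    return e / e.sum()                       # outputs sum to 1; the biggest input gets the biggest share

print(relu(np.array([-2.0, 0.5, 3.0])))      # [0.  0.5 3. ]
print(softmax(np.array([1.0, 2.0, 5.0])))    # ~[0.017 0.047 0.936]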

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax) ])
  • training
model.compile(optimizer = tf.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy']
              )
model.fit(training_images, training_labels, epochs=5)
  • evaluation
model.evaluate(test_images, test_labels)

Exploration Exercises

  • ex1:

    • For this first exercise run the code below: it creates a set of classifications for each of the test images, and then prints the first entry in the classifications. The output, after you run it, is a list of numbers. Why do you think this is, and what do those numbers represent?
      classifications = model.predict(test_images)
      print(classifications[0])
      # [1.15549312e-07 6.20523863e-08 1.13008866e-07 3.70938488e-11 8.58518447e-08 3.94219207e-03 6.69186761e-07 1.25284987e-02 3.47758873e-06 9.83524680e-01]
    
     print(test_labels[0])
     #9
    
      print(test_labels)
      print(classifications)
     #[9 2 1 ... 8 1 5]
     #[[1.15549312e-07 6.20523863e-08 1.13008866e-07 ... 1.25284987e-02 3.47758873e-06 9.83524680e-01]
     # [4.84347720e-06 1.41294303e-11 9.80557323e-01 ... 6.69249595e-13 9.69702718e-09 1.20127758e-12]
     # [6.25822224e-07 9.99999166e-01 1.14671002e-08 ... 2.02220033e-12 1.24577640e-10 6.51417332e-12]
     # ...
     # [6.89736858e-04 1.64704197e-08 3.33492499e-04 ... 2.52994232e-06 9.97599304e-01 2.85938668e-08]
     # [5.19804416e-07 9.99975324e-01 4.54881075e-08 ... 1.05116284e-10 2.83463155e-08 7.29924832e-09]
     # [1.08057313e-04 2.98763234e-06 4.91383660e-04 ... 1.12106279e-02 2.43145763e-03 1.52897643e-04]]
    
    • What does this list represent? It’s the probability that this item is each of the 10 classes.
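      To turn those probabilities into a single predicted class, a small follow-up sketch (my own, using the classifications and test_labels from above):
      import numpy as np
      predicted = np.argmax(classifications[0])   # index of the largest probability
      print(predicted, test_labels[0])            # 9 9 -> the first test image is classified correctly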
  • ex2:

    • Let’s now look at the layers in your model. Experiment with different values for the dense layer with 512 neurons. What different results do you get for loss, training time etc? Why do you think that’s the case?
      import tensorflow as tf
      print(tf.__version__)
    
      mnist = tf.keras.datasets.mnist
    
      (training_images, training_labels) ,  (test_images, test_labels) = mnist.load_data()
        
      training_images = training_images/255.0
      test_images = test_images/255.0
        
      model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                          tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                          tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
        
      model.compile(optimizer = 'adam',
                    loss = 'sparse_categorical_crossentropy')
        
      model.fit(training_images, training_labels, epochs=5)
        
      model.evaluate(test_images, test_labels)
        
      classifications = model.predict(test_images)
        
      print(classifications[0])
      print(test_labels[0])
    
    • Increase to 1024 Neurons – What’s the impact?
      • Training takes longer, but is more accurate: by adding more neurons we have to do more calculations, which slows down the process, but in this case they have a good impact – we do get more accurate. That doesn’t mean it’s always a case of ‘more is better’; you can hit the law of diminishing returns very quickly! (See the parameter-count sketch below.)
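      A back-of-the-envelope parameter count (my own arithmetic, assuming the 784-pixel flattened input) shows where the extra work comes from:
      # weights + biases for Flatten(784) -> Dense(n) -> Dense(10)
      params_128  = 784 * 128  + 128  + 128 * 10  + 10    # 101,770 parameters
      params_1024 = 784 * 1024 + 1024 + 1024 * 10 + 10    # 814,090 parameters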
  • ex3:

What would happen if you removed the Flatten() layer? Why do you think that’s the case?

You get an error about the shape of the data. It may seem vague right now, but it reinforces the rule of thumb that the first layer in your network should be the same shape as your data. Right now our data is 28x28 images, and 28 layers of 28 neurons would be infeasible, so it makes more sense to ‘flatten’ that 28x28 into a 784x1. Instead of writing all the code to handle that ourselves, we add the Flatten() layer at the beginning, and when the arrays are loaded into the model later, they’ll automatically be flattened for us.
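
As a rough illustration of what Flatten() does for us (a sketch of my own, not the course code), the same reshaping can be done by hand with numpy:

import numpy as np

images = np.zeros((60000, 28, 28))            # stand-in for training_images
flat = images.reshape(len(images), 28 * 28)   # each 28x28 image becomes a 784-long vector
print(flat.shape)                             # (60000, 784)

The course listing below shows the actual error you get when the Flatten() layer is commented out.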

import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.mnist

(training_images, training_labels) ,  (test_images, test_labels) = mnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([#tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

2.3.0
Epoch 1/5
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-cd0411f1d295> in <module>()
     16               loss = 'sparse_categorical_crossentropy')
     17 
---> 18 model.fit(training_images, training_labels, epochs=5)
     19 
     20 model.evaluate(test_images, test_labels)

10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    971           except Exception as e:  # pylint:disable=broad-except
    972             if hasattr(e, "ag_error_metadata"):
--> 973               raise e.ag_error_metadata.to_exception(e)
    974             else:
    975               raise

...
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:4783 sparse_categorical_crossentropy
        labels=target, logits=output)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py:4176 sparse_softmax_cross_entropy_with_logits_v2
        labels=labels, logits=logits, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py:4091 sparse_softmax_cross_entropy_with_logits
        logits.get_shape()))

 ValueError: Shape mismatch: The shape of labels (received (32, 1)) should equal the shape of logits except for the last dimension (received (32, 28, 10)).
  • ex4:

Consider the final (output) layers. Why are there 10 of them? What would happen if you had a different number than 10? For example, try training the network with 5.

You get an error as soon as it finds an unexpected value. Another rule of thumb – the number of neurons in the last layer should match the number of classes you are classifying for. In this case it’s the digits 0-9, so there are 10 of them, hence you should have 10 neurons in your final layer.
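
As a quick sanity check (my own sketch, not part of the original notebook), the output size can be derived from the labels rather than hard-coded:

import numpy as np
num_classes = len(np.unique(training_labels))   # 10 for MNIST and Fashion-MNIST
# ...then use tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax) as the final layer

The listing below shows the error you get when the final layer has only 5 neurons.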

import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.mnist

(training_images, training_labels) ,  (test_images, test_labels) = mnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(5, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

2.3.0
Epoch 1/5
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-20-04e5ea87fa1d> in <module>()
     16               loss = 'sparse_categorical_crossentropy')
     17 
---> 18 model.fit(training_images, training_labels, epochs=5)
     19 
     20 model.evaluate(test_images, test_labels)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

InvalidArgumentError:  Received a label value of 9 which is outside the valid range of [0, 5).  Label values: 0 9 4 1 7 9 0 7 8 3 3 8 1 1 1 7 1 4 6 0 5 9 0 8 5 3 1 2 3 3 9 0
	 [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-20-04e5ea87fa1d>:18) ]] [Op:__inference_train_function_43090]

Function call stack:
train_function
  • ex5:

Consider the effects of additional layers in the network. What will happen if you add another layer between the one with 512 and the final layer with 10?

Ans: There isn’t a significant impact – because this is relatively simple data. For far more complex data (including the colour images of flowers you’ll see classified in the next lesson), extra layers are often necessary.

import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.mnist

(training_images, training_labels) ,  (test_images, test_labels) = mnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

# 2.3.0
# Epoch 1/5
# 1875/1875 [==============================] - 9s 5ms/step - loss: 0.1858
# Epoch 2/5
# 1875/1875 [==============================] - 9s 5ms/step - loss: 0.0815
# Epoch 3/5
# 1875/1875 [==============================] - 9s 5ms/step - loss: 0.0549
# Epoch 4/5
# 1875/1875 [==============================] - 9s 5ms/step - loss: 0.0437
# Epoch 5/5
# 1875/1875 [==============================] - 9s 5ms/step - loss: 0.0339
# 313/313 [==============================] - 1s 2ms/step - loss: 0.0770
# [5.10323894e-10 1.01987865e-07 1.32720421e-08 2.80876975e-06 2.11587947e-09 5.66969735e-11 1.52477518e-13 9.99989986e-01 1.04827818e-08 6.97232826e-06]
# 7
  • ex6:

Consider the impact of training for more or fewer epochs. Why do you think the results turn out the way they do?

Try 15 epochs – you’ll probably get a model with a much better loss than the one trained for 5. Try 30 epochs – you might see the loss value stop decreasing, and sometimes increase. This is a side effect of something called ‘overfitting’, which you can learn about [somewhere], and it’s something you need to keep an eye out for when training neural networks. There’s no point in wasting your time training if you aren’t improving your loss, right! :)
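
One way to see overfitting directly (a sketch of my own, assuming the same model, training_images and training_labels as in the listing below) is to hold out part of the training data as a validation set and watch the validation loss alongside the training loss:

history = model.fit(training_images, training_labels,
                    epochs=30,
                    validation_split=0.1)    # hold out 10% of the training data for validation
# the training loss usually keeps dropping, while val_loss flattens out or rises once the model overfits
print(history.history['loss'][-1], history.history['val_loss'][-1])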

import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.mnist

(training_images, training_labels) ,  (test_images, test_labels) = mnist.load_data()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=30)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[34])
print(test_labels[34])


# 2.3.0
# Epoch 1/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.2581
# Epoch 2/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.1136
# Epoch 3/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0788
# Epoch 4/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0597
# Epoch 5/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0455
# Epoch 6/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0362
# Epoch 7/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0284
# Epoch 8/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0224
# Epoch 9/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0196
# Epoch 10/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0169
# Epoch 11/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0140
# Epoch 12/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0107
# Epoch 13/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0098
# Epoch 14/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0079
# Epoch 15/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0088
# Epoch 16/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0064
# Epoch 17/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0083
# Epoch 18/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0049
# Epoch 19/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0067
# Epoch 20/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0048
# Epoch 21/30
# 1875/1875 [==============================] - 4s 2ms/step - loss: 0.0054
# Epoch 22/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0042
# Epoch 23/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0037
# Epoch 24/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0056
# Epoch 25/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0055
# Epoch 26/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0045
# Epoch 27/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0037
# Epoch 28/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0048
# Epoch 29/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0034
# Epoch 30/30
# 1875/1875 [==============================] - 3s 2ms/step - loss: 0.0022
# 313/313 [==============================] - 0s 1ms/step - loss: 0.1460
# [1.6072160e-21 3.9224496e-19 1.2670014e-14 3.1439825e-12 1.1891110e-31 6.6834576e-30 6.5830020e-23 1.0000000e+00 9.1939640e-24 4.5170698e-20]
# 7
  • ex7:

Before you trained, you normalized the data, going from values that were 0-255 to values that were 0-1. What would be the impact of removing that? Here’s the complete code to give it a try. Why do you think you get different results?

The results get super bad!!!! XDD

import tensorflow as tf
print(tf.__version__)
mnist = tf.keras.datasets.mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
# training_images=training_images/255.0
# test_images=test_images/255.0
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)
classifications = model.predict(test_images)
print(classifications[0])
print(test_labels[0])

# 2.3.0
# Epoch 1/5
# 1875/1875 [==============================] - 6s 3ms/step - loss: 2.6504
# Epoch 2/5
# 1875/1875 [==============================] - 6s 3ms/step - loss: 0.3485
# Epoch 3/5
# 1875/1875 [==============================] - 6s 3ms/step - loss: 0.3055
# Epoch 4/5
# 1875/1875 [==============================] - 6s 3ms/step - loss: 0.2753
# Epoch 5/5
# 1875/1875 [==============================] - 6s 3ms/step - loss: 0.2395
# 313/313 [==============================] - 1s 2ms/step - loss: 0.2769
# [0.0000000e+00 8.6906487e-21 2.5140721e-11 1.3673088e-13 3.1782340e-24 8.6635416e-28 0.0000000e+00 1.0000000e+00 1.9407528e-20 1.9335949e-17]
# 7
  • ex8:

Callbacks, eh? I don’t think I’ve really used these much before XDDD

Earlier when you trained for extra epochs you had an issue where your loss might change. It might have taken a bit of time for you to wait for the training to do that, and you might have thought ‘wouldn’t it be nice if I could stop the training when I reach a desired value?’ – i.e. 95% accuracy might be enough for you, and if you reach that after 3 epochs, why sit around waiting for it to finish a lot more epochs? So how would you fix that? Like any other program, you have callbacks! Let’s see them in action…

import tensorflow as tf
print(tf.__version__)

class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        # stop training once the epoch's loss drops below 0.4
        if logs.get('loss') < 0.4:
            print("\nReached 60% accuracy so cancelling training!")
            self.model.stop_training = True

callbacks = myCallback()
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
training_images=training_images/255.0
test_images=test_images/255.0
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])


2.3.0
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.4732
Epoch 2/5
1874/1875 [============================>.] - ETA: 0s - loss: 0.3589
Reached 60% accuracy so cancelling training!
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3589
<tensorflow.python.keras.callbacks.History at 0x7fb151993f60>

Quiz (I scored 100 :))

  1. What’s the name of the dataset of Fashion images used in this week’s code?
    • Fashion MNIST
  2. What do the above mentioned Images look like?
    • 28x28 Greyscale
  3. How many images are in the Fashion MNIST dataset?
    • 70,000
  4. Why are there 10 output neurons?
    • There are 10 different labels
  5. What does Relu do?
    • It only returns x if x is greater than zero
  6. Why do you split data into training and test sets?
    • To test a network with previously unseen data
  7. What method gets called when an epoch finishes?
    • on_epoch_end
  8. What parameter do you set in your fit function to tell it to use callbacks?
    • callbacks=

Exercise 2

In the course you learned how to do classification using Fashion MNIST, a data set containing items of clothing. There’s another, similar dataset called MNIST which has items of handwriting – the digits 0 through 9.

Write an MNIST classifier that trains to 99% accuracy or above, and does it without a fixed number of epochs – i.e. you should stop training once you reach that level of accuracy.

Some notes:

It should succeed in less than 10 epochs, so it is okay to change epochs= to 10, but nothing larger. When it reaches 99% or greater it should print out the string “Reached 99% accuracy so cancelling training!”. If you add any additional variables, make sure you use the same names as the ones used in the class. I’ve started the code for you below – how would you finish it?

class yutingCoolCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('acc') > 0.99 :
            print("\nReached 99% accuracy so cancelling training!")
            self.model.stop_training = True

# GRADED FUNCTION: train_mnist
def train_mnist():
    # Please write your code only where you are indicated.
    # please do not remove # model fitting inline comments.

    # YOUR CODE SHOULD START HERE
#     import tensorflow as tf
    print(tf.__version__)
    # YOUR CODE SHOULD END HERE
    
    mnist = tf.keras.datasets.mnist

    (x_train, y_train),(x_test, y_test) = mnist.load_data()
    # YOUR CODE SHOULD START HERE
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    # YOUR CODE SHOULD END HERE
    model = tf.keras.models.Sequential([
        # YOUR CODE SHOULD START HERE
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
        # YOUR CODE SHOULD END HERE
    ])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    yutingCallback = yutingCoolCallback()
    # model fitting
    history = model.fit(# YOUR CODE SHOULD START HERE
               x_train, y_train, epochs=7, callbacks=[yutingCallback]
              # YOUR CODE SHOULD END HERE
    )
    # model fitting
    return history.epoch, history.history['acc'][-1]
train_mnist()

1.14.0
Epoch 1/7
60000/60000 [==============================] - 10s 158us/sample - loss: 0.2558 - acc: 0.9265
Epoch 2/7
60000/60000 [==============================] - 9s 157us/sample - loss: 0.1126 - acc: 0.9664
Epoch 3/7
60000/60000 [==============================] - 9s 155us/sample - loss: 0.0778 - acc: 0.9765
Epoch 4/7
60000/60000 [==============================] - 9s 155us/sample - loss: 0.0592 - acc: 0.9819
Epoch 5/7
60000/60000 [==============================] - 9s 152us/sample - loss: 0.0442 - acc: 0.9867
Epoch 6/7
60000/60000 [==============================] - 9s 153us/sample - loss: 0.0362 - acc: 0.9887
Epoch 7/7
59872/60000 [============================>.] - ETA: 0s - loss: 0.0285 - acc: 0.9911
Reached 99% accuracy so cancelling training!
60000/60000 [==============================] - 9s 153us/sample - loss: 0.0285 - acc: 0.9911

Reading Resources: