Import dependencies
import tensorflow as tf  # needed below for tf.keras.datasets and tf.keras.callbacks
from tensorflow import keras
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
Download dataset
The MNIST dataset is a public dataset of handwritten digits. It contains 70,000 labeled 28*28-pixel images of the digits 0-9, of which 60,000 form the training set and 10,000 form the test set.
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Here, x_train holds the training images, y_train the training labels, x_test the test images, and y_test the test labels.
Data normalization
Scale the original gray values from the range 0-255 to the range 0-1. Small input values keep the gradients at a moderate scale, which generally makes it easier for training to converge.
x_train, x_test = x_train / 255.0, x_test / 255.0
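To see what the division does, here is a minimal NumPy sketch. The 2*2 array is a made-up stand-in for one image, not real MNIST data.

```python
import numpy as np

# Hypothetical 2x2 "image" with 8-bit gray values, standing in for one MNIST sample.
pixels = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by a float promotes the array to float64 and maps 0..255 onto 0..1.
scaled = pixels / 255.0

print(scaled.dtype)
print(scaled)
```

Note that the result is a floating-point array, so the normalized dataset takes more memory than the original uint8 data.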
Add a dimension
Add a channel dimension to the dataset so that the training set becomes 60,000 single-channel 28*28 images. This is the input shape the convolution layers expect for feature extraction.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
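The effect of the reshape can be checked on a small dummy array. The sketch below uses a zero-filled batch of 5 images (an assumption for illustration) and also shows np.expand_dims as an equivalent way to append the channel axis.

```python
import numpy as np

# Zero-filled stand-in with the same layout as a small batch of MNIST images.
batch = np.zeros((5, 28, 28))

# reshape with an explicit trailing 1, as in the code above.
reshaped = batch.reshape(batch.shape[0], 28, 28, 1)

# Equivalent alternative: append a new axis at the end.
expanded = np.expand_dims(batch, axis=-1)

print(reshaped.shape, expanded.shape)
```

Both calls only change the shape metadata; the pixel values stay the same.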
One-hot encoding
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
After one-hot encoding, each class has its own position in the label vector: 1 means the sample belongs to that class and 0 means it does not. For example, if an image's label is 6, its one-hot code is: 0 0 0 0 0 0 1 0 0 0
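For intuition, one-hot encoding can be reproduced in a few lines of NumPy. This is only a sketch of the idea, not the Keras implementation; the helper name one_hot is made up for this example.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([6, 0])
print(encoded[0])
```

Taking np.argmax of a row recovers the original label, which is also how the prediction is decoded later.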
Split validation set
Take the first 5000 samples of the training set as the validation set. The validation set does not take part in training or in the gradient updates; it is only used to monitor the model after each epoch.
x_validation = x_train[:5000]
y_validation = y_train[:5000]
x_train = x_train[5000:]
y_train = y_train[5000:]
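The slicing can be wrapped in a small helper to confirm the resulting sizes. The function name split_validation and the index arrays are assumptions for illustration; they stand in for the real images and labels.

```python
import numpy as np

def split_validation(x, y, n_val):
    """Hold out the first n_val samples for validation; the rest stay for training."""
    return (x[n_val:], y[n_val:]), (x[:n_val], y[:n_val])

# Index arrays standing in for the 60,000 MNIST training samples.
x = np.arange(60000)
y = np.arange(60000)

(x_tr, y_tr), (x_val, y_val) = split_validation(x, y, 5000)
print(len(x_tr), len(x_val))
```

Because the two slices do not overlap, no validation sample is ever used for training.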
Build the network structure
The network uses three convolution layers followed by two fully connected layers. The first convolution layer uses 32 3*3 kernels, and the second and third each use 64 3*3 kernels. The convolutions extract spatial features from the image; max pooling downsamples the feature maps, which reduces computation and can help limit over-fitting.
model = keras.models.Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
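To get a feel for how the feature maps shrink through this network, the arithmetic can be worked out by hand. The sketch below assumes the Keras defaults used above (no padding, stride 1 for convolutions, non-overlapping 2*2 pooling); the helper names are made up for this example.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

s = 28
s = conv_out(s)    # 26
s = pool_out(s)    # 13
s = conv_out(s)    # 11
s = pool_out(s)    # 5
s = conv_out(s)    # 3
flat = s * s * 64  # 576 values go into the first Dense layer

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64) + conv_params(3, 64, 64)
         + dense_params(flat, 64) + dense_params(64, 10))
print(s, flat, total)
```

Calling model.summary() on the real model is an easy way to compare against these numbers.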
Compile the model
The loss function is multi-class (categorical) cross-entropy, which matches the one-hot labels. The optimizer is rmsprop, which is a reasonable choice in most situations and is also the default optimizer in Keras.
model.compile(loss='categorical_crossentropy', optimizer='rmsprop',metrics=['accuracy'])
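As a rough illustration of what the loss measures, here is a NumPy sketch of categorical cross-entropy for a single sample. The clipping constant eps and the two example predictions are assumptions; this is not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.eye(10)[6]  # one-hot target for the digit 6

# A prediction that is confident in the right class ...
confident = np.full(10, 0.01)
confident[6] = 0.91
# ... and one that spreads probability evenly over all classes.
uniform = np.full(10, 0.1)

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident, loss_uniform)
```

Only the probability assigned to the true class contributes, so the confident prediction gets a much smaller loss than the uniform one.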
Save the model
checkpoint_save_path = "./checkpoint/mnist2.ckpt"
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
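As far as I understand it, with save_best_only=True and the default monitored metric (val_loss), the callback only overwrites the checkpoint when the validation loss improves. The plain-Python sketch below mimics that rule; the function name and the loss values are made up for illustration.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would save."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved.append(epoch)
    return saved

# Made-up validation losses, for illustration only.
print(epochs_that_save([0.060, 0.045, 0.050, 0.040, 0.042]))
```

This means the file on disk corresponds to the best epoch so far, not necessarily the last one.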
Perform training
The training data is fed to the network in batches of 32 for a total of 7 epochs, and accuracy on the validation set is measured once per epoch.
history = model.fit(x_train, y_train, batch_size=32, epochs=7, verbose=1, validation_data=(x_validation,y_validation),validation_freq=1,callbacks=[cp_callback])
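The number of weight updates implied by these settings is easy to work out. The sketch assumes the 55,000 training samples left after the validation split and that the last, smaller batch is still used.

```python
import math

train_samples = 60000 - 5000  # samples left after holding out the validation set
batch_size = 32
epochs = 7

# The final batch of each epoch is smaller than 32, hence the ceiling.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

The steps-per-epoch value is what the progress bar counts up to during each epoch.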
Evaluate the model
score = model.evaluate(x_test, y_test, verbose=0, batch_size=32)
print('test accuracy:{}, test loss value: {}'.format(score[1], score[0]))
Visualize acc and loss curves
# SimHei is only needed to render Chinese labels; this line can be removed otherwise
plt.rcParams['font.sans-serif'] = ['SimHei']

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Left: accuracy on the training and validation sets
plt.subplot(1, 2, 1)
plt.plot(acc, label='train Acc')
plt.plot(val_acc, label='val Acc')
plt.title('Acc curve')
plt.legend()

# Right: loss on the training and validation sets
plt.subplot(1, 2, 2)
plt.plot(loss, label='train Loss')
plt.plot(val_loss, label='val Loss')
plt.title('Loss curve')
plt.legend()
plt.show()
Now run the program. When training finishes, the accuracy and loss curves are displayed, and a checkpoint folder appears in the current directory.
With the convolution layers added, the results improve to some extent, and the model's test accuracy reaches 99%.
Reproduce the network structure
After the training is completed, an application should be written next to receive pictures, recognize pictures, and return the recognition results.
So I open a new .py file here:
from PIL import Image
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
First, reproduce the network structure during training
model = keras.models.Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
Load the model
model_save_path = './checkpoint/mnist2.ckpt'
model.load_weights(model_save_path)
Image recognition
I drew ten images in Photoshop for identification
imgs = ['./img/p_{}.jpg'.format(i) for i in range(10)]

for path in imgs:
    # Read the image, shrink it to 28*28, and convert it to grayscale
    img = Image.open(path)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    img = img.resize((28, 28), Image.LANCZOS)
    img_arr = np.array(img.convert('L'))

    # The training images are white digits on a black background, but the images
    # to recognize are black digits on a white background, so the colors must be inverted.
    # Mapping every pixel to one of the two extremes, 0 or 255, keeps the useful
    # information while filtering out background noise and making the image cleaner.
    img_arr = np.where(img_arr < 150, 255, 0)

    # Normalize
    img_arr = img_arr / 255.0

    # Add the batch and channel dimensions
    x_predict = img_arr.reshape(1, 28, 28, 1)

    # Recognize
    result = model.predict(x_predict)
    pred = tf.argmax(result[0])
    print('Identifying:{} ---- > {}'.format(path, pred))
operation result: