Training, evaluation, and inference
Use the same graph of layers to define multiple models
All models are callable, just like layers
Models with multiple inputs and outputs
Extend the API using custom layers
Official tutorial: https://tensorflow.google.cn/guide/keras/functional#training_evaluation_and_inference
Build a simple model
Setup
Import required modules:
from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
Introduction
What is the Keras functional API, and what advantages does it have over the Sequential API?
The Keras functional API is a way to create models in tf.keras that is more flexible than the Sequential API. The functional API can handle models with non-linear topology, models with shared layers, and models with multiple inputs or outputs.
The main idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers, so the functional API is a way to build graphs of layers.
For example, consider the following network: a classifier built from three fully connected (Dense) layers.
(input: 784-dimensional vectors)
              ↧
[Dense (64 units, relu activation)]
              ↧
[Dense (64 units, relu activation)]
              ↧
[Dense (10 units, softmax activation)]
              ↧
(output: logits of a probability distribution over 10 classes)
Steps to build this model with the functional API:
- First, generate an input node:
inputs = keras.Input(shape=(784,))
If the input is an image with shape (32,32,3), the input node can be generated as follows:
# Just for demonstration purposes.
img_inputs = keras.Input(shape=(32, 32, 3))
The inputs object created above carries information such as the shape and dtype of the input data:
inputs.shape
inputs.dtype
The following information will be returned:
TensorShape([None, 784])
tf.float32
- Next, create a new node in the graph of layers by calling a layer on the inputs object:
dense = layers.Dense(64, activation='relu')
x = dense(inputs)
This passes inputs into the newly created Dense layer and returns the output x. The two lines above can also be collapsed into a single line.
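For example:

x = layers.Dense(64, activation='relu')(inputs)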
- Add the second and third layer nodes in the same way:
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10)(x)
- At this point, you can create the model by specifying its inputs and outputs in the graph of layers:
model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')
The keras.Model() call ties the inputs and outputs together into the final model.
- After the model is created, you can print a summary of it, much like the model-architecture tables found in papers:
model.summary()
Model: "mnist_model" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 784)] 0 _________________________________________________________________ dense (Dense) (None, 64) 50240 _________________________________________________________________ dense_1 (Dense) (None, 64) 4160 _________________________________________________________________ dense_2 (Dense) (None, 10) 650 ================================================================= Total params: 55,050 Trainable params: 55,050 Non-trainable params: 0 _________________________________________________________________
- You can also plot the model as a graph:
- Show model structure only:
keras.utils.plot_model(model, 'my_first_model.png')
- Show the model structure together with input/output shapes by adding the parameter show_shapes=True:
keras.utils.plot_model(model, 'my_first_model_with_shape_info.png', show_shapes=True)
Training, evaluation, and inference
Training, evaluation, and inference work exactly the same way for models built with the functional API as for Sequential models.
The following loads the MNIST dataset, then trains, validates, and evaluates the model:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

model.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer=keras.optimizers.RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=5,
                    validation_split=0.2)

test_scores = model.evaluate(x_test, y_test, verbose=2)
print('Test loss:', test_scores[0])
print('Test accuracy:', test_scores[1])
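For inference, model.predict can be called on new data. A minimal sketch that predicts a few test samples (the model outputs logits, so a softmax turns them into probabilities):

# Minimal inference sketch: predict 5 test images.
logits = model.predict(x_test[:5])          # raw logits, shape (5, 10)
probs = tf.nn.softmax(logits).numpy()       # convert logits to class probabilities
print(probs.shape)                          # (5, 10)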
A detailed guide to training and evaluation: https://tensorflow.google.cn/guide/keras/train_and_evaluate
Saving and restoring the model
Saving works the same way for models built with the functional API as for Sequential models. The standard way is to save the model with model.save() and restore it with keras.models.load_model().
The saved file contains:
- The model's architecture
- The model's weight values
- The model's training configuration (what you passed to compile)
- The optimizer and its state
model.save('path_to_my_model')
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model('path_to_my_model')
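As a side note, passing a filename that ends in .h5 saves the same information into a single HDF5 file (this assumes the h5py package is installed):

# Same save/restore round trip, but to a single HDF5 file.
model.save('path_to_my_model.h5')
model = keras.models.load_model('path_to_my_model.h5')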
See the guide to saving and serializing models: https://tensorflow.google.cn/guide/keras/save_and_serialize
Use the same graph of layers to define multiple models
In the functional API, a model is created by specifying its inputs and outputs in the graph of layers [keras.Model(inputs=..., outputs=...)]. This means a single graph of layers can be used to generate multiple models, simply by choosing different inputs and outputs.
The following code has two parts: an encoder and a decoder. The encoder corresponds to the convolutional (downsampling) half of an FCN, and the decoder to the deconvolutional (upsampling) half.
That is, Conv2D and Conv2DTranspose are inverses of each other, as are MaxPooling2D and UpSampling2D: convolution vs. transposed convolution, pooling vs. unpooling.
encoder_input = keras.Input(shape=(28, 28, 1), name='img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)

autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
autoencoder.summary()
Note that the decoder above is built directly on top of the encoder's output, and the autoencoder is created by passing the encoder's input and the decoder's output to keras.Model, making it a single end-to-end model.
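Because the encoder defined above is a complete model in its own right, it can also be used on its own, for example to extract the 16-dimensional feature vector. A minimal usage sketch with dummy data:

# Run the standalone encoder and the end-to-end autoencoder on a dummy batch.
dummy_imgs = np.zeros((2, 28, 28, 1), dtype='float32')

features = encoder.predict(dummy_imgs)              # shape (2, 16): encoder output
reconstructions = autoencoder.predict(dummy_imgs)   # shape (2, 28, 28, 1): full pass
print(features.shape, reconstructions.shape)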
All models are callable, just like layers
You can treat any model as if it were a layer by calling it on an Input or on the output of another layer. When you call a model, you reuse not only its architecture but also its weights.
To see this in action, here is a different take on the autoencoder example: it creates an encoder model and a decoder model, and chains them in two calls to obtain the autoencoder model.
The previous autoencoder was built end-to-end by stacking the decoder layers directly on the encoder's output. Alternatively, the decoder model can be given its own input, and the end-to-end model is then assembled by calling the two models in sequence.
Line 12 of the following code constructs the input of the decoder model.
Lines 23-26 create a new autoencoder_input, pass it through the two models one after another (just as you would call layers), and build the autoencoder again in line 26. In other words, a model can be called like a layer to produce a new model via keras.Model(inputs, outputs).
1 encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
2 x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
3 x = layers.Conv2D(32, 3, activation='relu')(x)
4 x = layers.MaxPooling2D(3)(x)
5 x = layers.Conv2D(32, 3, activation='relu')(x)
6 x = layers.Conv2D(16, 3, activation='relu')(x)
7 encoder_output = layers.GlobalMaxPooling2D()(x)
8
9 encoder = keras.Model(encoder_input, encoder_output, name='encoder')
10 encoder.summary()
11
12 decoder_input = keras.Input(shape=(16,), name='encoded_img')
13 x = layers.Reshape((4, 4, 1))(decoder_input)
14 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
15 x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
16 x = layers.UpSampling2D(3)(x)
17 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
18 decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)
19
20 decoder = keras.Model(decoder_input, decoder_output, name='decoder')
21 decoder.summary()
22
23 autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
24 encoded_img = encoder(autoencoder_input)
25 decoded_img = decoder(encoded_img)
26 autoencoder = keras.Model(autoencoder_input, decoded_img, name='autoencoder')
27 autoencoder.summary()
Nesting models this way is common in ensemble methods, which recombine a collection of weak learners (models) into a stronger one.
The following code builds a deliberately simple ensemble of perceptrons: get_model() returns a model consisting of a 128-dimensional input and a single-unit Dense output. Lines 10-15 construct the ensemble model, which feeds the same input to each member and averages their outputs.
1 def get_model():
2     inputs = keras.Input(shape=(128,))
3     outputs = layers.Dense(1)(inputs)
4     return keras.Model(inputs, outputs)
5
6 model1 = get_model()
7 model2 = get_model()
8 model3 = get_model()
9
10 inputs = keras.Input(shape=(128,))
11 y1 = model1(inputs)
12 y2 = model2(inputs)
13 y3 = model3(inputs)
14 outputs = layers.average([y1, y2, y3])
15 ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
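Because ensemble_model is an ordinary Keras model, it can be compiled and trained end-to-end, which updates the weights of all three sub-models together. A minimal sketch with random data (the optimizer and loss here are just examples):

# Train the ensemble end-to-end on random data; gradients flow into model1/2/3.
x_dummy = np.random.random((256, 128)).astype('float32')
y_dummy = np.random.random((256, 1)).astype('float32')

ensemble_model.compile(optimizer='adam', loss='mse')
ensemble_model.fit(x_dummy, y_dummy, epochs=1, batch_size=32)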
Complex graph topologies
After the introduction so far, you might conclude that the functional API offers no real advantage over the Sequential API. That is only because the model structures above are relatively simple. Its strengths become apparent with models that have multiple inputs and outputs, and with shared layers.
Models with multiple inputs and outputs
The functional API handles multiple inputs and outputs easily, something the Sequential API cannot do.
The following code defines three inputs. The title and body inputs each pass through an Embedding layer and an LSTM; the resulting features are concatenated with the tags input; two Dense layers then produce the two outputs (priority and department); and finally the model is built from these inputs and outputs.
num_tags = 12  # Number of unique issue tags
num_words = 10000  # Size of vocabulary obtained when preprocessing text data
num_departments = 4  # Number of departments for predictions

title_input = keras.Input(shape=(None,), name='title')  # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name='body')  # Variable-length sequence of ints
tags_input = keras.Input(shape=(num_tags,), name='tags')  # Binary vectors of size `num_tags`

# Embed each word in the title into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
# Embed each word in the text into a 64-dimensional vector
body_features = layers.Embedding(num_words, 64)(body_input)

# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)

# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])

# Stick a logistic regression for priority prediction on top of the features
priority_pred = layers.Dense(1, name='priority')(x)
# Stick a department classifier on top of the features
department_pred = layers.Dense(num_departments, name='department')(x)

# Instantiate an end-to-end model predicting both priority and department
model = keras.Model(inputs=[title_input, body_input, tags_input],
                    outputs=[priority_pred, department_pred])
Display the model structure, which has three inputs and two outputs:
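For example, with keras.utils.plot_model as before (the filename here is just an example):

keras.utils.plot_model(model, 'multi_input_and_output_model.png', show_shapes=True)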
When compiling this model, since it has two outputs, you can assign a separate loss function to each output:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=[keras.losses.BinaryCrossentropy(from_logits=True),
                    keras.losses.CategoricalCrossentropy(from_logits=True)],
              loss_weights=[1., 0.2])
For readability, the losses can instead be declared as a dictionary, so that each loss is explicitly tied to an output by name:
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': keras.losses.BinaryCrossentropy(from_logits=True),
                    'department': keras.losses.CategoricalCrossentropy(from_logits=True)},
              loss_weights=[1., 0.2])
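Similarly, loss_weights can also be passed as a dictionary keyed by output name, which keeps each weight next to the loss it scales (a small variation, assuming the same output names as above):

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': keras.losses.BinaryCrossentropy(from_logits=True),
                    'department': keras.losses.CategoricalCrossentropy(from_logits=True)},
              loss_weights={'priority': 1., 'department': 0.2})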
Train the model:
# Dummy input data
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')

# Dummy target data
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))

model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_targets, 'department': dept_targets},
          epochs=2,
          batch_size=32)
A more detailed guide to training and evaluation: https://tensorflow.google.cn/guide/keras/train_and_evaluate
A toy ResNet model
The functional API also makes it easy to build models with non-linear topology, such as the residual (skip) connections used in ResNet. Here is a toy ResNet for CIFAR-10:
inputs = keras.Input(shape=(32, 32, 3), name='img')
x = layers.Conv2D(32, 3, activation='relu')(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
block_1_output = layers.MaxPooling2D(3)(x)

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_2_output = layers.add([x, block_1_output])

x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_3_output = layers.add([x, block_2_output])

x = layers.Conv2D(64, 3, activation='relu')(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)

model = keras.Model(inputs, outputs, name='toy_resnet')
model.summary()
Take a look at its structure:
keras.utils.plot_model(model, 'mini_resnet.png', show_shapes=True)
Then train the model:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['acc'])

model.fit(x_train, y_train,
          batch_size=64,
          epochs=1,
          validation_split=0.2)
Shared layers
Shared layers bring to mind the Siamese network family of models, which appears in image classification, object tracking, few-shot object detection, and more. If you know these models, shared layers in Keras are easy to understand:
Declare two inputs and feed them through the same layer instance to produce two outputs.
# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)

# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype='int32')

# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype='int32')

# Reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)
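To connect this to the Siamese idea, here is a minimal sketch that also shares an LSTM encoder between the two branches and predicts a similarity score for the two texts (the layer sizes and sigmoid head are just illustrative choices):

# Share the LSTM encoder as well, Siamese-style, and predict text similarity.
shared_lstm = layers.LSTM(32)
encoded_a = shared_lstm(encoded_input_a)
encoded_b = shared_lstm(encoded_input_b)

merged = layers.concatenate([encoded_a, encoded_b])
similarity = layers.Dense(1, activation='sigmoid')(merged)

siamese_model = keras.Model([text_input_a, text_input_b], similarity)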
Extend the API using custom layers
tf.keras contains a wide range of layers, such as:
- Convolutional layers: Conv1D, Conv2D, Conv3D, Conv2DTranspose
- Pooling layers: MaxPooling1D, MaxPooling2D, MaxPooling3D, AveragePooling1D
- RNN layers: GRU, LSTM, ConvLSTM2D
- BatchNormalization, Dropout, Embedding, etc.
If you can't find the layer you need among the built-in ones, you can define your own. A custom layer inherits from the layers.Layer class and defines the build and call methods:
- call defines the forward computation;
- build creates the layer's weights;
class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
To support serialization in a custom layer, define a get_config method that returns the constructor arguments of the layer instance:
class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):  # Return the constructor arguments of this layer instance
        return {'units': self.units}


inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs, outputs)
config = model.get_config()  # Retrieve the model config via get_config()

new_model = keras.Model.from_config(
    config, custom_objects={'CustomDense': CustomDense})