[tf.keras] Official tutorial (2): the functional API

Build a simple model

Setup

Introduction

Training, evaluation and inference

Saving and restoring models

Use the same graph of layers to define multiple models

All models are callable, just like layers

Complex graph topology

Multiple-input, multiple-output models

ResNet model (toy version)

Shared layers

Extending the API: using custom layers

Official tutorial: https://tensorflow.google.cn/guide/keras/functional#training_evaluation_and_inference  

Build a simple model

Setup

Import required modules:

1 from __future__ import absolute_import, division, print_function, unicode_literals
2 
3 import numpy as np
4 
5 import tensorflow as tf
6 
7 from tensorflow import keras
8 from tensorflow.keras import layers

Introduction

What is the Keras functional API, and what advantages does it have over the Sequential API?

The Keras functional API is a way to create models, and it is more flexible than tf.keras.Sequential: it can handle models with non-linear topology, models with shared layers, and models with multiple inputs or outputs.

The key idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers, so the functional API is a way to build graphs of layers.

For example, consider the following network: a classifier built from three fully connected (Dense) layers.

(input: 784-dimensional vectors)
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (64 units, relu activation)]
       ↧
[Dense (10 units, softmax activation)]
       ↧
(output: logits of a probability distribution over 10 classes)

Steps to use the functional API:

  • First, generate an input node:
inputs = keras.Input(shape=(784,))

If the input is an image with shape (32,32,3), the input node can be generated as follows:

1 # Just for demonstration purposes. 
2 img_inputs = keras.Input(shape=(32, 32, 3))

The inputs object created above carries information such as its shape and dtype:

1 inputs.shape
2 inputs.dtype

The following information will be returned:

TensorShape([None, 784])
tf.float32
  • Create a new node in the graph of layers by calling a layer on inputs:
1 dense = layers.Dense(64, activation='relu')
2 x = dense(inputs)

This passes inputs into the newly created dense layer and returns the output x; the two lines above can actually be collapsed into one, as shown below.
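
Collapsed into a single line, creating the layer and calling it on inputs looks like this:

x = layers.Dense(64, activation='relu')(inputs)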

  • Add the second and third layer nodes in the same way:
1 x = layers.Dense(64, activation='relu')(x) 
2 outputs = layers.Dense(10)(x)
  • At this point, you can create the final model by specifying its inputs and outputs in the graph of layers:
model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')

The keras.Model() constructor combines the inputs and outputs into the final model.

  • Once the model is built, you can print its summary, much like the model-structure tables found in papers:
model.summary()
Model: "mnist_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 784)]             0         
_________________________________________________________________
dense (Dense)                (None, 64)                50240     
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
  • You can also plot the model structure as an image:
    • Show model structure only:
keras.utils.plot_model(model, 'my_first_model.png')

  • Show the model structure together with the input/output shapes (add the parameter show_shapes=True):
1 keras.utils.plot_model(model, 'my_first_model_with_shape_info.png', show_shapes=True)

Training, evaluation and inference

This part is handled in the same way as Sequential models.

The following uses the MNIST dataset for training, validation, and testing:

 1 (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
 2 
 3 x_train = x_train.reshape(60000, 784).astype('float32') / 255
 4 x_test = x_test.reshape(10000, 784).astype('float32') / 255
 5 
 6 model.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
 7               optimizer=keras.optimizers.RMSprop(),
 8               metrics=['accuracy'])
 9 
10 history = model.fit(x_train, y_train,
11                     batch_size=64,
12                     epochs=5,
13                     validation_split=0.2)
14 
15 test_scores = model.evaluate(x_test, y_test, verbose=2)
16 print('Test loss:', test_scores[0])
17 print('Test accuracy:', test_scores[1])
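
The section title also covers inference; a minimal sketch of prediction on a few test samples (the slice of five samples is just for illustration):

predictions = model.predict(x_test[:5])
print(predictions.shape)  # (5, 10): one vector of 10 logits per sample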

A detailed guide to training and evaluation is available at: https://tensorflow.google.cn/guide/keras/train_and_evaluate

Saving and restoring models

For functional models, saving works the same way as for Sequential models. The standard way is to save the model with model.save() and restore it with keras.models.load_model().

The saved file contains:

  • The model's architecture
  • The model's weights
  • The model's training configuration (the arguments passed to compile)
  • The optimizer and its state
1 model.save('path_to_my_model')
2 del model
3 # Recreate the exact same model purely from the file:
4 model = keras.models.load_model('path_to_my_model')
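
As a side note, if you prefer the single-file HDF5 format, passing a filename ending in .h5 should also work (this assumes the h5py package is installed; it is not part of the tutorial itself):

model.save('path_to_my_model.h5')  # HDF5 file instead of the default SavedModel directory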

See the following guide for saving and serializing models: https://tensorflow.google.cn/guide/keras/save_and_serialize

Use the same graph of layers to define multiple models

In the functional API, models are created by specifying their inputs and outputs [keras.Model(inputs=..., outputs=...)], which means that a single graph of layers can be used to generate multiple models (by choosing different inputs and outputs).

The following code has two parts, an encoder and a decoder. The encoder is analogous to the convolutional (downsampling) half of an FCN, and the decoder to its deconvolutional (upsampling) half.

In other words, Conv2D and Conv2DTranspose are inverse operations of each other, as are MaxPooling2D and UpSampling2D: convolution and deconvolution, pooling and unpooling.

 1 encoder_input = keras.Input(shape=(28, 28, 1), name='img')
 2 x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
 3 x = layers.Conv2D(32, 3, activation='relu')(x)
 4 x = layers.MaxPooling2D(3)(x)
 5 x = layers.Conv2D(32, 3, activation='relu')(x)
 6 x = layers.Conv2D(16, 3, activation='relu')(x)
 7 encoder_output = layers.GlobalMaxPooling2D()(x)
 8 
 9 encoder = keras.Model(encoder_input, encoder_output, name='encoder')
10 encoder.summary()
11 
12 x = layers.Reshape((4, 4, 1))(encoder_output)
13 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
14 x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
15 x = layers.UpSampling2D(3)(x)
16 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
17 decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)
18 
19 autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
20 autoencoder.summary()

Note that the code above builds the decoder directly on top of the encoder's output: the encoder's input and the decoder's output are then passed as the inputs and outputs arguments of keras.Model, giving a so-called end-to-end (end2end) model.

All models are callable, just like layers

You can treat any model as if it were a layer by calling it on an Input or on the output of another layer. By calling a model you reuse not only its architecture but also its weights.

To see this in action, here is a different take on the autoencoder example: it creates an encoder model and a decoder model, and chains them in two calls to obtain the autoencoder model.

The previous model was built end-to-end by feeding the encoder's output straight into the decoder layers. Alternatively, the decoder can be given its own input, and the end-to-end model can then be assembled by calling the two models in sequence.

Line 12 of the following code constructs the decoder model's own input.

Lines 23-26 create a new autoencoder_input, call the two models one after the other (exactly like calling layers), and rebuild the end-to-end model on line 26. A model can be called like a layer, and the result used to build a new model (keras.Model(inputs, outputs)).

 1 encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
 2 x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
 3 x = layers.Conv2D(32, 3, activation='relu')(x)
 4 x = layers.MaxPooling2D(3)(x)
 5 x = layers.Conv2D(32, 3, activation='relu')(x)
 6 x = layers.Conv2D(16, 3, activation='relu')(x)
 7 encoder_output = layers.GlobalMaxPooling2D()(x)
 8 
 9 encoder = keras.Model(encoder_input, encoder_output, name='encoder')
10 encoder.summary()
11 
12 decoder_input = keras.Input(shape=(16,), name='encoded_img')
13 x = layers.Reshape((4, 4, 1))(decoder_input)
14 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
15 x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
16 x = layers.UpSampling2D(3)(x)
17 x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
18 decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)
19 
20 decoder = keras.Model(decoder_input, decoder_output, name='decoder')
21 decoder.summary()
22 
23 autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
24 encoded_img = encoder(autoencoder_input)
25 decoded_img = decoder(encoded_img)
26 autoencoder = keras.Model(autoencoder_input, decoded_img, name='autoencoder')
27 autoencoder.summary()
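
This autoencoder can be trained to reconstruct its input. The following is only a sketch, and its choices (MNIST images scaled to [0, 1], a mean-squared-error reconstruction loss, a single epoch) are assumptions, not part of the tutorial:

# Illustrative training setup: reconstruct MNIST digits with an MSE loss
(x_train, _), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(x_train, x_train, batch_size=64, epochs=1)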

Nesting models like this is common in ensemble methods, where several weak learners (models) are combined into one.

The following code is a very simple perceptron ensemble: get_model() returns a perceptron model with a 128-unit input layer and a 1-unit output layer. Lines 10-15 build the ensemble model: the three models share the same input, and their outputs are averaged.

 1 def get_model():
 2   inputs = keras.Input(shape=(128,))
 3   outputs = layers.Dense(1)(inputs)
 4   return keras.Model(inputs, outputs)
 5 
 6 model1 = get_model()
 7 model2 = get_model()
 8 model3 = get_model()
 9 
10 inputs = keras.Input(shape=(128,))
11 y1 = model1(inputs)
12 y2 = model2(inputs)
13 y3 = model3(inputs)
14 outputs = layers.average([y1, y2, y3])
15 ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
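
A quick sanity check of the ensemble's output shape (the dummy batch below is purely illustrative):

dummy_batch = np.random.random((2, 128)).astype('float32')
print(ensemble_model(dummy_batch).shape)  # (2, 1): the averaged prediction for each sample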

Complex graph topology

From the introduction so far, you might think the Keras functional API offers nothing special over the Sequential API. That is because the model structures above are relatively simple. Next, we explore its strengths with multiple-input, multiple-output models and shared layers.

Multiple-input, multiple-output models

The functional API handles multiple inputs and multiple outputs easily, which is difficult with the Sequential API.

The following code has three inputs: the title and body inputs each pass through an Embedding layer and an LSTM, and the resulting features are concatenated with the tags input into a single feature vector. Two Dense layers are then applied to this vector to produce the two outputs, and the final model is built from the three inputs and the two outputs.

 1 num_tags = 12  # Number of unique issue tags
 2 num_words = 10000  # Size of vocabulary obtained when preprocessing text data
 3 num_departments = 4  # Number of departments for predictions
 4 
 5 title_input = keras.Input(shape=(None,), name='title')  # Variable-length sequence of ints
 6 body_input = keras.Input(shape=(None,), name='body')  # Variable-length sequence of ints
 7 tags_input = keras.Input(shape=(num_tags,), name='tags')  # Binary vectors of size `num_tags`
 8 
 9 # Embed each word in the title into a 64-dimensional vector
10 title_features = layers.Embedding(num_words, 64)(title_input)
11 # Embed each word in the text into a 64-dimensional vector
12 body_features = layers.Embedding(num_words, 64)(body_input)
13 
14 # Reduce sequence of embedded words in the title into a single 128-dimensional vector
15 title_features = layers.LSTM(128)(title_features)
16 # Reduce sequence of embedded words in the body into a single 32-dimensional vector
17 body_features = layers.LSTM(32)(body_features)
18 
19 # Merge all available features into a single large vector via concatenation
20 x = layers.concatenate([title_features, body_features, tags_input])
21 
22 # Stick a logistic regression for priority prediction on top of the features
23 priority_pred = layers.Dense(1, name='priority')(x)
24 # Stick a department classifier on top of the features
25 department_pred = layers.Dense(num_departments, name='department')(x)
26 
27 # Instantiate an end-to-end model predicting both priority and department
28 model = keras.Model(inputs=[title_input, body_input, tags_input],
29                     outputs=[priority_pred, department_pred])

Display its model structure, including three inputs and two outputs:
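
A plot_model call like the earlier ones renders this structure (the file name here is just an example):

keras.utils.plot_model(model, 'multi_input_and_output_model.png', show_shapes=True)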

When compiling the model, since it has two outputs, you can assign a loss function to each output.

1 model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
2               loss=[keras.losses.BinaryCrossentropy(from_logits=True),
3                     keras.losses.CategoricalCrossentropy(from_logits=True)],
4               loss_weights=[1., 0.2])

For better readability, you can instead use a dictionary to map each loss function to its output by name.

1 model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
2               loss={'priority':keras.losses.BinaryCrossentropy(from_logits=True),
3                     'department': keras.losses.CategoricalCrossentropy(from_logits=True)},
4               loss_weights=[1., 0.2])

Train the model:

 1 # Dummy input data
 2 title_data = np.random.randint(num_words, size=(1280, 10))
 3 body_data = np.random.randint(num_words, size=(1280, 100))
 4 tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')
 5 
 6 # Dummy target data
 7 priority_targets = np.random.random(size=(1280, 1))
 8 dept_targets = np.random.randint(2, size=(1280, num_departments))
 9 
10 model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
11           {'priority': priority_targets, 'department': dept_targets},
12           epochs=2,
13           batch_size=32)
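
The same dictionary style works for evaluation; a short sketch reusing the dummy arrays above:

model.evaluate({'title': title_data, 'body': body_data, 'tags': tags_data},
               {'priority': priority_targets, 'department': dept_targets},
               batch_size=32)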

More detailed training and evaluation guidance: https://tensorflow.google.cn/guide/keras/train_and_evaluate

ResNet model (toy version)

 1 inputs = keras.Input(shape=(32, 32, 3), name='img')
 2 x = layers.Conv2D(32, 3, activation='relu')(inputs)
 3 x = layers.Conv2D(64, 3, activation='relu')(x)
 4 block_1_output = layers.MaxPooling2D(3)(x)
 5 
 6 x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
 7 x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
 8 block_2_output = layers.add([x, block_1_output])
 9 
10 x = layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
11 x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
12 block_3_output = layers.add([x, block_2_output])
13 
14 x = layers.Conv2D(64, 3, activation='relu')(block_3_output)
15 x = layers.GlobalAveragePooling2D()(x)
16 x = layers.Dense(256, activation='relu')(x)
17 x = layers.Dropout(0.5)(x)
18 outputs = layers.Dense(10)(x)
19 
20 model = keras.Model(inputs, outputs, name='toy_resnet')
21 model.summary()

Take a look at its structure:

keras.utils.plot_model(model, 'mini_resnet.png', show_shapes=True)

Then train the model:

 1 (x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
 2 
 3 x_train = x_train.astype('float32') / 255.
 4 x_test = x_test.astype('float32') / 255.
 5 y_train = keras.utils.to_categorical(y_train, 10)
 6 y_test = keras.utils.to_categorical(y_test, 10)
 7 
 8 model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
 9               loss=keras.losses.CategoricalCrossentropy(from_logits=True),
10               metrics=['acc'])
11 
12 model.fit(x_train, y_train,
13           batch_size=64,
14           epochs=1,
15           validation_split=0.2)

Shared layers

When it comes to shared layers, the Siamese network family comes to mind: image classification, object tracking, few-shot object detection, and so on. If you know these models, it is not hard to understand how shared layers are used in Keras:

Declare two inputs and feed both into the same layer to produce two outputs.

 1 # Embedding for 1000 unique words mapped to 128-dimensional vectors
 2 shared_embedding = layers.Embedding(1000, 128)
 3 
 4 # Variable-length sequence of integers
 5 text_input_a = keras.Input(shape=(None,), dtype='int32')
 6 
 7 # Variable-length sequence of integers
 8 text_input_b = keras.Input(shape=(None,), dtype='int32')
 9 
10 # Reuse the same layer to encode both inputs
11 encoded_input_a = shared_embedding(text_input_a)
12 encoded_input_b = shared_embedding(text_input_b)
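
To turn this into a complete model, the shared encodings could be fed into a small prediction head. The sketch below is illustrative: the LSTM size, the sigmoid output, and the pair_model name are assumptions, not part of the tutorial (only the Embedding layer is shared, exactly as above):

# Illustrative head on top of the shared embedding (sizes and names are assumptions)
encoded_a = layers.LSTM(32)(encoded_input_a)
encoded_b = layers.LSTM(32)(encoded_input_b)
merged = layers.concatenate([encoded_a, encoded_b])
score = layers.Dense(1, activation='sigmoid')(merged)
pair_model = keras.Model([text_input_a, text_input_b], score)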

Extending the API: using custom layers

tf.keras contains a wide range of layers, such as:

  • Convolutional layers: Conv1D, Conv2D, Conv3D, Conv2DTranspose
  • Pooling layers: MaxPooling1D, MaxPooling2D, MaxPooling3D, AveragePooling1D
  • RNN layers: GRU, LSTM, ConvLSTM2D
  • BatchNormalization, Dropout, Embedding, etc.

If you cannot find the layer you need among these, you can extend the API with a custom layer. Custom layers inherit from the layers.Layer class and define the build and call methods:

  • The call method defines the forward computation;
  • The build method creates the layer's weights;
 1 class CustomDense(layers.Layer):
 2   def __init__(self, units=32):
 3     super(CustomDense, self).__init__()
 4     self.units = units
 5 
 6   def build(self, input_shape):
 7     self.w = self.add_weight(shape=(input_shape[-1], self.units),
 8                              initializer='random_normal',
 9                              trainable=True)
10     self.b = self.add_weight(shape=(self.units,),
11                              initializer='random_normal',
12                              trainable=True)
13 
14   def call(self, inputs):
15     return tf.matmul(inputs, self.w) + self.b
16 
17 
18 inputs = keras.Input((4,))
19 outputs = CustomDense(10)(inputs)
20 
21 model = keras.Model(inputs, outputs)
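
A quick check that the custom layer behaves like a Dense layer (the random batch is only for illustration):

sample = tf.random.normal((2, 4))
print(model(sample).shape)  # (2, 10): CustomDense(10) maps 4 input features to 10 units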

For serialization support in the custom layer, define a get_config method, which returns the constructor parameters of the layer instance:

 1 class CustomDense(layers.Layer):
 2 
 3   def __init__(self, units=32):
 4     super(CustomDense, self).__init__()
 5     self.units = units
 6 
 7   def build(self, input_shape):
 8     self.w = self.add_weight(shape=(input_shape[-1], self.units),
 9                              initializer='random_normal',
10                              trainable=True)
11     self.b = self.add_weight(shape=(self.units,),
12                              initializer='random_normal',
13                              trainable=True)
14 
15   def call(self, inputs):
16     return tf.matmul(inputs, self.w) + self.b
17 
18   def get_config(self):    # New method: return the constructor arguments
19     return {'units': self.units}
20 
21 
22 inputs = keras.Input((4,))
23 outputs = CustomDense(10)(inputs)
24 
25 model = keras.Model(inputs, outputs)
26 config = model.get_config()   # obtain the config via get_config()
27 
28 new_model = keras.Model.from_config(
29     config, custom_objects={'CustomDense': CustomDense})

 
