[record] Record the whole process of using TensorFlow Serving for the first time

After training a model, it needs to be deployed to the company's CPU servers. To address this kind of problem, Google released TensorFlow (TF) Serving, which aims to solve a series of problems involved in deploying ML models to production.

This article does not introduce TensorFlow Serving itself; it only records the steps of using TF Serving. Since this is my first time using TF Serving, please point out any errors in the text.

1. Use Docker to install TF Serving

First, my local environment:

anaconda2 python2.7

One of the easiest ways to get started with TensorFlow Serving is to use Docker; see TensorFlow's official documentation.

1.1 Installing Docker

In an Ubuntu terminal, run the following commands to install Docker:

$ sudo apt-get update
$ sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$ sudo apt-get update
$ sudo apt-get install docker-ce

To avoid having to prefix every docker command with sudo, run the following commands (log out and back in for the group change to take effect):

$ sudo groupadd docker
$ sudo usermod -aG docker $USER

Then run docker --help to check whether the installation succeeded.

1.2 Installing TF Serving

First, pull the TensorFlow Serving image and clone the GitHub repository:

$ docker pull tensorflow/serving
$ git clone https://github.com/tensorflow/serving

Then start the TensorFlow Serving container (using the REST API port):

$ TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
$ docker run -t --rm -p 8501:8501 \
    -v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
    -e MODEL_NAME=half_plus_two \
    tensorflow/serving &

Here we explain the meaning of each parameter:

-p 8501:8501: maps the container's REST API port to the host
-v: the part before the colon is the absolute path of your model (here, the demo model shipped with TensorFlow Serving); after the colon, /models/ is fixed, and the name half_plus_two can be chosen freely
-e MODEL_NAME=: the name after = must be the same as the name after /models/
tensorflow/serving: the image to use
&: run the container in the background

Open a new terminal and enter the following command in the new terminal:

curl -d '{"instances": [1.0, 2.0, 5.0]}' \
    -X POST http://localhost:8501/v1/models/half_plus_two:predict

If the response => { "predictions": [2.5, 3.0, 4.5] } appears, TF Serving has been installed successfully.
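The same request can also be sent from Python. The sketch below builds the request body and computes the expected values locally: the half_plus_two demo model simply returns y = x / 2 + 2 for each instance, which is where the predictions above come from. (The actual POST is commented out because it assumes the container from the previous step is running on localhost:8501.)

```python
import json

# The half_plus_two demo model computes y = x / 2 + 2 for each instance
instances = [1.0, 2.0, 5.0]
payload = json.dumps({"instances": instances})

expected = [x / 2 + 2 for x in instances]
print(expected)  # [2.5, 3.0, 4.5]

# With the container running, this POST should return the same values:
# import requests
# r = requests.post(
#     'http://localhost:8501/v1/models/half_plus_two:predict',
#     data=payload)
# print(r.json())  # {'predictions': [2.5, 3.0, 4.5]}
```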

2. Use your own trained model

TensorFlow Serving requires the model to be saved in the SavedModel layout.

In other words, your model consists of two parts: a saved_model.pb file and a variables folder. Both are placed under a numbered folder, and this number represents the version of the model.
If your model is not saved in this form, don't worry; you can convert it into this form.

TensorFlow models come in three forms: ckpt, pb, and saved_model; the layout above is the saved_model form.

My model is in ckpt form. My conversion route is from ckpt to pb, and then from pb to saved_model.
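For reference, the SavedModel layout described above looks like this (the version number and file names here follow the standard SavedModel convention):

```
saved_model/
└── 0000001/              <- version number folder
    ├── saved_model.pb
    └── variables/
        ├── variables.data-00000-of-00001
        └── variables.index
```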

2.1 ckpt 2 pb

References for this section: the ckpt model is the CTPN model provided by the GitHub repository mentioned above; for the model conversion code, see issue 328 ("generate pb file") in that repository.

First use the show_ckpt() function to get the ckpt node names and parameters, then use ckpt_to_pb() to convert the ckpt model into a pb model.

import os
import tensorflow as tf
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.framework import graph_util

def show_ckpt():
    # Get ckpt node name and parameters
    checkpoint_path = '../models/checkpoints_mlt/ctpn_50000.ckpt'
    checkpoint_path = os.path.join(checkpoint_path)
    # Read data from checkpoint file
    reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
    var_to_shape_map = reader.get_variable_to_shape_map()
    # Print tensor name and values
    for key in var_to_shape_map:
        print("tensor_name: ", key)
        # print(reader.get_tensor(key))

def ckpt_to_pb():
    checkpoint_path = '../models/checkpoints_mlt/ctpn_50000.ckpt'
    output_graph = './model_pb/ctpn.pb'
    output_node_names = 'model_0/bbox_pred/Reshape_1,model_0/cls_prob' # Two output nodes
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
        saver = tf.train.import_meta_graph(checkpoint_path + '.meta', clear_devices=True)
        saver.restore(sess, checkpoint_path)
        graph = tf.get_default_graph()
        input_graph_def = graph.as_graph_def()
        # Freeze variables into constants so the graph can be saved as a single pb file
        output_graph_def = graph_util.convert_variables_to_constants(
            sess, input_graph_def, output_node_names.split(','))
        with tf.gfile.GFile(output_graph, 'wb') as fw:
            fw.write(output_graph_def.SerializeToString())
        print('{} ops in the final graph.'.format(len(output_graph_def.node)))

if __name__ == '__main__':
    show_ckpt()
    ckpt_to_pb()

2.2 pb 2 saved_model

This part of the code refers to: https://zhuanlan.zhihu.com/p/103131661.

import tensorflow.compat.v1 as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants

export_dir = './model_pb/saved_model/0000003'
graph_pb = './model_pb/ctpn.pb'

builder = tf.saved_model.builder.SavedModelBuilder(export_dir)

with tf.gfile.GFile(graph_pb, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}

with tf.Session(graph=tf.Graph()) as sess:
    # name="" is important to ensure we don't get spurious prefixing
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()

    inp = g.get_tensor_by_name("input_image:0")  # first input node
    input_im_info = tf.placeholder(tf.float32, shape=[None, 3], name='input_im_info')  # second input node
    # The two output nodes
    output_cls_prob = sess.graph.get_tensor_by_name('model_0/cls_prob:0')
    output_box_pred = sess.graph.get_tensor_by_name('model_0/bbox_pred/Reshape_1:0')

    # out = [output_cls_prob, output_box_pred]

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={"in": inp, 'info': input_im_info},
            outputs={"out_cls": output_cls_prob, 'out_box': output_box_pred})

    # Attach the serving tag and signature, then write the SavedModel
    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)

builder.save()


After executing this code, the model is generated, but the variables folder is empty: since the graph was converted from a pb file, everything in it is a constant and there are no variables. That doesn't matter; it does not affect use.

Now let's try starting TF Serving in Docker and see the effect. Enter the following command in the terminal:

docker run -t --rm -p 8501:8501 -v /your_path/model_pb/saved_model:/models/ctpn_test_model -e MODEL_NAME=ctpn_test_model tensorflow/serving

Notes: ① When there are multiple versions under saved_model, TF Serving automatically selects the model with the largest version number. ② The model path must be an absolute path. ③ There must be a version-number folder under the saved_model folder, and the model must be placed inside it; otherwise an error will be reported.
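Point ① can be sketched in plain Python: by default, TF Serving effectively treats each subdirectory name as an integer version and serves the largest one. This is a simplified illustration of that policy, not Serving's actual code:

```python
def pick_served_version(version_dirs):
    """Return the directory name with the largest integer version,
    mimicking TF Serving's default 'serve the latest version' policy."""
    return max(version_dirs, key=int)

# With these version folders present, 0000003 would be served:
print(pick_served_version(['0000001', '0000002', '0000003']))  # 0000003
```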

The terminal output shows the server starting and loading the model (the original screenshot is not included here).

Now you can enter http://localhost:8501/v1/models/ctpn_test_model in the browser address bar to check the model status:

Entering http://localhost:8501/v1/models/ctpn_test_model/metadata lets you inspect the model's signature (inputs and outputs):

3. Pass parameters to docker TF Serving for prediction

I cannot find the link for the code referenced in this section. If you are the original author, please contact me and I will add a reference.
The model I use is the CTPN model. As you can see from the code above, the model accepts two inputs, 'input_image' and 'input_im_info', and produces two outputs, 'model_0/cls_prob' and 'model_0/bbox_pred/Reshape_1'.

import cv2
import json
import numpy as np
import requests

def resize_image(img):
    img_size = img.shape
    im_size_min = np.min(img_size[0:2])
    im_size_max = np.max(img_size[0:2])

    im_scale = float(600) * 1.0 / float(im_size_min)
    if np.round(im_scale * im_size_max) > 1000:
        im_scale = float(1000) * 1.0 / float(im_size_max)
    new_h = int(img_size[0] * im_scale)
    new_w = int(img_size[1] * im_scale)

    # round height and width up to a multiple of 16
    new_h = new_h if new_h % 16 == 0 else (new_h // 16 + 1) * 16
    new_w = new_w if new_w % 16 == 0 else (new_w // 16 + 1) * 16

    re_im = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    return re_im, (new_h * 1.0 / img_size[0], new_w * 1.0 / img_size[1])

# Get input data
file_path = '../main/data/demo/business_license_mask.jpeg'
im = cv2.imread(file_path)[:, :, ::-1]
img, (rh, rw) = resize_image(im)
img = np.expand_dims(img, axis=0)
print('img.shape:', img.shape)

img = img.astype('float16')
n, h, w, c = img.shape

im_info = np.array([h, w, c]).reshape([1, 3])

payload = {
    "inputs": {'info': im_info.tolist(), 'in': img.tolist()}
}

# sending post request to TensorFlow Serving server
r = requests.post('http://localhost:8501/v1/models/ctpn_test_model:predict', json=payload)
pred = json.loads(r.content.decode('utf-8'))


jsObj = json.dumps(pred)

with open('./pred.json', 'w') as fileObject:
    fileObject.write(jsObj)

Run the code. My first run returned an error; you can check ./pred.json for the specific cause. The last run was successful, and the results are saved in ./pred.json, which you can inspect yourself.
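For reference, when the signature has named outputs (as ours does), the :predict endpoint returns them under an "outputs" key. A minimal sketch of unpacking such a response into numpy arrays; the sample values below are made up and only illustrate the shape of the JSON:

```python
import json
import numpy as np

# A made-up response in the shape returned by the :predict endpoint
sample_response = json.dumps({
    "outputs": {
        "out_cls": [[0.1, 0.9], [0.8, 0.2]],
        "out_box": [[1.0, 2.0, 3.0, 4.0]]
    }
})

pred = json.loads(sample_response)
cls_prob = np.array(pred["outputs"]["out_cls"])
box_pred = np.array(pred["outputs"]["out_box"])
print(cls_prob.shape, box_pred.shape)  # (2, 2) (1, 4)
```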

So far, the whole process of using tensorflow serving for the first time has ended.

Tags: TensorFlow

Posted by diddy1234 on Wed, 18 May 2022 10:59:17 +0300