HiSilicon development: MobileFaceNet model PyTorch -> ONNX -> Caffe -> NNIE

1, Foreword

Recently, in my spare time, I have been sorting out and writing up previous projects in the hope that they will be useful to others.

2, Details

The face model is trained in PyTorch, based on this project: MobileFaceNet_Tutorial_Pytorch.
After training, the first step is to export an ONNX model and simplify it. The code is as follows:

def export_onnx():
    import argparse
    import os

    import onnx
    import torch
    from onnxsim import simplify  # onnx-simplifier

    from face_model import MobileFaceNet  # model definition from MobileFaceNet_Tutorial_Pytorch

    parser = argparse.ArgumentParser()
    parser.add_argument('--img-size', nargs='+', type=int, default=[112, 112], help='image size')
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    opt = parser.parse_args()
    print(opt)

    # Parameters (raw strings so the Windows backslashes are not treated as escapes)
    f = r"zeng_mobileface_model\zeng_insightface.onnx"
    sim_onnx_path = r"zeng_mobileface_model\zeng_insightface_sim.onnx"
    img = torch.zeros((opt.batch_size, 3, *opt.img_size))  # dummy input, (1, 3, 112, 112)

    # Load pytorch model
    device = "cpu"
    model = MobileFaceNet(512).to(device)  # embedding size is 512 (feature vector)
    model.load_state_dict(
        torch.load(r'F:\demo\Pytorch_demo\MobileFaceNet_Tutorial_Pytorch\zeng_mobileface_model\Iter_455000_model.ckpt',
                   map_location='cpu')['net_state_dict'])
    print('MobileFaceNet face recognition model loaded')

    model.eval()

    _ = model(img)  # dry run
    torch.onnx.export(model, img, f, verbose=False, opset_version=11, input_names=['images'],
                      output_names=['output'])

    # Check onnx model
    model = onnx.load(f)  # load onnx model
    onnx.checker.check_model(model)  # check onnx model
    print(onnx.helper.printable_graph(model.graph))  # print a human readable representation of the graph
    print('Export complete. ONNX model saved to %s\nView with https://github.com/lutzroeder/netron' % f)

    # Simplify, validate, then keep only the simplified model
    model_simp, check = simplify(model)
    assert check, "Simplified ONNX model could not be validated"
    onnx.save(model_simp, sim_onnx_path)
    print("Simplify onnx done!")
    os.remove(f)

1. ONNX conversion

Since PyTorch has a built-in ONNX export interface, exporting should have been easy. Unexpectedly, some strange ops kept appearing after every simplification. After discussing with a friend, it turned out that the original MobileFaceNet network applies an L2-norm normalization just before its output.

Those were the two odd black nodes in the exported graph (figure not included here). At first I thought something was wrong with the ONNX simplification code, but repeated simplification gave the same result. The fix is to comment out the l2_norm call at the end of the forward function and return the raw embedding:

# Excerpt from face_model.py, inside the MobileFaceNet class's forward function
out = self.conv_6_dw(out)
out = self.conv_6_flatten(out)
out = self.linear(out)
out = self.bn(out)

# return l2_norm(out)  # commented out: the L2 normalization moves outside the model
return out  # return the raw embedding instead

Then run the ONNX export code again; this time the output graph is clean.
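
Since l2_norm is no longer part of the forward pass, the embedding must be normalized as a post-processing step after inference. A minimal sketch of that step (assuming the usual unit-length L2 normalization the repo's l2_norm performs):

import numpy as np

def l2_norm_np(feat, axis=1, eps=1e-10):
    # Normalize each embedding to unit length, replacing the l2_norm
    # call that was removed from the model's forward function.
    norm = np.linalg.norm(feat, ord=2, axis=axis, keepdims=True)
    return feat / (norm + eps)

# usage: embedding = l2_norm_np(raw_output)  # raw_output has shape (1, 512)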

2. TypeError: ONNX node of type PRelu is not supported.

This problem, together with the next one, is quite tricky. There are few relevant blog posts about it online (or perhaps my search technique is poor). The closest match is this one: How to make onnx2caffe support the conversion of prelu layer? I suggest reading it first to see which files need to be changed; it took me a whole morning to sort out. The root cause is this op registry:

_ONNX_NODE_REGISTRY = {
    "Conv": _convert_conv,
    "Relu": _convert_relu,
    "PRelu": _convert_prelu,  # I added the one I didn't have before
    "BatchNormalization": _convert_BatchNorm,
    "Add": _convert_Add,
    "Mul": _convert_Mul,
    "Reshape": _convert_Reshape,
    "MaxPool": _convert_pool,
    "AveragePool": _convert_pool,
    "Dropout": _convert_dropout,
    "Gemm": _convert_gemm,
    "MatMul": _convert_matmul,  # I added the one I didn't have before
    "Upsample": _convert_upsample,
    "Concat": _convert_concat,
    "ConvTranspose": _convert_conv_transpose,
    "Sigmoid": _convert_sigmoid,
    "Flatten": _convert_Flatten,
    "Transpose": _convert_Permute,
    "Softmax": _convert_Softmax,
}

Since there is no PRelu op in the registry above, an error is reported. The fix is simple in principle: add the relevant handler functions yourself. One caveat, though: Caffe itself must support these ops, otherwise patching onnx2caffe here just moves the error into Caffe. In the link above, the accepted answer's problem was exactly that their Caffe build did not support PReLU well.
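
Before patching onnx2caffe, it is worth confirming which layer types your Caffe build actually registers. A quick check from pycaffe (assuming the Python bindings are built) might look like this:

import caffe

# layer_type_list() returns the layer types compiled into this Caffe build;
# if "PReLU" is missing here, patching onnx2caffe alone will not help.
supported = set(caffe.layer_type_list())
for layer_type in ("PReLU", "InnerProduct"):
    print(layer_type, "supported:", layer_type in supported)
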
Two specific changes are needed:
(1). In the onnx2caffe source package, open the onnx2caffe subdirectory and then _operators.py. Find the dictionary above and add the line "PRelu": _convert_prelu;

Then, further up in the same _operators.py file, add a handler function. I recommend placing it right below the _convert_relu(node, graph, err) function, which makes it easy to copy from and check against. The implementation is as follows:

# This function only converts the layer name/type and does not copy any parameters, so it basically follows the _convert_relu(node, graph, err) implementation
def _convert_prelu(node, graph, err):
    input_name = str(node.inputs[0])
    output_name = str(node.outputs[0])
    name = str(node.name)

    if input_name == output_name:
        inplace = True
    else:
        inplace = False

    layer = myf("PReLU", name, [input_name], [output_name], in_place=inplace)
    # l_top_relu1 = L.ReLU(l_bottom, name=name, in_place=True)

    graph.channel_dims[output_name] = graph.channel_dims[input_name]

    return layer

(2). In the same directory, open the _weightloader.py file, scroll to the bottom, and add the line "PRelu": _convert_prelu to the op dictionary there;
Then, further up, again below the corresponding _convert_relu function, add a handler as follows:

def _convert_prelu(net, node, graph, err):
    weight = node.input_tensors[node.inputs[1]]

    # Copy the weight to the caffe model. The PReLU slope in onnx is a
    # three-dimensional array, e.g. (64, 1, 1), while in caffe it is
    # one-dimensional, e.g. (64,), so reshape first or caffe will report an error.
    shape = weight.shape
    weight = weight.reshape((shape[0]))
    np.copyto(net.params[node.name][0].data, weight, casting='same_kind')

With that in place, the conversion gets further, but hits a new error, discussed next.

3. TypeError: ONNX node of type MatMul is not supported.

After the explanation above, you should already know how to fix this one. Briefly: in both _operators.py and _weightloader.py, add "MatMul": _convert_matmul to the dictionary at the bottom, then add the following two functions to _operators.py and _weightloader.py respectively:

# Note: this goes in the _operators.py file
def _convert_matmul(node, graph, err):

    node_name = node.name
    input_name = str(node.inputs[0])
    output_name = str(node.outputs[0])
    weight_name = node.inputs[1]
    if weight_name in node.input_tensors:
        W = node.input_tensors[weight_name]
    else:
        err.missing_initializer(node,
                                "MatMul weight tensor: {} not found in the graph initializer".format(weight_name, ))
        return
    b = None
    bias_flag = False
    if len(node.inputs) > 2:
        b = node.input_tensors[node.inputs[2]]

    if len(W.shape) != 2 or (b is not None and len(b.shape) != 1):
        return err.unsupported_op_configuration(node, "MatMul is supported only for inner_product layer")
    if b is not None:
        bias_flag = True
        if W.shape[0] != b.shape[0]:
            return err.unsupported_op_configuration(node,
                                                    "MatMul is supported only for inner_product layer")

    layer = myf("InnerProduct", node_name, [input_name], [output_name], num_output=W.shape[0], bias_term=bias_flag)
    graph.channel_dims[output_name] = W.shape[0]

    return layer
# Note: this goes in the _weightloader.py file
def _convert_matmul(net, node, graph, err):
    node_name = node.name
    weight_name = node.inputs[1]
    if weight_name in node.input_tensors:
        W = node.input_tensors[weight_name]
    else:
        err.missing_initializer(node,
                                "MatMul weight tensor: {} not found in the graph initializer".format(weight_name, ))
    b = None
    if len(node.inputs) > 2:
        b = node.input_tensors[node.inputs[2]]
    if len(W.shape) != 2 or (b is not None and len(b.shape) != 1):
        return err.unsupported_op_configuration(node, "MatMul is supported only for inner_product layer")
    if b is not None:
        if W.shape[0] != b.shape[0]:
            return err.unsupported_op_configuration(node, "MatMul is supported only for inner_product layer")
    net.params[node_name][0].data[...] = W

After that, the conversion runs through without errors and the caffe model files are generated successfully. To verify the converted model's accuracy, I ran the LFW benchmark. The results:

#                             Acc      Threshold (LFW)
pytorch original model:       99.433   0.635
converted caffe model (old):  97.983   0.670   # first conversion result: wrong

Don't celebrate too early: the first conversion result is the second row above, and its accuracy is noticeably worse, which means something must have gone wrong during conversion. So began the painful search for the cause.

4. Inconsistent output results

Pick an image, feed it to the pytorch model, and print the output without the L2-norm normalization. Then process the same image with the same preprocessing code, feed it to the onnx model (for inference code, see this onnx inference demo), and compare. The two outputs show no difference, so the onnx model is fine and the problem lies in the onnx -> caffe step.
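
A minimal sketch of that comparison (assuming onnxruntime is installed; here img is the preprocessed float32 input of shape (1, 3, 112, 112) and torch_out is the un-normalized pytorch output for the same image):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("zeng_mobileface_model/zeng_insightface_sim.onnx",
                            providers=["CPUExecutionProvider"])
# "images" matches the input_names passed to torch.onnx.export above
onnx_out = sess.run(None, {"images": img})[0]

print("max abs diff vs pytorch:", np.abs(onnx_out - torch_out).max())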

Using the same image, feed the caffe model and print the final output: it differs. Print the input data to confirm it is identical: it is, so the input is fine. Next, check intermediate outputs, binary-searching through the network for the problem node. The culprit turns out to be the MatMul layer (honestly, I could have guessed, since I wrote that conversion code myself).
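
pycaffe makes this binary search convenient, since every intermediate blob can be inspected directly. A sketch (the prototxt/caffemodel file names are placeholders, and the blob names are whatever onnx2caffe wrote into the prototxt):

import caffe

net = caffe.Net("zeng_insightface.prototxt", "zeng_insightface.caffemodel", caffe.TEST)
net.blobs["images"].data[...] = img  # same preprocessed input as before
net.forward()

# Print the head of every intermediate blob so each one can be diffed
# against the corresponding onnx node output to find the first divergence.
for name, blob in net.blobs.items():
    print(name, blob.data.shape, blob.data.ravel()[:5])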

Checking the documentation of the onnx MatMul op, it looks perfectly normal: just the ordinary matrix-multiplication op, and the spec itself says the operation behaves like numpy's matmul. Next, check whether the parameter values of the onnx and caffe models match. They do. So what is the problem? Did I miss something?

Reading up on caffe's InnerProduct layer finally reveals the cause: caffe's InnerProduct transposes its weight array before multiplying it with the input data, whereas onnx multiplies directly without transposing, that is:

ONNX :  X * W = output 
Caffe : X * W.T = output'

That is the problem: caffe multiplies by the transposed weight array, so the output is wrong. To compensate, store the transposed weight so that caffe's internal transpose restores the original:

ONNX :  X * W = output 
W' = W.T
Caffe : X * W'.T = X * (W.T).T = output
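
This is easy to verify in numpy: emulate caffe's InnerProduct as X @ stored_W.T, and the result only matches the onnx MatMul when the stored weight is W.T:

import numpy as np

X = np.random.randn(1, 512).astype(np.float32)
W = np.random.randn(512, 512).astype(np.float32)

onnx_out = X @ W        # ONNX MatMul: X * W

wrong = X @ W.T         # weight copied as-is: caffe computes X * W.T
right = X @ (W.T).T     # weight stored transposed: X * (W.T).T == X * W

print(np.allclose(onnx_out, wrong))   # False
print(np.allclose(onnx_out, right))   # True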

So the code needs one more modification. In the onnx2caffe source package, open the onnx2caffe subdirectory, open the _weightloader.py file, find the _convert_matmul function, and change its last line to:

# net.params[node_name][0].data[...] = W  # original line, commented out
net.params[node_name][0].data[...] = W.transpose()  # store W.T; caffe's internal transpose then restores W

Run the LFW evaluation again and compare the results:

#                             Acc      Threshold (LFW)
pytorch original model:       99.433   0.635
converted caffe model (new):  99.433   0.635   # final conversion result: usable

The accuracy now matches the original model exactly; the conversion is finally correct!

3, Summary

Model conversion is full of pitfalls. You have to start from the fundamentals and understand the underlying principles to get it right. Besides, I am still too quick to take things for granted, otherwise I would not have been stuck on this for a whole day.
