Using Docker to run Hugging Face models

This article shows how to use Docker to quickly run interesting models from Hugging Face locally, with less code and less time than the original projects require.

If you are familiar with Python, most model projects can be deployed and run locally in about 10 minutes.

Preface

To make the results easy to show, I chose an image processing model. Before getting into the details, let's look at what this model project actually does.

The AI model used for the image processing above comes from Hugging Face. As Hugging Face has grown in popularity, more and more interesting models and datasets have appeared on the platform. At present, there are more than 45,000 models alone.

These models share an interesting trait: they run well on the cloud platform, but running them locally is a struggle. The GitHub repositories associated with these projects regularly contain user feedback along the lines of "I can't run this model and code locally; the runtime environment and calling code are too much trouble."

In fact, in daily work and study we often run into the same situation: many models run well in the "cloud" but won't run locally. The cause may be differences in the operating system or CPU architecture (x86 / ARM), a Python runtime version that is too high or too low, a wrong package version installed by pip, or lengthy sample code with a pile of things written into it.
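When a model works in one place and fails in another, a useful first step is to record which of these variables actually differ. The following is a minimal sketch using only the Python standard library; the function name `describe_environment` is my own choice for illustration and does not come from any of the projects discussed here.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect the facts that most often differ between cloud and local runs."""
    return {
        "os": platform.system(),              # e.g. Linux, Darwin, Windows
        "arch": platform.machine(),           # e.g. x86_64, arm64, aarch64
        "python": platform.python_version(),  # major.minor.patchlevel
        "executable": sys.executable,         # which interpreter is running
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the script on both machines and comparing the output narrows down whether the problem is the architecture, the interpreter, or something else such as package versions.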

So, is there a way around these time-wasting problems?

After some trial and error, I settled on a relatively reliable approach: use a Docker container together with Towhee to build a one-click runtime environment for the model.

Take the model shown at the beginning of this article. If we want to call it quickly to restore our own pictures, it is really not difficult: all we need is one docker run command and 20 to 30 lines of Python code.

Next, I will use the open source GFPGAN model from Tencent ARC Lab as an example of how to quickly run a publicly available model.

Because the model is based on PyTorch, this article first covers how to build a general-purpose Docker base image for PyTorch-based models. If readers are interested, I will cover other model frameworks later.

Building a general Docker base image for PyTorch models

I have uploaded the complete sample code for this section to GitHub, so interested readers can grab it there. To save even more effort, you can also directly use the image I have already built as your base image.

If you are interested in how the base image is put together, keep reading this section. If you only care about running the model quickly, skip to the next section.

Back to business. I suggest that readers who want to quickly reproduce a model locally use containers, for three reasons:

  1. To avoid interference (pollution) between the environments of different projects
  2. To keep project dependencies explicit, so that anyone can reproduce the results on any device
  3. To lower the time cost of reproducing a model, if you don't enjoy the 80% of repetitive work that lies outside model tuning (especially environment setup and basic configuration)

With the advantages of containers in mind, let's look at how to write the Dockerfile for such a base image, and the thinking behind it:

Considering that the model may need to run on both x86 and ARM devices, I recommend miniconda3, a Debian-based base image with the conda toolkit built in.

FROM continuumio/miniconda3:4.11.0

For the base environment image, I recommend using a specific version number instead of latest. This keeps your container stable and reduces "unexpected surprises" when you need to rebuild. If you have specific version requirements, you can look for an image version that suits you better. This article won't go into conda and miniconda; interested readers can find more information in the official repository. If there is demand, I will write a more detailed article about them.
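The advice about pinning versions is easy to check mechanically. Here is a small sketch, not part of the original project, that scans Dockerfile text and reports base images that have no tag or use the latest tag. The function name `unpinned_base_images` is hypothetical, and the parsing is deliberately simple: it looks only at FROM lines and treats images pinned by digest as pinned.

```python
def unpinned_base_images(dockerfile_text: str) -> list:
    """Return base images in FROM lines that have no tag, or use ':latest'."""
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip options such as --platform=... that may precede the image name.
        image = next((p for p in parts[1:] if not p.startswith("--")), None)
        if image is None:
            continue
        # Images pinned by digest (name@sha256:...) count as pinned.
        if "@" in image:
            continue
        # Only look for ':' after the last '/', so that a registry port such as
        # localhost:5000/img is not mistaken for a tag.
        name_part = image.rsplit("/", 1)[-1]
        tag = name_part.split(":", 1)[1] if ":" in name_part else None
        if tag is None or tag == "latest":
            unpinned.append(image)
    return unpinned


print(unpinned_base_images("FROM continuumio/miniconda3:4.11.0"))  # []
print(unpinned_base_images("FROM continuumio/miniconda3"))
```

The first call prints an empty list because the image is pinned to 4.11.0; the second reports the untagged image.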

Because OpenGL APIs are often used, we need to install the libgl1-mesa-glx package in the base image. To learn more about this package, see the documentation in the official Debian package repository. To reduce installation time, I switched the software source to the "Tsinghua source", a mirror in mainland China.

RUN sed -i -e "s/" /etc/apt/sources.list && \
    sed -i -e "s/" /etc/apt/sources.list && \
    apt update
RUN apt install -y libgl1-mesa-glx

After installing the basic system dependency libraries, we can start preparing the model's runtime environment. Take the PyTorch installation as an example:

RUN pip config set global.index-url
RUN conda install -y pytorch

Similarly, to save time downloading Python packages from PyPI, I also switched the download source to the "Tsinghua source". After the conda install -y pytorch command has run, our basic runtime environment is ready.

Network environments differ, so here are some other commonly used mirror sources in China. You can adjust the package download source to suit your own situation and get faster downloads.

# Tsinghua source
# Alibaba Cloud
# Baidu
# University of Science and Technology of China
# Douban

The steps above download nearly 200MB of packages (conda 14MB, pytorch 44MB, mkl 140MB), so some patience is needed.
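The arithmetic behind "nearly 200MB" and a rough waiting time can be worked out in a few lines. The package sizes are the ones quoted above; the 20 Mbit/s bandwidth is purely an assumed figure for illustration, so substitute your own.

```python
# Package sizes in MB, as quoted in the text.
package_sizes_mb = {"conda": 14, "pytorch": 44, "mkl": 140}
total_mb = sum(package_sizes_mb.values())


def download_seconds(size_mb: float, bandwidth_mbit_per_s: float) -> float:
    """Time to download size_mb megabytes at the given megabits per second."""
    return size_mb * 8 / bandwidth_mbit_per_s


print(total_mb)                        # 198
print(download_seconds(total_mb, 20))  # 79.2 seconds at an assumed 20 Mbit/s
```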

To make our base image compatible with both x86 and ARM, besides the basic environment installation above we also need to pin the torch and torchvision versions. This has been discussed in the PyTorch community.

RUN pip3 install --upgrade torch==1.9.0 torchvision==0.10.0

The command above replaces torch with the specified version. When actually building the image, this downloads about 800MB of additional data. Even with a mirror source in China, this can take a while. Consider grabbing a can of ice-cold cola from the fridge to ease the wait. 🥤
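Once the image is built, it can be worth confirming that the pinned versions are the ones that actually ended up installed. The sketch below uses the standard library's importlib.metadata; the helper name `check_pins` is hypothetical, and stripping a local suffix such as +cpu before comparing is my own assumption about how builds may be labelled.

```python
from importlib import metadata

# Versions pinned by the pip3 install line in the Dockerfile.
EXPECTED = {"torch": "1.9.0", "torchvision": "0.10.0"}


def check_pins(expected: dict) -> dict:
    """Map each package to 'ok', 'missing', or the mismatching installed version."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Builds may carry a local suffix such as '+cpu'; compare the public part.
        public = installed.split("+", 1)[0]
        report[name] = "ok" if public == wanted else installed
    return report


if __name__ == "__main__":
    print(check_pins(EXPECTED))
```

Run inside the container, this should report ok for both packages if the pins took effect.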

With the dependencies above taken care of, we come to the last step of building the image. To make it easier to run various PyTorch models later, I recommend installing Towhee directly in the base image:

RUN pip install towhee

That completes the Dockerfile for a general Docker base image for PyTorch-based models. For ease of reading, here is the complete file:

FROM continuumio/miniconda3:4.11.0

RUN sed -i -e "s/" /etc/apt/sources.list && \
    sed -i -e "s/" /etc/apt/sources.list && \
    apt update
RUN apt install -y libgl1-mesa-glx

RUN pip config set global.index-url
RUN conda install -y pytorch

RUN pip3 install --upgrade torch==1.9.0 torchvision==0.10.0

RUN pip install towhee

After saving the content above as Dockerfile, run docker build -t soulteary/docker-pytorch-playground . and when the command finishes, our PyTorch base image will be built.

If you don't want to spend time building, you can use the base image I have already built (it automatically distinguishes between x86 and ARM devices) and download it directly from DockerHub:

# You can download the latest version directly
docker pull soulteary/docker-pytorch-playground
# You can also use an image with a specific version
docker pull soulteary/docker-pytorch-playground:2022.05.19
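With a multi-architecture image, Docker selects the variant that matches the host. To see which variant your machine would get, the following sketch maps the architecture name reported by Python to the platform string Docker uses. The mapping table and the function name `docker_platform` are my own illustration and cover only the two architectures this article is concerned with.

```python
import platform

# Common spellings of CPU architectures mapped to Docker platform strings.
_DOCKER_PLATFORMS = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "arm64": "linux/arm64",
    "aarch64": "linux/arm64",
}


def docker_platform(machine=None):
    """Return the Docker platform string for a machine name, or None if unknown."""
    machine = (machine or platform.machine()).lower()
    return _DOCKER_PLATFORMS.get(machine)


print(docker_platform("x86_64"))  # linux/amd64
print(docker_platform("arm64"))   # linux/arm64
```

The resulting string is the kind of value that docker pull accepts through its --platform option if you ever need to fetch a variant other than the host's.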

With the base image done, we can move on to the runtime environment and program for the specific model mentioned above.

Writing a model caller in Python

The official usage example can be found in the GFPGAN project. The original file is relatively long, about 155 lines, so I won't post it here.

As I mentioned in the previous section, we can use Towhee to "be lazy". For example, we can shorten the sample code to about 30 lines and add a small extra feature: scan all the pictures in the working directory, hand them to the model for processing, and finally generate a static page that shows the pictures before and after processing side by side.

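import warnings
warnings.warn('The unoptimized RealESRGAN is very slow on CPU. We do not use it. '
              'If you really want to use it, please modify the corresponding codes.')

from gfpgan import GFPGANer
import towhee

# Register the class as a Towhee operator so it can be called in the chain below.
# (Assumed: the decorator is missing from this copy of the code.)
@towhee.register
class GFPGANerOp:

    # Defaults are assumptions: /GFPGAN.pth is where the Dockerfile copies the
    # model; the other values follow GFPGANer's own defaults.
    def __init__(self,
                 model_path='/GFPGAN.pth',
                 upscale=2,
                 arch='clean',
                 channel_multiplier=2,
                 bg_upsampler=None) -> None:
        self._restorer = GFPGANer(model_path, upscale, arch, channel_multiplier, bg_upsampler)

    def __call__(self, img):
        cropped_faces, restored_faces, restored_img = self._restorer.enhance(
            img, has_aligned=False, only_center_face=False, paste_back=True)

        # Return the first restored face with the channel order reversed.
        return restored_faces[0][:, :, ::-1]

# Scan the working directory, restore each picture, and show before and after.
# (The glob and GFPGANerOp steps are reconstructed; only image_load and show
# survive in this copy.)
(
    towhee.glob['path']('*.jpg')
        .image_load['path', 'img']()
        .GFPGANerOp['img', 'face']()
        .show(formatter=dict(img='image', face='image'))
)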

If you remove the warnings above, which are kept only to preserve the "original flavor", the code gets even shorter. Save the content above as app.py; we'll use it later.
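To make the data flow of app.py easier to follow, here is the same scan, process, report pattern written with the standard library only. This is an illustration of the idea, not the Towhee implementation: `build_report` and `fake_restore` are hypothetical names, and `fake_restore` merely copies the file where the real program would call the model.

```python
import html
from pathlib import Path


def build_report(directory, process, out_name="result.html"):
    """Run `process` on every .jpg in `directory` and write a before/after page."""
    directory = Path(directory)
    rows = []
    for src in sorted(directory.glob("*.jpg")):
        # `process` stands in for the model call; it returns the output path.
        dst = process(src)
        rows.append(
            "<tr><td><img src='{0}'></td><td><img src='{1}'></td></tr>".format(
                html.escape(src.name, quote=True),
                html.escape(Path(dst).name, quote=True),
            )
        )
    page = "<table><tr><th>img</th><th>face</th></tr>{}</table>".format("".join(rows))
    out_path = directory / out_name
    out_path.write_text(page, encoding="utf-8")
    return out_path


def fake_restore(src):
    """Hypothetical stand-in for the model: copies the file next to the original."""
    dst = src.with_name(src.stem + ".restored.png")
    dst.write_bytes(src.read_bytes())
    return dst
```

Calling build_report("data", fake_restore) writes data/result.html with one table row per picture. The outputs use a different extension so that a second run does not pick them up as inputs.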

Now that the program for calling the model is done, let's look at how to build the application container image needed to run the specific model (GFPGAN).

Building the application image for a specific model

I have also uploaded the complete code for this part to GitHub for your convenience. A matching pre-built image is available as well.

Back to business. With the base image described above, day-to-day experimenting only requires some fine-tuning of image dependencies for each model.

Let's take a look at how to customize the application image for the GFPGAN project mentioned above.

Again starting with the Dockerfile, we first declare that the application image we are building is based on the base image above.

FROM soulteary/docker-pytorch-playground:2022.05.19

The advantage is that in day-to-day use we save a lot of image build time and local disk space, because images built on the same base share its layers. Containers for large models benefit especially from this Docker feature.

Next, we need to place the model files we want to use in the application image to be made, and complete the supplementary download of relevant Python dependencies.

Downloading models from Hugging Face and GitHub is slow on networks in mainland China and prone to interruption, so I recommend downloading the required models in advance when building the application image. During the build, you can then place the model in the appropriate directory. Whether the model is packaged into the image or mounted dynamically at run time is up to you; either works.

The GFPGAN project relies on two model files. One is a face detection model based on ResNet50 that the project uses; the other is the GFPGAN generative adversarial network model for image restoration, the "protagonist" in the traditional sense.

The first model file, detection_Resnet50_Final.pth, can be downloaded in advance. The second model must be chosen according to your own hardware:

After placing the downloaded model files in the same directory as the new Dockerfile, we continue filling in the Dockerfile to install the project dependencies and place the models in the appropriate directories in the container:

# Install model related code base
RUN pip install gfpgan realesrgan
# Copy the model downloaded in advance to the specified location to avoid accidents in the process of building the image
COPY detection_Resnet50_Final.pth /opt/conda/lib/python3.9/site-packages/facexlib/weights/detection_Resnet50_Final.pth

# Choose a model file according to the model version you downloaded
COPY GFPGANCleanv1-NoCE-C2.pth /GFPGAN.pth
# COPY GFPGANCleanv1-NoCE-C2_original.pth /GFPGAN.pth
# COPY GFPGANv1.3.pth /GFPGAN.pth

Besides gfpgan, I also installed realesrgan. This package makes the background outside the face look better and more natural in the processed picture.
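Because the COPY lines above fail the build when a source file is absent, a quick pre-flight check before running docker build can save a wasted attempt. The sketch below is my own addition: the file names are taken from the COPY lines, the helper name `missing_model_files` is hypothetical, and treating a zero-byte file as missing is an assumption about what an interrupted download can leave behind.

```python
from pathlib import Path

# File names taken from the COPY lines of the application Dockerfile.
REQUIRED_FILES = ["detection_Resnet50_Final.pth", "GFPGANCleanv1-NoCE-C2.pth"]


def missing_model_files(build_dir, required=REQUIRED_FILES):
    """Return required files that are absent or empty in the build directory."""
    build_dir = Path(build_dir)
    missing = []
    for name in required:
        path = build_dir / name
        # Assumption: an interrupted download can leave a zero-byte file behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_model_files("."))
```

An empty list means both files are in place next to the Dockerfile.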

With the basic dependencies and models configured, the last step is some simple finishing work:

# Copy the program calling the model saved in the previous step to the image

# Declare a clean working directory
# Here we can consider directly throwing the data set we want to test into the container
# You can also consider dynamic mounting during operation
# COPY imgs/*.jpg ./

# Supplement other dependencies required for the installation of some projects
RUN pip install IPython pandas
# Because Towhee currently only supports direct display of model results
# Saving display results as files is not supported yet
# So we need to make a small patch to make it support this function
RUN sed -i -e "s/display(HTML(table))/with open('result.html', 'w') as file:\n            file.write(HTML(table).data)/" /opt/conda/lib/python3.9/site-packages/towhee/functional/mixins/
CMD ["python3", "/"]

I added plenty of comments in the code above to explain each step, so I won't repeat them here. A note on the design: placing app.py in the / root directory instead of the working directory makes the program easier to use, because I plan to use the working directory for reading images and storing the processing results. Finally, the container uses CMD instead of ENTRYPOINT for the default command, which makes it easier for users to call commands directly or enter the container for debugging.
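The sed command in the Dockerfile is dense, so here is what its substitution does, expressed in Python on a sample line. The search and replacement strings are copied from the sed expression; the 8-space indentation of the sample line is my assumption about the file being patched, inferred from the 12 spaces the replacement uses for the nested line.

```python
OLD = "display(HTML(table))"
NEW = "with open('result.html', 'w') as file:\n            file.write(HTML(table).data)"


def patch_source(source: str) -> str:
    """Replace the first display(...) call on each line, as the sed command does."""
    return "\n".join(line.replace(OLD, NEW, 1) for line in source.split("\n"))


sample = "        display(HTML(table))"
print(patch_source(sample))
```

The single display call becomes a two-line with block, so the rendered table is written to result.html in the working directory instead of being shown in a notebook.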

Again, for ease of reading, here is the combined Dockerfile:

FROM soulteary/docker-pytorch-playground:2022.05.19

RUN pip install gfpgan realesrgan
COPY detection_Resnet50_Final.pth /opt/conda/lib/python3.9/site-packages/facexlib/weights/detection_Resnet50_Final.pth

# For larger model files, you can choose to mount them
# Instead of directly COPY to the inside of the container here
COPY GFPGANCleanv1-NoCE-C2.pth /GFPGAN.pth

RUN pip install IPython pandas
RUN sed -i -e "s/display(HTML(table))/with open('result.html', 'w') as file:\n            file.write(HTML(table).data)/" /opt/conda/lib/python3.9/site-packages/towhee/functional/mixins/
CMD ["python3", "/"]

After saving the content above as Dockerfile, we run the following command to build the application image:

docker build -t pytorch-playground-gfpgan -f Dockerfile .

After a moment, we will get an application image containing the model and the program running the model.

Next, let's see how to use this image to get the model running results at the beginning of the article.

Using the model application image

If you downloaded the model file in the previous step and packaged it into the image, all that remains is to download some black-and-white or color pictures containing faces (choose according to the model), put them in a directory (such as a data directory), and run a single command to call the model:

docker run --rm -it -v `pwd`/data:/data soulteary/docker-gfpgan 

If you don't want to hunt for pictures, you can use the sample pictures I prepared in the project.

The above covers the case where the application image contains the model. Now let's see what to do when it does not.

If we chose not to package the GFPGAN model into the image when building the application image, we need to mount the model file when running. To keep the project structure clear, I created a directory called model in the project to store the model file mentioned above.

The complete directory structure is similar to the following:

├── data
│   ├── Audrey\ Hepburn.jpg
│   ├── Bruce\ Lee.jpg
│   ├── Edison.jpg
│   ├── Einstein.jpg
│   └── Lu\ Xun.jpg
└── model
    └── GFPGANCleanv1-NoCE-C2.pth

With the model and pictures prepared, we again run a simple command that mounts the files into the container and lets the model work its "magic":

docker run --rm -it -v `pwd`/model/GFPGANCleanv1-NoCE-C2.pth:/GFPGAN.pth -v `pwd`/data:/data soulteary/docker-gfpgan 
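If you prefer to launch the container from a script, the two docker run variants can be built from one helper. This is a sketch under my own naming (`docker_run_args` is hypothetical); the image name, mount targets and model file name are the ones used in the commands above. Resolving the project directory to an absolute path sidesteps the quoting problems that unquoted `pwd` can cause when the path contains spaces.

```python
import shlex
from pathlib import Path


def docker_run_args(workdir, image="soulteary/docker-gfpgan", model_file=None):
    """Build the argv for docker run; mount the model only when one is given."""
    workdir = Path(workdir).resolve()
    args = ["docker", "run", "--rm", "-it"]
    if model_file is not None:
        args += ["-v", f"{workdir / 'model' / model_file}:/GFPGAN.pth"]
    args += ["-v", f"{workdir / 'data'}:/data", image]
    return args


if __name__ == "__main__":
    argv = docker_run_args(".", model_file="GFPGANCleanv1-NoCE-C2.pth")
    print(shlex.join(argv))
    # To actually start the container: subprocess.run(argv, check=True)
```

Leaving model_file unset produces the shorter command for an image that already contains the model.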

After the command finishes, a result.html file appears in the data directory, recording the pictures before and after model processing. Open it in a browser to see the results.

That covers how to quickly package PyTorch models into Docker images. If I get the chance, I will write about how to further optimize performance based on these images, and about packaging models built on frameworks other than PyTorch.


Finally, I want to thank two good friends, @Hou Jie and @Guo Rentong, core developers of the Towhee project, for their help with this article. They took care of the model calling code, which was very troublesome for a Python novice like me, even though it is only a few lines.

In upcoming related articles, I plan to cover model training and inference on M1 devices, and continue with some more interesting AI projects.


This article is licensed under the "Attribution 4.0 International (CC BY 4.0)" license. You are welcome to reprint, modify, or reuse it, but you must indicate the source.

Author: Su Yang

Created on: May 20, 2022

Tags: Python Docker Pytorch

Posted by besa on Sat, 21 May 2022 01:07:42 +0300