darknet command interpretation

 

https://blog.csdn.net/u010122972/article/details/83541978

  • Advantages
    Darknet is a relatively small deep learning framework. It has no large community and is maintained mainly by the author's team, so promotion is weak and few people use it. With limited maintainers, its features are not as rich as those of TensorFlow and other frameworks, but it still has some distinctive advantages:
    1. Easy to install: select the optional components you need (CUDA, cuDNN, OpenCV, etc.) in the Makefile and run make; installation finishes in a few minutes;
    2. No dependencies: the whole framework is written in C and need not depend on any external library; the author even wrote replacements for the OpenCV functions it uses;
    3. Clear structure, source easy to read and modify: the framework's basic files are in the src folder, and the detection and classification functions are defined in the examples folder, so you can inspect and modify the source directly as needed;
    4. Friendly Python interface: although darknet is written in C, it also provides a Python interface, so you can call a trained .weights model directly from Python (see the sketch after this list);
    5. Easy to port: the framework is very simple to deploy on a local machine and can use the CPU or GPU depending on what the machine offers; local deployment of detection and recognition tasks with darknet is especially convenient.
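
    As a taste of the Python interface, here is a minimal sketch based on the python/darknet.py bundled with the repo; the helper names (load_net, load_meta, detect) and the cfg/weights paths are assumptions taken from the stock repo, and libdarknet.so must already be built:

from darknet import load_net, load_meta, detect  # python/darknet.py from the repo

# load the model definition, the trained weights, and the .data metadata
net = load_net(b"cfg/yolov3.cfg", b"yolov3.weights", 0)
meta = load_meta(b"cfg/coco.data")
# detect returns a list of (class_name, confidence, (x, y, w, h)) tuples
print(detect(net, meta, b"data/dog.jpg"))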

  • Code structure
    The following figure shows the folder layout after downloading and unpacking the darknet source code:

    1. The cfg folder contains the architectures of some models; each cfg file is similar to caffe's prototxt file and defines the structure of a whole model;
    2. The data folder holds some label files, such as the class names of coco9k, and some sample images (this folder mainly serves the demos, or is useful when training directly on the corresponding datasets such as COCO; if you train on your own data you need nothing from it);
    3. The src folder contains the lowest-level framework definition files; the most basic functions, such as the definitions of all layers, all live here. This folder can be regarded as the source code of the framework;
    4. The examples folder holds the higher-level functions, such as the detection and recognition functions, which directly call the underlying functions; the functions in examples are the ones we use most often;
    5. The include folder, as the name suggests, is where the header files are stored;
    6. The python folder shows how to call a model from Python, mostly in darknet.py. Calling from Python also requires the dynamic library libdarknet.so, which is introduced later;
    7. The scripts folder contains some scripts, e.g. for downloading the COCO dataset or converting VOC-format data into the format required for training;
    8. Apart from the LICENSE file, what remains is the Makefile. As shown in the figure below, there are some options at its beginning; set the ones you need to 1.

  • Installation
    1. Open the Makefile and set the required options to 1; as shown in the figure, GPU and CUDNN are enabled here (the relevant Makefile lines are excerpted after this list).

    2. Open a terminal, enter the root directory of the darknet folder, type make, and compilation starts.
    3. Compilation finishes in a few minutes, after which the folder contains new files and directories. The obj folder stores the .o files produced during compilation, and a few other empty folders need little attention. The three important products are: the executable named darknet, the static library libdarknet.a, and the dynamic library libdarknet.so. If you just want to call models locally, run the darknet executable directly. If you need to call darknet from your own code, use libdarknet.so; this dynamic library contains only the basic framework functions defined in src, not the high-level functions in examples, so you have to define your own detection functions when calling it.
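
    For reference, the option block at the top of the stock Makefile looks like this (excerpted from the upstream repo; here GPU and CUDNN are set to 1 as described above):

GPU=1
CUDNN=1
OPENCV=0
OPENMP=0
DEBUG=0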

  • Testing
    Run the following command:

 

./darknet detector test data/detect.data data/yolov3.cfg data/yolov3.weights

 


 

Here ./darknet is the executable produced by compilation. Running it first enters the main function in darknet.c under the examples folder; that function expects predefined arguments, and detector is one such argument, as the following code shows:

 

else if (0 == strcmp(argv[1], "detector")){
        run_detector(argc, argv);
}

 


 

The detector argument makes main call run_detector, which lives in detector.c under the examples folder; run_detector then decides, from the next predefined argument, whether to call the detection, training, or validation function:

 

if(0==strcmp(argv[2], "test")) test_detector(datacfg, cfg, weights, filename, thresh, hier_thresh, outfile, fullscreen);
else if(0==strcmp(argv[2], "train")) train_detector(datacfg, cfg, weights, gpus, ngpus, clear);
else if(0==strcmp(argv[2], "valid")) validate_detector(datacfg, cfg, weights, outfile);
else if(0==strcmp(argv[2], "valid2")) validate_detector_flip(datacfg, cfg, weights, outfile);
else if(0==strcmp(argv[2], "recall")) validate_detector_recall(cfg, weights);
else if(0==strcmp(argv[2], "demo")) 

 


 

test runs detection, train runs training, valid runs validation, recall measures the recall rate, and demo calls the camera for real-time detection.

 

The last three arguments of the command are the files required at run time. The .data file records the number of classes and the class names the model detects, for example:

 

classes= 1
train  = /media/seven/yolov3/data/plate2/train.list
#valid = data/coco_val_5k.list
names = data/plate/plate.names
backup = /media/seven/yolov3/data/plate2/models
#eval=coco

 


 

classes is the number of detection categories, train is the list of training images used during training, valid is the list of the validation set, names is the file holding the names of the detection categories, and backup is the path where models are stored during training.

 

The .cfg file defines the model structure, and the .weights file is the trained model weights being called.

 

Run the above command and you will get the following prompt on the terminal:

 

Enter Image Path: 

 


 

Type the path of an image directly into the terminal to detect it; a result image named predictions.png is generated in the darknet root directory, as shown in the figure:

 

  • Classification
    Classification works much like detection; the command is:

 

./darknet classifier predict classify.data classify.cfg classify.weights

 


 

The flow is the same as for detection: ./darknet runs the compiled executable, the main function in darknet.c under the examples folder dispatches on the classifier argument, and run_classifier in classifier.c takes over:

 

else if (0 == strcmp(argv[1], "classifier")){
        run_classifier(argc, argv);
}

 


 

run_classifier then dispatches on the next argument; predict calls the predict_classifier function:

 

 if(0==strcmp(argv[2], "predict")) predict_classifier(data, cfg, weights, filename, top);
 else if(0==strcmp(argv[2], "fout")) file_output_classifier(data, cfg, weights, filename);
 else if(0==strcmp(argv[2], "try")) try_classifier(data, cfg, weights, filename, atoi(layer_s));
 else if(0==strcmp(argv[2], "train")) train_classifier(data, cfg, weights, gpus, ngpus, clear);
 else if(0==strcmp(argv[2], "demo")) demo_classifier(data, cfg, weights, cam_index, filename);
 ...

 


 

Here classify.data, classify.cfg, and classify.weights are, respectively, the .data file, the model-definition .cfg file, and the model weights file for the classification task.

 

  • Training

 

  1. Training a detection model:
    (1) Data preparation:
    First, convert the ground truth (gt) of your data into the format darknet requires. If your gt is VOC-format xml, you can convert it with the following script (a usage sketch follows the code):

 

import xml.etree.ElementTree as ET

classes = ["plate"]  # change to your own class names

def convert(size, box):
    # size is (image_width, image_height); box is (xmin, xmax, ymin, ymax)
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0   # center x of the box
    y = (box[2] + box[3])/2.0   # center y of the box
    w = box[1] - box[0]         # box width
    h = box[3] - box[2]         # box height
    # normalize all four values by the image dimensions
    return (x*dw, y*dh, w*dw, h*dh)

def convert_annotation(xml_path, txt_save_path):
    in_file = open(xml_path)             # the xml file corresponding to the image
    out_file = open(txt_save_path, 'w')  # full path of the txt produced from this xml
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')             # the size tag
    w = int(size.find('width').text)     # image width from the size tag
    h = int(size.find('height').text)    # image height from the size tag

    for obj in root.iter('object'):
        cls = obj.find('name').text
        if cls not in classes:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')  # the bounding-box tag, processed as in yolo's own code
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()
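
For example, to convert a whole folder of VOC xml files (a minimal driver sketch of my own; the directory layout is an assumption):

import os

xml_dir = "data/plate/annotations"   # hypothetical input folder of VOC xml files
txt_dir = "data/plate/labels"        # hypothetical output folder for darknet txt files
os.makedirs(txt_dir, exist_ok=True)
for name in os.listdir(xml_dir):
    if name.endswith(".xml"):
        convert_annotation(os.path.join(xml_dir, name),
                           os.path.join(txt_dir, name[:-4] + ".txt"))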

 


 

In the script you pass xml_path and txt_save_path for your own files.
As the code shows, for an object's position (x_min, x_max, y_min, y_max), it first computes the center point (center_x, center_y) and the box width and height (width_rect, height_rect), then divides all four values by the image width and height to normalize them. If your data is not in VOC format, you can apply similar processing following this idea.
If the data is in VOC format, you can also refer to my earlier blog post, "darknet uses its own data for training", for step-by-step processing.
Following this process, each image gets a corresponding txt file holding its normalized position information, as in the figure below. The generated txt looks like this:

 

0 0.250925925926 0.576388888889 0.1 0.0263888888889
0 0.485185185185 0.578125 0.0685185185185 0.0201388888889

 


 

There are two license plates in the image, and each line stores the information of one plate. The leading 0 is the label of the detected object; since I have only one class, both are 0.
The remaining four numbers are the normalized center coordinates and the normalized width and height of the box.
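
As a quick sanity check of convert, with example numbers of my own:

# a 640x480 image with a box xmin=100, xmax=200, ymin=150, ymax=250
print(convert((640, 480), (100.0, 200.0, 150.0, 250.0)))
# -> (0.234375, 0.416666..., 0.15625, 0.208333...)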
Finally, give each image and its corresponding txt the same base name and copy them into the same folder (the txt for a.jpg is a.txt), as shown in the figure:

 


Note that apart from the suffix (.jpg versus .txt), the file names of an image and its txt must be exactly identical, and the two files must be saved in the same folder. During training, darknet finds an image's gt by directly replacing .jpg in the image path with .txt. The gt file does not have to be in txt format; if you use another format, modify the part of the source code that replaces the .jpg suffix accordingly. A quick pairing check is sketched below.
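
A minimal pairing-check sketch (my own illustration; the directory is a placeholder):

import os

image_dir = "data/plate/train"  # hypothetical folder holding both .jpg and .txt files
for name in os.listdir(image_dir):
    if name.endswith(".jpg"):
        txt = os.path.splitext(name)[0] + ".txt"
        if not os.path.exists(os.path.join(image_dir, txt)):
            print("missing label for", name)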

 

(2) .data file preparation
The .data file was shown earlier. The entries required during training are "classes", "train" and "backup"; "names" should also be set to make later calls easier.
"classes" is the number of categories you want to detect; if you detect 20 categories, classes=20.
"backup" is where models are cached and saved during training. During training, a file with the .backup suffix is written under this path and updated every 100 iterations, so that a sudden interruption of training does not lose the model. The models produced by training are also saved here: by default a model named yolov3_<iterations>.weights is saved every 10000 iterations (from memory; I modified this myself), and a yolov3_final.weights is saved when training ends. All of these go into the backup path.
"names" is the path of the file holding the names of the detected objects, e.g. names = plate.names.
"train" is the path of your training-set list, e.g. train = data/trainlist.txt, where trainlist.txt stores the paths of all training images, as shown below.

The file can be generated directly by the following command:

 

find image_path -name \*.jpg > trainlist.txt

 


 

image_path is the path to your dataset
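
Equivalently in Python (a sketch of my own, not from the source):

import glob, os

with open("trainlist.txt", "w") as f:
    for p in sorted(glob.glob("image_path/**/*.jpg", recursive=True)):
        f.write(os.path.abspath(p) + "\n")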

 

(3) .cfg file preparation
If you want to use yolo v3, you can directly take yolov3.cfg from the cfg folder, but the following modifications are needed.
First, change the lines at the top from

 

# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16

 


 

Change to

 

# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=16

 


 

batch is the batch size; subdivisions exists to cope with insufficient GPU memory at large batch sizes. Each time, the code loads only batch/subdivisions images, here 64/16 = 4, and the results of 16 such loads, i.e. the results of 64 images, are treated as one batch.
(When calling the model for testing, uncomment the Testing part and comment out the Training part.)
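
In other words (a tiny illustration of my own):

batch, subdivisions = 64, 16
images_per_load = batch // subdivisions  # darknet loads 4 images at a time
loads_per_batch = subdivisions           # 16 loads accumulate into one 64-image batch
print(images_per_load, loads_per_batch)  # 4 16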

 

Then, according to the categories you detect, change classes under each [yolo] section (there are three [yolo] sections) to the number of categories you need; if only one class is detected, classes=1.
Then modify the filters value of the [convolutional] layer immediately above each [yolo]; it is computed as (5+classes)*3, so for classes=1 it is 18 (a small check of this formula follows the two excerpts). Comparison before and after the modification:

 

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=80
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1

 


 

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
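
As a small check of the filters formula, a sketch of my own:

def yolo_filters(classes, anchors_per_scale=3):
    # each anchor at a scale predicts 4 box coordinates + 1 objectness + classes scores
    return (5 + classes) * anchors_per_scale

print(yolo_filters(1))   # 18, matches the modified cfg
print(yolo_filters(80))  # 255, matches the stock yolov3.cfg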

 


 

random in the cfg enables the network resizing mentioned in the paper, which improves the network's adaptability. If GPU memory is sufficient it is best set to 1; if not, it can be set to 0, i.e. no network resizing is performed.

 

(4) weights file preparation
If you use a cfg model structure provided by the author, such as yolov3, you can download the pretrained model from the official website to initialize the parameters, which speeds up convergence. Of course, you can also train without a pretrained model.

 

(5) Start training
If you use a pretrained model, run:

 

./darknet detector train data/detect.data data/yolov3.cfg data/yolov3.weights

 


 

Otherwise, use

 

./darknet detector train data/detect.data data/yolov3.cfg

 


 

The commands are the same as the detection call shown earlier, except that test becomes train.

 

  2. Training a classification model
    (1) Data preparation
    Unlike detection, the classification gt needs only a label, not box information, so no separate txt file is needed; the label is encoded directly in the image file name. The images therefore have to be renamed, following the rule:
    (image serial number)_(label).(image format), e.g. 1_numzero.jpg (a renaming sketch follows this list).
    Note that:
    1. the label is not case sensitive, i.e. numzero and NumZero have the same effect;
    2. no label may contain another, e.g. ji and jin cannot both appear; rename them to e.g. ji1 and jin.
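
A minimal renaming sketch (my own illustration; folder and label are placeholders):

import os

src_dir = "raw/numzero"  # hypothetical folder of images that all share one label
label = "numzero"
for i, name in enumerate(sorted(os.listdir(src_dir)), start=1):
    if name.endswith(".jpg"):
        os.rename(os.path.join(src_dir, name),
                  os.path.join(src_dir, "%d_%s.jpg" % (i, label)))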

 

(2) .data file preparation
Similar to detection:

 

classes=65
train  = data/char/train.list
labels = data/char/labels.txt
backup = backup/
top=2

 


 

top is not used during training but during classification calls; it means the top-N most probable predictions are output (here the top 2).

 

(3) .cfg file preparation
You can pick one from the cfg folder or define your own.

 

(4) weights file
Same as for detection above.

 

(5) Start training
Use command:

 

./darknet classifier train cfg/cifar.data cfg/cifar_small.cfg (xxx.weights)

 
