weibu's deep learning chapter: PyTorch -- utils.data and torchvision

Overview of data processing toolbox

Data processing in PyTorch covers data loading, data preprocessing, data augmentation, and so on. The main toolkits and their relationships are as follows:

Overview of the PyTorch data processing toolkit

torch.utils.data Toolkit

1) Dataset: an abstract class. Other datasets should inherit from it and implement two methods, __getitem__ and __len__.

2) DataLoader: defines a new iterator that provides batch reading, data shuffling, and parallel loading for acceleration.

3) random_split: randomly splits a dataset into new non-overlapping datasets of the given lengths.

4) *Sampler: several sampling classes.
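As a minimal sketch of items 3) and 4), the snippet below splits a toy dataset with random_split and reads one part through a SequentialSampler. The toy tensors, the 7/3 split, and the seed are assumptions chosen only for illustration.

```python
import torch
from torch.utils.data import (
    DataLoader,
    SequentialSampler,
    TensorDataset,
    random_split,
)

# Toy dataset (illustrative): 10 samples with 2 features each, plus labels.
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)
full_set = TensorDataset(features, labels)

# Split into non-overlapping subsets of length 7 and 3. The seeded generator
# only makes the random split reproducible.
train_set, val_set = random_split(
    full_set, [7, 3], generator=torch.Generator().manual_seed(0)
)

# Together the two subsets cover every index exactly once.
all_indices = sorted(
    [int(i) for i in train_set.indices] + [int(i) for i in val_set.indices]
)

# A sampler decides the order in which indices are drawn from the dataset.
loader = DataLoader(val_set, batch_size=2, sampler=SequentialSampler(val_set))
batch_sizes = [len(y) for _, y in loader]
print(len(train_set), len(val_set), batch_sizes)
```

Passing shuffle=True to DataLoader is roughly a shortcut for using a RandomSampler, so an explicit sampler is mostly needed for custom orderings.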

torchvision Toolkit

Installation: pip install torchvision # or conda install torchvision

1) datasets: provides loaders for common datasets, all designed to inherit from torch.utils.data.Dataset, mainly including MNIST, CIFAR10/100, ImageNet and COCO.

2) models: provides classic deep learning network architectures and pretrained models, such as AlexNet, VGG, ResNet, etc.

3) transforms: common data preprocessing operations, mainly on Tensor and PIL Image objects.

4) utils: two functions. make_grid arranges multiple images into a grid, and save_image saves a Tensor as an image file.


utils.data includes Dataset and DataLoader. torch.utils.data.Dataset is an abstract class. A custom dataset should inherit from this class and implement two methods: __len__, which returns the size of the dataset, and __getitem__, which returns the data and label for a given index.

__getitem__ returns only one sample at a time, so torch.utils.data.DataLoader is needed to define a new iterator that reads data in batches.

1. Use Dataset

import torch
from torch.utils import data
import numpy as np

class TestDataset(data.Dataset):
    def __init__(self):
        # Some samples represented by two-dimensional vectors
        self.Data = np.asarray([[1,2],[3,4],[2,1],[3,4],[4,5]])
        # Labels corresponding to the samples
        self.Label = np.asarray([0,1,0,1,2])
    def __getitem__(self,index):
        # Convert numpy to Tensor
        txt = torch.from_numpy(self.Data[index])
        label = torch.tensor(self.Label[index])
        return txt,label
    def __len__(self):
        return len(self.Data)

# Get data from the dataset
Test = TestDataset()
# Equivalent to calling __getitem__(2); returns the sample [2, 1] and its label 0
print(Test[2])

2. Use DataLoader

A Dataset is only responsible for fetching data, and each call to __getitem__ returns a single sample. For batch processing, together with shuffling and parallel loading, use DataLoader. Its signature is:

data.DataLoader(
		dataset,
		batch_size = 1,
		shuffle = False,
		sampler = None,
		batch_sampler = None,
		num_workers = 0,
		collate_fn = default_collate,
		pin_memory = False,
		drop_last = False,
		timeout = 0,
		worker_init_fn = None,
)
# Description of main parameters:

       dataset: the dataset to load
       batch_size: batch size
       shuffle: whether to shuffle the data
       sampler: the strategy used to draw samples
       num_workers: number of worker processes used for loading. 0 means no multiprocessing; data is loaded in the main process
       collate_fn: how to merge multiple samples into one batch. The default is usually sufficient
       pin_memory: whether to place data in pinned (page-locked) memory. Transferring data from pinned memory to the GPU is faster
       drop_last: the number of samples in the dataset may not be an integer multiple of batch_size. If drop_last is True, the last incomplete batch is discarded
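The effect of drop_last and collate_fn can be checked on five samples like those in TestDataset. This is a small sketch; collate_as_lists is a hypothetical helper written for this example, not part of PyTorch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Five samples and labels, mirroring the values used in TestDataset.
dataset = TensorDataset(
    torch.tensor([[1, 2], [3, 4], [2, 1], [3, 4], [4, 5]]),
    torch.tensor([0, 1, 0, 1, 2]),
)

# With 5 samples and batch_size=2 the last batch holds a single sample.
keep_loader = DataLoader(dataset, batch_size=2, drop_last=False)
drop_loader = DataLoader(dataset, batch_size=2, drop_last=True)
print(len(keep_loader), len(drop_loader))  # number of batches in each case


def collate_as_lists(samples):
    """Hypothetical collate_fn: return plain Python lists instead of tensors."""
    xs = [x.tolist() for x, _ in samples]
    ys = [int(y) for _, y in samples]
    return xs, ys


list_loader = DataLoader(dataset, batch_size=2, collate_fn=collate_as_lists)
first_batch = next(iter(list_loader))
print(first_batch)
```

A custom collate_fn is mainly useful when samples cannot be stacked directly, for example sequences of different lengths.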

Continuing the program above, here is an example:

test_loader = data.DataLoader(Test,batch_size=2,shuffle=False,num_workers=0)
for i,traindata in enumerate(test_loader):
    # Each batch is a [data, label] pair of tensors
    print(i, traindata)


torchvision has four functional modules: models, datasets, transforms and utils.


transforms provides common operations on PIL Image objects and Tensor objects

1. Common operations on PIL Image

Resize (formerly Scale): resizes the image. When the size is a single integer, the shorter side is scaled to it and the aspect ratio is preserved.

CenterCrop, RandomCrop, RandomResizedCrop: crop the image. CenterCrop and RandomCrop crop to a fixed size, while RandomResizedCrop crops a region of random size.

Pad: pads the image borders.

ToTensor: converts a PIL Image with values in [0, 255], or a numpy.ndarray with shape (H, W, C), into a torch.FloatTensor with values in [0, 1.0].

RandomHorizontalFlip: randomly flips the image horizontally, with a default probability of 0.5.

RandomVerticalFlip: randomly flips the image vertically.

ColorJitter: randomly adjusts brightness, contrast, and saturation.

2. Common operations for Tensor

Normalize: normalize, that is, subtract the mean value and divide it by the standard deviation

ToPILImage: converts Tensor to PIL Image.

To apply several operations to the data, use Compose to chain them together like a pipeline, similar to nn.Sequential():

       transforms.Compose([
              # Center-crop the given PIL Image to the given size.
              # size can be a tuple (target_height, target_width).
              # size can also be an integer, in which case the crop is square.
              transforms.CenterCrop(32),  # 32 is an illustrative size
              # Crop at a randomly chosen position
              transforms.RandomCrop(20,padding = 0),
              # Convert a PIL Image with values in [0, 255], or a numpy.ndarray with shape (H, W, C),
              # into a torch.FloatTensor with shape (C, H, W) and values in [0, 1]
              transforms.ToTensor(),
              # Normalize to [-1, 1]
              transforms.Normalize(mean = (0.5,0.5,0.5),std = (0.5,0.5,0.5))
       ])



When the image files are stored in different folders according to their labels, for example:



       <root>
       |-<label_1>
       |    |-001.jpg
       |    |-002.jpg
       |-<label_2>
       |    |-001.jpg
       |    |-002.jpg

We can use torchvision.datasets.ImageFolder to construct the dataset directly. The code is as follows:

       dataset = datasets.ImageFolder(path)

       loader = data.DataLoader(dataset)

ImageFolder automatically maps the folder names in the directory to class indices, in sorted order. When the DataLoader loads the data, the labels are therefore integers.

Example: use ImageFolder to read image data from different directories, then preprocess it with transforms. Since there are several preprocessing operations, chain them with Compose and load the result with DataLoader.

Here the dataset torchvision_data can be created from any images and placed in the directory where the program runs, in a folder named torchvision_data.

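import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
from torchvision import transforms, utils
from torchvision import datasets
import torch
import matplotlib.pyplot as plt
from torch.utils import data

# my_trans is not defined in the listing; this pipeline is an assumed stand-in
my_trans = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])
train_data = datasets.ImageFolder('./torchvision_data', transform=my_trans)
train_loader = data.DataLoader(train_data,batch_size=8,shuffle=True)
# Each batch is [images, labels]; img[0] holds the image tensors
for i_batch, img in enumerate(train_loader):
    if i_batch == 0:
        fig = plt.figure()
        grid = utils.make_grid(img[0])
        plt.imshow(grid.numpy().transpose((1, 2, 0)))
        plt.show()
        break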

Running result: a grid of the first batch of images is displayed.



Tags: AI Pytorch Deep Learning Object Detection

Posted by sid on Sun, 15 May 2022 05:28:39 +0300