Overview of data processing toolbox
Data processing in PyTorch covers data loading, data preprocessing, and data augmentation. The main toolkits and how they relate to each other are as follows:
Overview of the PyTorch data processing toolkit
torch.utils.data Toolkit
1) Dataset: an abstract class. Custom datasets should inherit from it and implement two methods, __getitem__ and __len__.
2) DataLoader: defines an iterator that reads data in batches, shuffles it, and can load it in parallel for speed.
3) random_split: randomly splits a dataset into new, non-overlapping datasets of the given lengths.
4) *Sampler: several sampling classes (for example SequentialSampler and RandomSampler).
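As a rough illustration of items 3) and 4), the sketch below splits a toy dataset with random_split and lists the indices produced by two samplers. The toy data, the fixed seed, and the use of TensorDataset (also part of torch.utils.data) are assumptions made for the example and do not come from the original text.

```python
import torch
from torch.utils.data import (
    RandomSampler,
    SequentialSampler,
    TensorDataset,
    random_split,
)

# Ten toy samples: features 0..9, each with a dummy label (illustrative data).
features = torch.arange(10).unsqueeze(1)
labels = torch.zeros(10, dtype=torch.long)
full_set = TensorDataset(features, labels)

# Split into non-overlapping subsets of 8 and 2 samples. The seed is fixed
# only so that the split is repeatable.
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(full_set, [8, 2], generator=generator)
print(len(train_set), len(val_set))

# Each subset records which indices of the original dataset it holds.
overlap = set(train_set.indices) & set(val_set.indices)
print(overlap)

# Samplers only yield indices; a DataLoader uses them to pick samples.
sequential_order = list(SequentialSampler(val_set))
random_order = list(RandomSampler(val_set))
print(sequential_order, random_order)
```

The two subsets share no indices, which is what "non-overlapping" means here.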
torchvision Toolkit
Installation: pip install torchvision  # or conda install torchvision
1) datasets: loaders for common datasets, all designed to inherit from torch.utils.data.Dataset. They mainly include MNIST, CIFAR10/100, ImageNet, and COCO.
2) models: classic deep learning network architectures and pretrained models, such as AlexNet, VGG, and ResNet.
3) transforms: common data preprocessing operations, mainly on Tensor and PIL Image objects.
4) utils: two functions. make_grid arranges multiple images into a single grid, and save_image saves a Tensor as an image file.
utils.data
utils.data includes Dataset and DataLoader. torch.utils.data.Dataset is an abstract class. A custom dataset should inherit from it and implement two methods: __len__, which returns the size of the dataset, and __getitem__, which returns the data and label for a given index.
__getitem__ returns only one sample at a time, so torch.utils.data.DataLoader is used to define an iterator that reads the data in batches.
1. Use Dataset
import torch
from torch.utils import data
import numpy as np


class TestDataset(data.Dataset):
    def __init__(self):
        # A few samples, each represented by a two-dimensional vector
        self.Data = np.asarray([[1, 2], [3, 4], [2, 1], [3, 4], [4, 5]])
        # Labels corresponding to the samples
        self.Label = np.asarray([0, 1, 0, 1, 2])

    def __getitem__(self, index):
        # Convert the NumPy data to Tensors
        txt = torch.from_numpy(self.Data[index])
        label = torch.tensor(self.Label[index])
        return txt, label

    def __len__(self):
        return len(self.Data)


# Fetch data from the dataset
Test = TestDataset()
# Equivalent to calling Test.__getitem__(2); returns (tensor([2, 1]), tensor(0))
print(Test[2])
# Number of samples: 5
print(len(Test))
2. Use DataLoader
The Dataset is only responsible for fetching data: each call to __getitem__ returns a single sample. For batching, shuffling, and parallel loading, use DataLoader. Its format is:
data.DataLoader(
    dataset,
    batch_size=1,
    shuffle=False,
    sampler=None,
    batch_sampler=None,
    num_workers=0,
    collate_fn=default_collate,
    pin_memory=False,
    drop_last=False,
    timeout=0,
    worker_init_fn=None,
)
# Description of the main parameters:
'''
dataset: the dataset to load
batch_size: the batch size
shuffle: whether to shuffle the data
sampler: how samples are drawn from the dataset
num_workers: number of worker processes used for loading; 0 means the data is loaded in the main process
collate_fn: how multiple samples are merged into one batch; the default is usually sufficient
pin_memory: whether to place the data in pinned (page-locked) memory, which makes transfer to the GPU faster
drop_last: the dataset size may not be an integer multiple of batch_size; if True, the last incomplete batch is discarded
'''
Building on the code above, here is an example:
test_loader = data.DataLoader(Test, batch_size=2, shuffle=False, num_workers=0)
for i, traindata in enumerate(test_loader):
    print('i:', i)
    Data, Label = traindata
    print('data:', Data)
    print('Label:', Label)
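To make the effect of drop_last and collate_fn more concrete, here is a small sketch on five toy samples that mirror the dataset above. TensorDataset and the helper collate_as_lists are assumptions introduced only for illustration and are not part of the original example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Five toy samples, the same values as in the dataset used above.
samples = torch.tensor([[1, 2], [3, 4], [2, 1], [3, 4], [4, 5]])
targets = torch.tensor([0, 1, 0, 1, 2])
toy_set = TensorDataset(samples, targets)

# 5 samples with batch_size=2 give batches of 2, 2 and 1.
keep_loader = DataLoader(toy_set, batch_size=2, drop_last=False)
# With drop_last=True the final, incomplete batch is discarded.
drop_loader = DataLoader(toy_set, batch_size=2, drop_last=True)

keep_sizes = [len(y) for _, y in keep_loader]
drop_sizes = [len(y) for _, y in drop_loader]
print(keep_sizes, drop_sizes)


def collate_as_lists(batch):
    """Hypothetical collate_fn: return plain Python lists instead of stacked tensors."""
    xs = [x.tolist() for x, _ in batch]
    ys = [int(y) for _, y in batch]
    return xs, ys


list_loader = DataLoader(toy_set, batch_size=2, collate_fn=collate_as_lists)
first_xs, first_ys = next(iter(list_loader))
print(first_xs, first_ys)
```

A custom collate_fn is mostly useful when samples cannot simply be stacked, for example when they have different lengths.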
torchvision
torchvision has four functional modules: models, datasets, transforms, and utils.
transforms
transforms provides common operations on PIL Image objects and Tensor objects
1. Common operations on PIL Image
Resize (called Scale in old versions): resize the image. When a single integer is given, the shorter side is matched to it and the aspect ratio is preserved.
CenterCrop, RandomCrop, RandomResizedCrop: crop the image. CenterCrop and RandomCrop cut out a region of fixed size, while RandomResizedCrop cuts out a region of random size and then resizes it to the given size.
Pad: pad the borders of the image.
ToTensor: convert a PIL Image with values in [0, 255], or a numpy.ndarray of shape (H, W, C), into a torch.FloatTensor of shape (C, H, W) with values in [0.0, 1.0].
RandomHorizontalFlip: randomly flip the image horizontally, with a default probability of 0.5.
RandomVerticalFlip: randomly flip the image vertically, with a default probability of 0.5.
ColorJitter: randomly change the brightness, contrast, saturation, and hue of the image.
2. Common operations for Tensor
Normalize: normalize, that is, subtract the mean value and divide it by the standard deviation
ToPILImage: converts Tensor to PIL Image.
To apply several operations to the data, use Compose to chain them into a pipeline, much like nn.Sequential():
transforms.Compose([
    # Crop the given PIL Image around its center to the given size.
    # size can be a tuple (target_height, target_width)
    # or an integer, in which case the crop is square.
    transforms.CenterCrop(20),
    # Crop at a randomly chosen position; the crop size must not exceed
    # the size produced by the previous step.
    transforms.RandomCrop(10, padding=0),
    # Convert a PIL Image with values in [0, 255], or a numpy.ndarray of shape (H, W, C),
    # into a torch.FloatTensor of shape (C, H, W) with values in [0, 1].
    transforms.ToTensor(),
    # Normalize each channel to [-1, 1]
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
])
ImageFolder
When the files are stored in separate folders according to their labels, such as
-data
|-zhangliu
| |-001.jpg
| |-002.jpg
|-wuhua
| |-001.jpg
| |-002.jpg
we can use torchvision.datasets.ImageFolder to construct the dataset directly. The code is as follows:
dataset = datasets.ImageFolder(path)
loader = data.DataLoader(dataset)
ImageFolder automatically maps the folder names in the directory, in sorted order, to integer class indices. When the DataLoader loads the data, the labels are therefore integers.
Example: use ImageFolder to read image data from different directories, then preprocess it with transforms. Since there are several preprocessing operations, chain them with Compose and load the result with DataLoader.
The dataset torchvision_data can be built from any images, arranged in one subfolder per label as shown above, and placed in the folder where the program runs.
import os
# Workaround for the duplicate OpenMP runtime error on some installations
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

from torchvision import transforms, utils
from torchvision import datasets
import torch
import matplotlib.pyplot as plt
from torch.utils import data

# Chain the preprocessing operations
my_trans = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])
train_data = datasets.ImageFolder('./torchvision_data', transform=my_trans)
train_loader = data.DataLoader(train_data, batch_size=8, shuffle=True)

for i_batch, img in enumerate(train_loader):
    if i_batch == 0:
        # img[0] holds the image batch, img[1] the integer labels
        print(img[1])
        fig = plt.figure()
        grid = utils.make_grid(img[0])
        # Convert (C, H, W) to (H, W, C) for matplotlib
        plt.imshow(grid.numpy().transpose((1, 2, 0)))
        plt.show()
        utils.save_image(grid, 'test01.png')
    break
Operation results: