This article is from the WeChat official account [Machine Learning Alchemy].

In the last lesson we worked through a small hands-on example of MNIST image classification. Now we continue with some PyTorch basics to build up background knowledge.

## 1 Pytorch data structure

### 1.1 default integer and floating point numbers

[the default integer of pytorch is int64]

The default integer of pytorch is stored in 64 bits, that is, 8 bytes.

[the default floating point number of pytorch is float32]

The default floating-point number of pytorch is stored in 32 bits, that is, 4 bytes.

    import torch
    import numpy as np

    print('torch The default data type for floating-point numbers and integers')
    a = torch.tensor([1,2,3])
    b = torch.tensor([1.,2.,3.])
    print(a,a.dtype)
    print(b,b.dtype)

Output:

    torch The default data type for floating-point numbers and integers
    tensor([1, 2, 3]) torch.int64
    tensor([1., 2., 3.]) torch.float32
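The byte sizes mentioned above can be checked directly. This is a minimal sketch that assumes the global default dtype has not been changed; it uses `element_size()`, which returns the number of bytes per element:

```python
import torch

a = torch.tensor([1, 2, 3])      # integers are inferred as int64
b = torch.tensor([1., 2., 3.])   # floats are inferred as float32

int_bytes = a.element_size()     # bytes used by one int64 element
float_bytes = b.element_size()   # bytes used by one float32 element
print(a.dtype, int_bytes)
print(b.dtype, float_bytes)
```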

### 1.2 dtype modify variable type

    print('torch The default data type for floating-point numbers and integers')
    a = torch.tensor([1,2,3],dtype=torch.int8)
    b = torch.tensor([1.,2.,3.],dtype=torch.float64)
    print(a,a.dtype)
    print(b,b.dtype)

Output result:

    torch The default data type for floating-point numbers and integers
    tensor([1, 2, 3], dtype=torch.int8) torch.int8
    tensor([1., 2., 3.], dtype=torch.float64) torch.float64

### 1.3 what are the types of variables

The data types of a tensor correspond almost one-to-one to those of numpy.array. Apart from str, which is not supported, the main types are:

    torch.float64  # equivalent to torch.double
    torch.float32  # the default float, FloatTensor
    torch.float16
    torch.int64    # equivalent to torch.long, the default integer
    torch.int32    # equivalent to torch.int
    torch.int16
    torch.int8
    torch.uint8    # unsigned 8-bit integer, range 0-255
    torch.bool

When I want to create a tensor of a specific type directly, I prefer to use the following constructors:

    print('torch Constructor for')
    a = torch.IntTensor([1,2,3])
    b = torch.LongTensor([1,2,3])
    c = torch.FloatTensor([1,2,3])
    d = torch.DoubleTensor([1,2,3])
    e = torch.tensor([1,2,3])
    f = torch.tensor([1.,2.,3.])
    print(a.dtype)
    print(b.dtype)
    print(c.dtype)
    print(d.dtype)
    print(e.dtype)
    print(f.dtype)

Output result:

    torch Constructor for
    torch.int32
    torch.int64
    torch.float32
    torch.float64
    torch.int64
    torch.float32

From this we can conclude:

- torch.IntTensor corresponds to torch.int32
- torch.LongTensor corresponds to torch.int64. LongTensor is commonly used for label values in deep learning; for example, the category labels 0, 1, 2 and 3 in a classification task need the int64 data type
- torch.FloatTensor corresponds to torch.float32. FloatTensor is the usual data type for learnable parameters and input data in deep learning
- torch.DoubleTensor corresponds to torch.float64
- torch.tensor infers the type. If the input data is an integer, it defaults to int64, which is equivalent to LongTensor; if the input data is a floating point number, it defaults to float32, which is equivalent to FloatTensor. These happen to be the data types of labels and parameters in deep learning, so in general it is fine to use torch.tensor directly. If you hit a type error, though, you should also know how to use dtype or a constructor to make the data types match
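As an illustration of matching label and input types, here is a minimal sketch with made-up values: the labels start out as int32 (a hypothetical situation, for example after loading from another library) and are converted to int64 before being passed to `torch.nn.functional.cross_entropy`.

```python
import torch
import torch.nn.functional as F

# Made-up example data: 4 samples, 3 classes
logits = torch.randn(4, 3)                               # float32 scores
labels = torch.tensor([0, 2, 1, 1], dtype=torch.int32)   # labels stored as int32

labels = labels.long()                   # convert to int64, the usual label type
loss = F.cross_entropy(logits, labels)   # float32 inputs + int64 labels
print(logits.dtype, labels.dtype, loss.dtype)
```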

### 1.4 data type conversion

[use the .float() method]

    print('Data type conversion')       # Data type conversion
    a = torch.tensor([1,2,3])
    b = a.float()
    c = a.double()
    d = a.long()
    print(b.dtype)  # torch.float32
    print(c.dtype)  # torch.float64
    print(d.dtype)  # torch.int64

I'm personally used to this method.

[use the .type() method]

    b = a.type(torch.float32)
    c = a.type(torch.float64)
    d = a.type(torch.int64)
    print(b.dtype)  # torch.float32
    print(c.dtype)  # torch.float64
    print(d.dtype)  # torch.int64
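A third option, not covered above, is the `.to()` method, which also accepts a dtype. A minimal sketch showing that the conversion returns a new tensor and leaves the original unchanged:

```python
import torch

a = torch.tensor([1, 2, 3])
b = a.to(torch.float32)   # new float32 tensor
c = a.to(torch.float64)   # new float64 tensor

print(b.dtype, c.dtype)
print(a.dtype)            # a itself keeps its original dtype
```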

## 2 torch vs numpy

PyTorch is a Python package designed for deep learning applications. torch implements most of the essential functions of numpy, and its tensors can run on a GPU to accelerate training.
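A minimal sketch of the GPU point: the same tensor code runs on either device. The variable name `device` is just a convention, and the code falls back to the CPU when no CUDA device is available.

```python
import torch

# Pick the GPU if one is available, otherwise stay on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.rand(2, 3).to(device)   # move the tensor to the chosen device
y = (x * 2).sum()                 # computed on that device
print(x.device, y.item())
```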

### 2.1 conversion between the two

The conversion is very simple:

    import torch
    import numpy as np

    a = np.array([1.,2.,3.])
    b = torch.tensor(a)
    c = b.numpy()
    print(a)
    print(b)
    print(c)

Output result:

    [1. 2. 3.]
    tensor([1., 2., 3.], dtype=torch.float64)
    [1. 2. 3.]

The next part is a bit more interesting and concerns memory copying. If variables a and b share the same memory, changing a also changes b; if b holds a copy of a's memory, they are two separate blocks, so changing a does not change b. The following explains when memory is shared and when it is copied as numpy and torch are converted to each other (in practice this is mostly background knowledge that rarely matters).

[torch.Tensor() conversion]

When the numpy data type is the same as the torch data type, memory is shared; when they differ, the data is copied.

    print('numpy and torch Mutual conversion 1')
    a = np.array([1,2,3],dtype=np.float64)
    b = torch.Tensor(a)
    b[0] = 999
    print('Shared memory' if a[0]==b[0] else 'No shared memory')  # No shared memory

Because np.float64 and torch.float32 are different data types (torch.Tensor produces float32), the data is copied.

    print('numpy and torch Mutual conversion 2')
    a = np.array([1,2,3],dtype=np.float32)
    b = torch.Tensor(a)
    b[0] = 999
    print('Shared memory' if a[0]==b[0] else 'No shared memory')  # Shared memory

Because np.float32 and torch.float32 are the same data type, memory is shared.

[from_numpy() conversion]

    print('from_numpy()')
    a = np.array([1,2,3],dtype=np.float64)
    b = torch.from_numpy(a)
    b[0] = 999
    print('Shared memory' if a[0]==b[0] else 'No shared memory')  # Shared memory

    a = np.array([1,2,3],dtype=np.float32)
    b = torch.from_numpy(a)
    b[0] = 999
    print('Shared memory' if a[0]==b[0] else 'No shared memory')  # Shared memory

With from_numpy(), memory is shared no matter what the data type is.

[torch.tensor() conversion]

The more commonly used one is torch.tensor(). Pay attention to the case of the t. With torch.tensor(), no matter what the input type is, the data is copied and memory is not shared.

[.numpy()]

When converting a tensor to numpy, the .numpy() method shares memory. If you want a copy instead, use .numpy().copy(), or use .clone().numpy() for the same effect. clone is a tensor method and copy is a numpy method.

[summary]

If you can't remember all of this, just remember that torch.tensor() copies the data and .numpy() shares memory.
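The summary can be checked without modifying any values. This is a minimal sketch that uses `np.shares_memory`, a NumPy helper that reports whether two arrays overlap in memory; the variable names are only illustrative.

```python
import numpy as np
import torch

a = np.array([1., 2., 3.])

copied = torch.tensor(a)       # torch.tensor() copies the data
shared = torch.from_numpy(a)   # from_numpy() shares memory with a

# .numpy() exposes the tensor's own memory as an array
copy_shares = np.shares_memory(a, copied.numpy())
view_shares = np.shares_memory(a, shared.numpy())
# .clone() makes a new tensor first, so the result no longer overlaps a
clone_shares = np.shares_memory(a, shared.clone().numpy())

print(copy_shares, view_shares, clone_shares)
```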

### 2.2 difference between the two

[naming]

Although PyTorch implements many of NumPy's functions, the same function is often named differently, which can confuse users.

For example, when creating a random tensor:

    print('Naming rules')
    a = torch.rand(2,3,4)
    b = np.random.rand(2,3,4)
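Two more naming differences, chosen here only as illustrations: concatenation is `np.concatenate` in numpy but `torch.cat` in torch, and the dimension argument is usually written `axis` in numpy but `dim` in torch.

```python
import numpy as np
import torch

n = np.ones((2, 3))
t = torch.ones(2, 3)

n_cat = np.concatenate((n, n), axis=0)   # numpy: concatenate, axis
t_cat = torch.cat((t, t), dim=0)         # torch: cat, dim

n_sum = n.sum(axis=1)                    # sum over the second dimension
t_sum = t.sum(dim=1)

print(n_cat.shape, tuple(t_cat.shape))
print(n_sum.shape, tuple(t_sum.shape))
```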

[tensor reshaping]

This part is covered in detail in the next section.

## 3 tensor

- Scalar: the data is a single number
- Vector: the data is a sequence of numbers, i.e. a one-dimensional tensor
- Matrix: the data is a two-dimensional array, i.e. a two-dimensional tensor
- Tensor: when the data has more than 2 dimensions, it is called a multidimensional tensor
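The four cases in the list can be told apart by the number of dimensions. A minimal sketch with made-up values, using the `ndim` attribute:

```python
import torch

scalar = torch.tensor(3.14)               # 0 dimensions
vector = torch.tensor([1, 2, 3])          # 1 dimension
matrix = torch.tensor([[1, 2], [3, 4]])   # 2 dimensions
cube = torch.zeros(2, 3, 4)               # 3 dimensions

for t in (scalar, vector, matrix, cube):
    print(t.ndim, tuple(t.shape))
```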

### 3.1 tensor modification dimension

- pytorch commonly uses reshape and view
- numpy uses resize and reshape
- pytorch also has resize_, but it is not commonly used

[reshape and view share memory (common)]

    a = torch.arange(0,6)
    b = a.reshape((2,3))
    print(b)
    c = a.view((2,3))
    print(c)
    a[0] = 999
    print(b)
    print(c)

Output result:

    tensor([[0, 1, 2],
            [3, 4, 5]])
    tensor([[0, 1, 2],
            [3, 4, 5]])
    tensor([[999,   1,   2],
            [  3,   4,   5]])
    tensor([[999,   1,   2],
            [  3,   4,   5]])

The three variables a, b and c above actually share the same memory, so changing one changes them all. (reshape returns a view like this whenever the memory layout allows it; otherwise it copies the data.) The total number of elements must also stay the same: the original data has 6 elements, so it can be reshaped to \(2\times 3\) but not to \(2\times 4\). Let's try:

    a = torch.arange(0,6)
    b = a.reshape((2,4))

This raises a RuntimeError, because a shape of (2, 4) needs 8 elements but the tensor has only 6.
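The failure can be observed safely by catching the exception. This minimal sketch also shows `-1`, which asks torch to infer one dimension from the number of elements; the flag name `failed` is only illustrative.

```python
import torch

a = torch.arange(0, 6)

try:
    a.reshape((2, 4))          # 8 elements requested, only 6 available
    failed = False
except RuntimeError as err:
    failed = True
    print('reshape failed:', err)

c = a.reshape(2, -1)           # -1 is inferred as 3, since 6 / 2 = 3
print(tuple(c.shape))
```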

[torch's resize_ (not commonly used)]

However, pytorch has an unusual function (I rarely use it), resize_. This method does not have to follow that rule:

    a = torch.arange(0,6)
    a.resize_(2,4)
    print(a)

The result has 8 elements: two elements are added automatically, and their values are uninitialized. Although I don't know what this function is meant for......

Notice the trailing underscore in resize_. A trailing _ (or an inplace=True parameter) means the modification is done in place on the original variable, so there is no need to assign the result to a new variable.
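The underscore convention applies to many other tensor methods. A minimal sketch using `add` and `add_` as an example pair:

```python
import torch

a = torch.tensor([1, 2, 3])

b = a.add(1)             # out-of-place: returns a new tensor
unchanged = a.tolist()   # a still holds its original values here

a.add_(1)                # in-place: a itself is modified
print(unchanged, b.tolist(), a.tolist())
```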

[resize and reshape of numpy (common)]

    import numpy as np
    a = np.arange(0,6)
    a.resize(2,3)
    print(a)

    import numpy as np
    a = np.arange(0,6)
    b = a.reshape(2,3)
    print(b)

The two code blocks produce the same output, shown below. The difference is that numpy's resize has no return value, which is equivalent to inplace=True: it modifies the original variable directly. reshape returns a new array and leaves the shape of the original variable unchanged (but the result shares memory with it):

    [[0 1 2]
     [3 4 5]]
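Both claims can be checked with a short sketch. `refcheck=False` is passed to `resize` here only as a precaution, since the default reference check can fail in some interactive sessions; it is not part of the example above.

```python
import numpy as np

a = np.arange(0, 6)
b = a.reshape(2, 3)               # returns a new array object
shares = np.shares_memory(a, b)   # the new array is a view of a
b[0, 0] = 999                     # writing through b is visible in a
print(shares, a[0])

c = np.arange(0, 6)
ret = c.resize((2, 3), refcheck=False)   # in place, nothing is returned
print(ret, c.shape)
```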

### 3.2 tensor memory storage structure

The data structure of tensor consists of two parts:

- Header information area (Tensor): stores the tensor's shape (size), stride, data type and other information
- Storage area (Storage): stores the actual data

The header information area takes up very little memory; most of the memory is occupied by the Storage.

Every tensor has a corresponding storage. Different tensors generally have different header information, but they may use the same storage. (This is how view and reshape share memory: the shape and size in the header change, but the data still lives in the same storage.)

### 3.3 storage area

Let's check the storage area of a tensor:

    import torch
    a = torch.arange(0,6)
    print(a.storage())

Output is:

     0
     1
     2
     3
     4
     5
    [torch.LongStorage of size 6]

Then make a view transformation on the tensor variable:

    b = a.view(2,3)

The output of b.storage() is the same as that of a.storage(), which is why the view transformation shares memory.

    # id() is the memory address of the Python object
    print(id(a)==id(b))  # False
    # a.storage without parentheses is only a bound method,
    # so compare the addresses of the underlying storages instead
    print(a.storage().data_ptr()==b.storage().data_ptr())  # True

So although a and b use the same storage area, a and b are different objects. The difference lies in the header information area, where the size is different. Because only the header differs while the storage is shared, a lot of memory is saved.

Let's go a step further. Suppose the tensor is sliced, does the sliced data share memory, and what is the storage of the sliced data like?

    print('Research tensor Slice of')
    a = torch.arange(0,6)
    b = a[2]
    # compare the addresses of the underlying storages
    print(a.storage().data_ptr()==b.storage().data_ptr())

The output result is:

>>> True

Yes, even after slicing, the two tensors still use the same storage area, so they share memory. Modifying one will change the other.

    # .data_ptr() returns the memory address of the first element of the tensor
    print(a.data_ptr(),b.data_ptr())
    print(b.data_ptr()-a.data_ptr())

Output is:

    2080207827328 2080207827344
    16

This is because the first element of b is 16 bytes away from the first element of a. The default integer tensor is int64, i.e. 8 bytes per element, so the difference is 2 integer elements.
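The same arithmetic can be reproduced without reading raw addresses. A minimal sketch using `storage_offset()`, which gives the position of a tensor's first element in its storage, counted in elements:

```python
import torch

a = torch.arange(0, 6)
b = a[2]

offset_elems = b.storage_offset()            # b starts 2 elements into the storage
offset_bytes = b.data_ptr() - a.data_ptr()   # the same distance in bytes
print(offset_elems, a.element_size(), offset_bytes)
```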

### 3.4 header information area

It's still the two tensor variables above, a and b

    a = torch.arange(0,6)
    b = a.view(2,3)
    print(a.stride(),b.stride())

Output is:

(1,) (3, 1)

Variable a is a one-dimensional array [0, 1, 2, 3, 4, 5], so its stride is 1. b is a two-dimensional array [[0,1,2], [3,4,5]], so moving one step along the first dimension skips 3 elements in storage, and moving one step along the second dimension skips 1.
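Put differently, the strides map a multi-dimensional index to a position in the flat storage. A minimal sketch that checks this for every element of b; the names `s0`, `s1` and `flat` are only illustrative.

```python
import torch

a = torch.arange(0, 6)   # flat data: 0..5
b = a.view(2, 3)
s0, s1 = b.stride()      # steps in storage per step along each dimension

for i in range(2):
    for j in range(3):
        flat = i * s0 + j * s1   # position of b[i, j] in the flat data
        assert a[flat].item() == b[i, j].item()

print('b[1, 2] is stored at position', 1 * s0 + 2 * s1)
```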

It can be seen that most operations do not modify the data of the tensor, but only modify the header information of the tensor. This method saves more memory and improves the processing speed.
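Transposition is one such operation. A minimal sketch: `t()` only swaps the strides in the header, while `contiguous()` is an example of an operation that does copy the data when the layout no longer matches.

```python
import torch

a = torch.arange(0, 6).view(2, 3)
t = a.t()                               # transpose: header change only

same_data = t.data_ptr() == a.data_ptr()
print(a.stride(), t.stride(), same_data)
print(t.is_contiguous())                # the layout no longer matches the shape

c = t.contiguous()                      # copies the data into a new layout
copied = c.data_ptr() != a.data_ptr()
print(c.stride(), copied)
```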