Feedforward neural network
The feedforward neural network (FNN) was the earliest and simplest type of artificial neural network devised in the field of artificial intelligence. Common feedforward networks include the perceptron and BP (back-propagation) networks. The neurons are arranged in layers, and each neuron is connected only to the neurons in the previous layer: it receives the output of the previous layer and passes its own output on to the next layer, with no feedback between layers. Parameters propagate unidirectionally from the input layer through the hidden layer to the output layer. Unlike a recurrent neural network, a feedforward network contains no directed cycles. The following figure is a schematic diagram of a simple feedforward neural network:
There is no feedback anywhere in the network: the signal propagates unidirectionally from the input layer to the output layer, so the network can be represented by a directed acyclic graph.
A perceptron is effectively a single neuron in the structure of a neural network, so one perceptron constitutes the simplest possible neural network.
A perceptron network is an artificial neural network with a forward structure. It can be regarded as a directed graph composed of multiple node layers, each layer connected to the next. Apart from the input nodes, every node is a neuron (or processing unit) with a nonlinear activation function.
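As a minimal sketch of a single perceptron (weighted sum plus bias, passed through a step activation), here is one with hand-picked weights that happens to implement logical AND on binary inputs; the weights and bias are illustrative, not learned:

```python
import numpy as np

# A single perceptron: step(w . x + b)
def perceptron(x, w, b):
    return 1 if np.dot(w, x) + b > 0 else 0

# Hypothetical weights implementing logical AND on binary inputs
w, b = np.array([1.0, 1.0]), -1.5
print([perceptron(np.array(p), w, b) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# → [0, 0, 0, 1]
```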
Implementing a feedforward neural network
A previous blog post described how to set up a PyTorch GPU environment on Windows. Here we use PyTorch to implement our first feedforward neural network:
Detailed comments are included in the source code for reference.
import torch
import torch.nn as nn
import torchvision.datasets as dsets        # torchvision is a library for image processing and for loading datasets
import torchvision.transforms as transforms
'''
torchvision.datasets contains MNIST, FakeData, COCO, LSUN, ImageFolder, DatasetFolder,
ImageNet, CIFAR and other commonly used datasets, and exposes the important parameters
for configuring them, so they can be loaded with a simple call. Reading this package is
also a good reference for writing dataset classes of our own later. The interfaces of
these datasets are essentially the same: each accepts at least the two common parameters
transform and target_transform, which transform the input and the target respectively.
'''
from torch.autograd import Variable         # torch.autograd provides classes and functions for differentiating arbitrary scalar functions
import torch.utils.data as Data             # torch.utils.data.DataLoader is used to load the data
import matplotlib.pyplot as plt             # library required for plotting

# Hyperparameters: set from experience; they affect the learned weights and biases.
# Examples: number of epochs, number of hidden layers, neurons per layer, learning rate.
input_size = 784
hidden_size = 500
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST Dataset
train_dataset = dsets.MNIST(root='./data',                    # directory of the dataset
                            train=True,
                            transform=transforms.ToTensor(),  # transforms.ToTensor() converts a numpy ndarray or an image
                                                              # read by PIL into a Tensor of shape (C, H, W), dividing by
                                                              # 255 to normalize the values to [0, 1.0]
                            download=True)
test_dataset = dsets.MNIST(root='./data',
                           train=False,
                           transform=transforms.ToTensor())

# Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
'''
dataset:    the dataset to load the data from
batch_size: the number of samples loaded per batch
shuffle:    whether to shuffle the data in each epoch
'''
test_y = test_dataset.test_labels

# Neural Network Model (1 hidden layer)
class Net(nn.Module):
    # initialize the network structure
    def __init__(self, input_size, hidden_size, num_classes):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)   # input layer, linear transformation
        self.relu = nn.ReLU()                           # hidden-layer activation, the ReLU function
        self.fc2 = nn.Linear(hidden_size, num_classes)  # output layer, linear transformation

    # forward() defines how data flows through the network
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

net = Net(input_size, hidden_size, num_classes)

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()                                 # cross-entropy loss
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)  # Adam optimizer

# Train the Model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Convert torch tensors to Variables
        images = Variable(images.view(-1, 28*28))  # each image is 28 * 28
        labels = Variable(labels)                  # PyTorch computes with tensors; wrapping them in Variable enables autograd

        # Forward + Backward + Optimize
        optimizer.zero_grad()  # zero the gradient buffer
        outputs = net(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:   # print the results every 100 steps
            print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                  % (epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))

# Test the Model
correct = 0
total = 0
for images, labels in test_loader:
    images = Variable(images.view(-1, 28*28))
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)                 # count all labels
    correct += (predicted == labels).sum()  # count correctly predicted labels

print('Accuracy of the network on the 10000 test images: %d %%' % (100 * torch.true_divide(correct, total)))

# Save the Model
for i in range(1, 4):
    plt.imshow(train_dataset.train_data[i].numpy(), cmap='gray')
    plt.title('%i' % train_dataset.train_labels[i])
    plt.show()

torch.save(net.state_dict(), 'model.pkl')  # net.state_dict() holds the model parameters

test_output = net(images[:20])
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()  # [1] selects the index tensor returned by torch.max
print('prediction number', pred_y)
print('real number', test_y[:20].numpy())
Cross-entropy loss
class torch.nn.CrossEntropyLoss(weight=None, size_average=True)[source]
This criterion combines LogSoftMax and NLLLoss in one single class.
It is useful when training a multi-class classifier.
weight (Tensor): a 1-D tensor of n elements giving the weight of each of the n classes; very useful if your training samples are unbalanced. Default: None.
Call-time parameters:
input: a 2-D tensor of shape batch × n containing the score of each class.
target: a 1-D tensor of size batch containing class indices (0 to n-1).
The loss can be expressed in the following form:

loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) ) = -x[class] + log( Σ_j exp(x[j]) )
When the weight parameter is specified, the loss becomes:

loss(x, class) = weight[class] * ( -x[class] + log( Σ_j exp(x[j]) ) )
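The unweighted formula can be checked by hand. The sketch below evaluates it for one sample with made-up scores over three classes; the score values and target index are illustrative:

```python
import math

# Hypothetical scores (logits) for one sample over 3 classes
scores = [2.0, 1.0, 0.1]
target = 0  # index of the true class

# CrossEntropyLoss = -log(softmax(scores)[target])
denom = sum(math.exp(s) for s in scores)
loss = -math.log(math.exp(scores[target]) / denom)
print(round(loss, 4))
```

Note that the loss is computed from raw scores: the log-softmax is applied internally, which is why the network in the code above outputs plain linear scores rather than probabilities.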
torch.optim is a package implementing various optimization algorithms. The most commonly used methods are already supported, and the interface is general enough that more sophisticated methods can be integrated in the future.
To use torch.optim you need to construct an optimizer object. This object holds the current parameter state and updates the parameters based on the computed gradients.
To construct an optimizer, you give it an iterable containing the parameters to optimize (which must all be Variable objects). You can then set optimizer-specific options such as the learning rate, weight decay, and so on.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam([var1, var2], lr=0.0001)
class torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)[source]
- params (iterable) – an iterable of parameters to optimize, or dicts defining parameter groups
- lr (float, optional) – learning rate (default: 1e-3)
- betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and of its square (default: (0.9, 0.999))
- eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
- weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
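As a toy sketch of the optimizer loop (the function and learning rate are made up for illustration), here Adam minimizes f(x) = (x - 3)², so x should converge toward 3:

```python
import torch

# Minimize f(x) = (x - 3)^2 with Adam
x = torch.tensor([0.0], requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.1)

for _ in range(500):
    optimizer.zero_grad()   # clear gradients accumulated in the previous step
    loss = (x - 3.0) ** 2
    loss.backward()         # compute d(loss)/dx
    optimizer.step()        # update x using Adam's rule

print(round(x.item(), 2))
```

The three-call pattern (zero_grad, backward, step) is the same one used in the training loop above.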
output = torch.max(input, dim)
input is a tensor, for example the class scores output by the network or by a softmax function
dim is the dimension (0 or 1) along which the max is taken: 0 gives the maximum of each column, 1 the maximum of each row
The function returns two tensors: the first holds the maximum value of each row (for a softmax output these values are at most 1), and the second holds the index of that maximum in each row. It is the index tensor that gives the predicted class.
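A small sketch of this return shape, with made-up score rows for two samples over three classes:

```python
import torch

# A batch of 2 score rows over 3 classes (hypothetical values)
out = torch.tensor([[0.1, 0.7, 0.2],
                    [0.6, 0.3, 0.1]])

# torch.max along dim=1 returns (max value per row, its index per row)
values, indices = torch.max(out, 1)
print(values)   # maximum score in each row
print(indices)  # predicted class index for each row
```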
In PyTorch, state_dict is a simple Python dictionary object that maps each layer to its corresponding parameters (such as the weights and biases of each layer of the model).
(Note that only layers with trainable parameters, such as convolutional and linear layers, appear in the model's state_dict.)
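This can be seen by inspecting a small model. The sketch below uses a made-up nn.Sequential model (the layer sizes are arbitrary); the Linear layers contribute weight and bias entries while the ReLU, having no parameters, contributes none:

```python
import torch
import torch.nn as nn

# A tiny model: two linear layers with a ReLU in between (illustrative sizes)
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

# state_dict maps parameter names to tensors; the ReLU adds no entries
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))
```

This is exactly the dictionary that `torch.save(net.state_dict(), 'model.pkl')` serializes in the training script above.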
import numpy as np

x = np.array([[[0], [1], [2]]])
print(x)
"""
[[[0]
  [1]
  [2]]]
"""
print(x.shape)      # (1, 3, 1)
x1 = np.squeeze(x)  # remove the axes of length 1 from the shape of the array
print(x1)           # [0 1 2]
print(x1.shape)     # (3,)