1, Basic process of logistic regression
1. Logistic regression
Logistic regression learns a pattern from multiple feature values of a set of samples, builds a model (abstracted as a function f(x)), and uses that model to predict the results for new samples. In logistic regression, the prediction is a classification.
Logistic regression steps (a minimal code sketch follows the list):
(1) Normalize the values of x
(2) Compute the weighted sum z = w1x1 + w2x2 + ... + wnxn
(3) Apply the activation function a = σ(z)
(4) Compute the loss function L = -y·log(a) - (1-y)·log(1-a)
(5) Run gradient descent: w = w - α·dw
(6) Use the trained weight parameters w to make predictions
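Below is a minimal NumPy sketch of steps (2)-(6); the toy data x and y are invented purely for illustration, and there is no bias term, matching the implementation later in this article.

import numpy as np

# Toy data: 4 samples, 2 features (values invented for illustration)
x = np.array([[0.1, 0.6], [0.2, 0.8], [0.9, 0.3], [0.8, 0.1]])
y = np.array([0, 0, 1, 1])

w = np.zeros(2)    # one weight per feature
alpha = 0.1        # learning rate

for _ in range(1000):
    z = x.dot(w)                                            # (2) weighted sum
    a = 1 / (1 + np.exp(-z))                                # (3) sigmoid activation
    L = np.mean(-y * np.log(a) - (1 - y) * np.log(1 - a))   # (4) loss
    dw = x.T.dot(a - y) / len(y)                            # gradient of L w.r.t. w
    w = w - alpha * dw                                      # (5) gradient descent step

a = 1 / (1 + np.exp(-x.dot(w)))
print((a >= 0.5).astype(int))                               # (6) predictions: [0 0 1 1]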
2. Activation function
Each feature xi is assigned a weight wi, and the resulting weighted sum z = f(x) can take any real value. To map the result z into the interval (0, 1) for classification, apply the sigmoid activation function: a = 1/(1 + e^(-z)). Its output is the predicted value.
Characteristics of an activation function: it is continuous and differentiable, and it applies a nonlinear transformation.
Other common activation functions exist as well, such as tanh, whose output lies in (-1, 1), and ReLU, which outputs z when z > 0 and 0 otherwise.
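A quick NumPy check of these activations (the input values z are arbitrary):

import numpy as np

z = np.array([-2.0, 0.0, 2.0])   # arbitrary inputs
print(1 / (1 + np.exp(-z)))      # sigmoid: [0.119 0.5 0.881], always in (0, 1)
print(np.tanh(z))                # tanh:    [-0.964 0. 0.964], always in (-1, 1)
print(np.maximum(0, z))          # ReLU:    [0. 0. 2.]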
3. Loss function
Roles of the loss function (a worked example follows the list):
(1) It links the true value and the predicted value: L = -y·log(a) - (1-y)·log(1-a)
(2) It measures the gap between the true value and the predicted value; for example, when y = 0, L = -log(1-a), so computing L tells us how far off the prediction is
(3) The weights w and b can be learned from the gap between a and y
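For instance, with true label y = 1, a confident correct prediction a = 0.9 gives L = -log(0.9) ≈ 0.105, while a confident wrong prediction a = 0.1 gives L = -log(0.1) ≈ 2.303: the larger the gap between a and y, the larger the loss. The same check in code:

import numpy as np

def loss(y, a):
    return -y * np.log(a) - (1 - y) * np.log(1 - a)

print(loss(1, 0.9))   # ~0.105: prediction close to the true label
print(loss(1, 0.1))   # ~2.303: prediction far from the true label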
4. Gradient descent and learning rate
The loss function is essentially a function of L with respect to w; gradient descent iteratively searches for the w that minimizes the loss.
The learning rate α controls how far the weights w move at each iteration, i.e. the step size.
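A minimal sketch of how the step size matters, using the one-dimensional loss L(w) = (w - 3)^2 (a stand-in function invented here only to show the effect of the learning rate):

def grad(w):
    return 2 * (w - 3)              # derivative of L(w) = (w - 3)^2

for alpha in (0.01, 0.1, 1.1):      # small, moderate, and too-large learning rates
    w = 0.0
    for _ in range(50):
        w = w - alpha * grad(w)     # one gradient descent update
    print(alpha, w)                 # 0.01 converges slowly, 0.1 reaches ~3, 1.1 diverges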
5. Normalization
(1) Min-max scaling: x' = (x - x_min) / (x_max - x_min)
(2) Z-score standardization: x' = (x - mean) / standard deviation
There are many normalization formulas; their purpose is to accelerate the convergence of the model and keep training from taking unnecessary detours.
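Both formulas are one line in NumPy (the raw values of x below are made up):

import numpy as np

x = np.array([4.9, 5.1, 6.4, 7.0])           # raw feature values
print((x - x.min()) / (x.max() - x.min()))   # (1) min-max scaling into [0, 1]
print((x - x.mean()) / x.std())              # (2) z-score standardization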
2, Iris classification in practice
Project description: classify irises with logistic regression, based on the feature statistics of sepal length and sepal width
Data features: sepal length, sepal width
Category labels: 0 - Iris setosa, 1 - Iris versicolor, 2 - Iris virginica
Step 1: import numpy, matplotlib, and the dataset
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
Step 2: visualize and analyze the data
iris = load_iris()
iris.feature_names, iris.target_names
iris.data    # view the specific feature data
iris.target  # view the label of each row of data
# Take 100 samples and the first two feature columns: sepal length and sepal width
x = iris.data[0:100, 0:2]
y = iris.target[0:100]
# Take the first two classes of samples, 0 and 1
samples_0 = x[y == 0, :]  # samples with y == 0
samples_1 = x[y == 1, :]  # samples with y == 1
# Scatter plot
plt.scatter(samples_0[:, 0], samples_0[:, 1], marker='o', color='r')
plt.scatter(samples_1[:, 0], samples_1[:, 1], marker='x', color='b')
plt.xlabel('X')
plt.ylabel('Y')
Step 3: split the data into 80 training samples and 20 test samples
x_train = np.vstack([x[:40, :], x[60:100, :]])  # first 40 samples of class 0 and last 40 of class 1
y_train = np.concatenate([y[:40], y[60:100]])
x_test = x[40:60, :]  # rows 40-59: 10 samples of each class
y_test = y[40:60]
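This manual slicing works because the 100 selected samples are ordered by class (rows 0-49 are class 0, rows 50-99 are class 1). An equivalent, order-independent split could use sklearn's train_test_split (a sketch; stratify=y keeps the class ratio balanced):

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=0)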
Step 4: implement the logistic regression algorithm
class Logistic_Regression():
    def __init__(self):
        self.w = None

    def sigmoid(self, z):
        a = 1 / (1 + np.exp(-z))
        return a

    def output(self, x):
        z = np.dot(self.w, x.T)
        a = self.sigmoid(z)
        return a

    def compute_loss(self, x, y):
        num_train = x.shape[0]
        a = self.output(x)
        loss = np.sum(-y * np.log(a) - (1 - y) * np.log(1 - a)) / num_train
        dw = np.dot((a - y), x) / num_train
        return loss, dw

    def train(self, x, y, learning_rate=0.01, num_iterations=10000):
        num_train, num_features = x.shape
        self.w = 0.001 * np.random.randn(1, num_features)
        loss = []
        for i in range(num_iterations):
            error, dw = self.compute_loss(x, y)
            loss.append(error)
            self.w -= learning_rate * dw
            if i % 200 == 0:
                print('steps:[%d/%d], loss:%f' % (i, num_iterations, error))
        return loss

    def predict(self, x):
        a = self.output(x)
        y_pred = np.where(a >= 0.5, 1, 0)
        return y_pred
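Note that compute_loss never differentiates the sigmoid explicitly: for sigmoid combined with the cross-entropy loss, the chain rule collapses to dL/dz = a - y, so dw = (a - y)·x averaged over the training samples. Also note that this implementation has no bias term b, so the learned decision boundary is forced through the origin of the feature space.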
Step 5: create an lr instance and train the model
lr = Logistic_Regression()
loss = lr.train(x_train, y_train)
plt.plot(loss)
# Decision boundary visualization
plt.scatter(samples_0[:, 0], samples_0[:, 1], marker='o', color='r')
plt.scatter(samples_1[:, 0], samples_1[:, 1], marker='x', color='b')
plt.xlabel('x')
plt.ylabel('y')
x1 = np.arange(4, 7.5, 0.05)
x2 = (-lr.w[0][0] * x1) / lr.w[0][1]
# sigmoid = 1/(1 + np.exp(-z))
# boundary: x1*w1 + x2*w2 = 0
plt.plot(x1, x2, '-', color='black')
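The plotted line comes from setting z = 0: sigmoid(z) = 0.5 exactly when w1·x1 + w2·x2 = 0, so solving for the second feature gives x2 = -w1·x1/w2, drawn here over sepal lengths from 4 to 7.5.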
Step 6: prediction on the test set
num_test = x_test.shape[0]
prediction = lr.predict(x_test)
accuracy = np.sum(prediction == y_test) / num_test
print('the accuracy of prediction is:', accuracy)
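As a cross-check, the same metric is available as a one-liner in sklearn; note that predict returns a (1, 20) array, so it is flattened first:

from sklearn.metrics import accuracy_score

print(accuracy_score(y_test, prediction.ravel()))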