Regression Prediction of Vehicle Efficiency Using Fully Connected Neural Networks

Keywords: fully connected neural network, TensorFlow, regression


This post uses a fully connected neural network for a regression task: predicting the car performance index MPG.

Python packages used: os, pandas, tensorflow, sklearn, matplotlib.

Data loading

The dataset is Auto MPG, which describes performance indicators of various cars.

Except for Origin, all fields are numeric. Origin is a categorical field: 1 = America, 2 = Europe, 3 = Japan.

  • MPG: miles per gallon (efficiency index, y)
  • Cylinders: Number of cylinders
  • Displacement: displacement
  • Horsepower: horsepower
  • Weight: weight
  • Acceleration: acceleration
  • Model Year: Model Year
  • Origin: Origin
import os
import pandas as pd

# Path to a local copy of the data file ('auto-mpg.data' is the standard UCI filename; adjust if yours differs)
data_path = os.path.join(os.getcwd(), 'data', 'auto-mpg.data')
column_names = [
    'MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
    'Acceleration', 'Model Year', 'Origin'
]
raw_data = pd.read_csv(
    data_path, names=column_names,
    na_values='?', comment='\t',
    sep=' ', skipinitialspace=True
)
# Keep a working copy so the raw data stays untouched
df = raw_data.copy()

After loading, it is common to take a first look at the data and keep a backup copy.

Data processing

Inspection shows that there are missing values in the data, and the Origin field is categorical. In addition, since gradient descent is used to find the optimal parameters, the features should preferably be normalized.

Missing value handling


The rows with missing values are simply dropped:

df = df.dropna()
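On a toy frame the effect of dropna is easy to see (illustrative values, not the real dataset):

```python
import numpy as np
import pandas as pd

# A toy frame with one missing Horsepower value (illustrative data only)
toy = pd.DataFrame({
    'Horsepower': [130.0, np.nan, 150.0],
    'MPG': [18.0, 15.0, 16.0],
})
print(toy.dropna().shape)  # (2, 2) - the row containing NaN is gone
```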

Feature processing

# Process the categorical data: the Origin column holds categories 1, 2, 3, representing USA, Europe, Japan respectively
# Pop (delete and return) the origin column first
origin = df.pop('Origin')
# write new 3 columns based on the origin column
df.loc[:, 'USA'] = (origin == 1)*1.0
df.loc[:, 'Europe'] = (origin == 2)*1.0
df.loc[:, 'Japan'] = (origin == 3)*1.0
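The same encoding can also be produced with pandas' built-in one-hot helper; a sketch on toy values (renaming the columns to match the names above, and casting to float to match the *1.0 trick):

```python
import pandas as pd

# Toy Origin column: 1 = USA, 2 = Europe, 3 = Japan
origin = pd.Series([1, 2, 3, 1], name='Origin')
dummies = pd.get_dummies(origin).astype(float)
dummies.columns = ['USA', 'Europe', 'Japan']
print(dummies['USA'].tolist())  # [1.0, 0.0, 0.0, 1.0]
```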

Data partitioning and normalization

The standardization here applies the definition directly; of course, it is easiest to use the sklearn.preprocessing.StandardScaler method.
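A quick check that the two approaches agree. Note that StandardScaler uses the population standard deviation (ddof=0), while pandas' .std() defaults to ddof=1, so describe()-based normalization differs very slightly:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[1.0], [2.0], [3.0], [4.0]])

# Definition: (x - mean) / std, using the population std (ddof=0)
manual = (x - x.mean(axis=0)) / x.std(axis=0)
scaled = StandardScaler().fit_transform(x)
print(np.allclose(manual, scaled))  # True
```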

The data is split into training and test sets in a ratio of 8:2.

from sklearn.model_selection import train_test_split

# MPG is the label; everything else is a feature
x_columns = df.columns.to_list()
x_columns.remove('MPG')
y_columns = ['MPG']
x_sample = df[x_columns]
y_sample = df[y_columns]
# Split into training and test sets in a ratio of 8:2
train_dataset, test_dataset, train_labels, test_labels = train_test_split(
    x_sample, y_sample, test_size=0.2, random_state=0)

The data is normalized.

# Compute the mean and standard deviation of each field in the training set for standardization
train_stats = train_dataset.describe()
train_stats = train_stats.transpose()

# normalized data
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)

Print out the training set and test set size:

# print the size of the training set and test set
print(f'Training set size: {normed_train_data.shape}, labels: {train_labels.shape}')
print(f'Test set size: {normed_test_data.shape}, labels: {test_labels.shape}')

Data observation

Statistics of the data features, plots, and similar methods give a deeper understanding of the data, which helps with model selection and further data processing.

The feature names can be listed with normed_train_data.columns.


Scatter plots between features, to observe linear relationships among them:

from pandas.plotting import scatter_matrix

# A subset of features to plot (the original selection was truncated; these four are an assumption)
attributes = ['Cylinders', 'Displacement', 'Horsepower', 'Weight']
scatter_matrix(normed_train_data[attributes], figsize=(12, 8))

Figure 1 Linear relationship between features

Observe the linear relationship between the feature and the target.


Figure 2 Linear relationship between features and targets

Create a network model

Due to the relatively small amount of data, only a 3-layer fully connected network is created to complete the MPG prediction task.

There are 9 input features, so the number of input nodes is 9. The two hidden layers have 64 nodes each. Since there is only one predicted value, the output layer has 1 node. Because the output is a numerical prediction, either no activation function is used on the output layer, or a ReLU activation can be added (MPG is non-negative).
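The parameter count of this 9 → 64 → 64 → 1 network can be checked by hand (each Dense layer holds inputs × units weights plus units biases):

```python
# Dense layer params = inputs * units + units (bias)
fc1 = 9 * 64 + 64    # 640
fc2 = 64 * 64 + 64   # 4160
fc3 = 64 * 1 + 1     # 65
print(fc1 + fc2 + fc3)  # 4865, matching the model summary output below
```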

TensorFlow's network construction API is very flexible, so the model is built in two ways below.

Way 1

Network construction

from tensorflow.keras import layers
from tensorflow import keras
import tensorflow as tf

class Network(keras.Model):
    # regression network
    def __init__(self):
        super(Network, self).__init__()
        # Create 3 fully connected layers
        self.fc1 = layers.Dense(64, activation='relu')
        self.fc2 = layers.Dense(64, activation='relu')
        self.fc3 = layers.Dense(1)

    def call(self, inputs, training=None, mask=None):
        # Pass through 3 fully connected layers in sequence
        x = self.fc1(inputs)
        x = self.fc2(x)
        x = self.fc3(x)
        return x

Build the model

model = Network()
# The build function creates the internal tensors; 4 is an arbitrary batch size and 9 is the input feature length
model.build(input_shape=(4, 9))
# print network information
model.summary()

# Create an optimizer, specifying a learning rate
optimizer = keras.optimizers.RMSprop(0.001)
Model: "network_1"
Layer (type)                 Output Shape              Param #   
dense_3 (Dense)              multiple                  640       
dense_4 (Dense)              multiple                  4160      
dense_5 (Dense)              multiple                  65        
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0

Build the Dataset object:

# Build the Dataset object from the training data and labels
train_db = tf.data.Dataset.from_tensor_slices(
    (normed_train_data.values, train_labels.values))
# Randomly shuffle and batch
train_db = train_db.shuffle(100).batch(32)

Train the model

loss_log = list()
i = 0

for epoch in range(100):
    for step, (x, y) in enumerate(train_db):
        # Gradient recorder
        with tf.GradientTape() as tape:
            out = model(x)
            loss = tf.reduce_mean(tf.losses.MSE(y, out))
#             mae_loss = tf.reduce_mean(tf.losses.MAE(y, out))
        i += 1
        loss_log.append([i, float(loss)])
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

Metrics

For regression problems, the evaluation metrics generally include MSE (mean squared error), RMSE (root mean squared error), and MAE (mean absolute error). Here MSE is chosen.
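The three metrics on a toy prediction (illustrative numbers):

```python
import numpy as np

y_true = np.array([21.0, 30.5, 18.0])
y_pred = np.array([20.0, 32.0, 18.5])

mse = np.mean((y_true - y_pred) ** 2)    # mean squared error
rmse = np.sqrt(mse)                      # root mean squared error
mae = np.mean(np.abs(y_true - y_pred))   # mean absolute error
print(round(mse, 4), round(rmse, 4), round(mae, 4))
```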

The 'epoch' column below actually marks each training step (batch); it equals the counter i above, i.e. epoch * steps_per_epoch overall.

loss_log_df = pd.DataFrame(loss_log, columns=['epoch', 'MSE'])
epoch = loss_log_df['epoch']

The MSE column is the training loss:

mse = loss_log_df['MSE']

The MSE on the test set:

# Test set results
# normed_test_data, test_labels
out = model(normed_test_data.values)
test_mse = tf.reduce_mean(tf.losses.MSE(test_labels.values, out))

Way 2

Model building

network = keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])
network.build(input_shape=(None, 9))
network.compile(optimizer=keras.optimizers.RMSprop(0.001), loss='mse',
                metrics=['mse', 'mae'])
EPOCH = 200
# Fit with a validation split (the 0.2 fraction is an assumption; the history keys show a validation set was used)
history = network.fit(normed_train_data.values, train_labels.values,
                      epochs=EPOCH, validation_split=0.2)

Metrics

The metric used here is MSE.

The final result on the test set (a single point):

out = network.predict(normed_test_data.values)
test_mse = tf.reduce_mean(tf.losses.MSE(test_labels.values, out))

The intermediate training results are in history.history:

dict_keys(['loss', 'mse', 'mae', 'val_loss', 'val_mse', 'val_mae'])

The training error and the validation error are obtained as follows:

mse = history.history['mse']
val_mse = history.history['val_mse']
epoch = range(EPOCH)


Way 1 - Results

View Results
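The curve in Figure 3 can be reproduced with a minimal plot of the step-wise losses. This is a sketch: loss_log_df and test_mse come from Way 1 above, replaced here by dummy stand-in values so the snippet runs standalone:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for scripting
import matplotlib.pyplot as plt
import pandas as pd

# Dummy stand-ins for the values collected during Way 1 training
loss_log_df = pd.DataFrame({'epoch': range(1, 101),
                            'MSE': [50.0 / i for i in range(1, 101)]})
test_mse = 6.33

fig, ax = plt.subplots()
ax.plot(loss_log_df['epoch'], loss_log_df['MSE'], 'r', label='train mse')
ax.plot(loss_log_df['epoch'].iloc[-1], test_mse, 'go', label='test mse')
ax.legend()
fig.savefig('way1_mse.png')
```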


As shown in Figure 3, the final training set MSE is 2.66, and the test set result is 6.33.

Figure 3 MSE of each set of data

Way 2 - Results

import matplotlib.pyplot as plt

plt.rcParams['axes.unicode_minus'] = False  # keep the minus sign from being displayed as a box

plt.plot(epoch, mse, 'r', label='train mse')
plt.plot(epoch, val_mse, 'b', label='validate mse')
plt.plot(EPOCH, float(test_mse), 'go', label='test mse')

plt.title('training set and validation set MSE, final test set MSE')
plt.legend()
plt.show()

As shown in Figure 4, the final training set MSE is 2.82, the validation set MSE is 7.50, and the test set MSE is 7.60.

Figure 4 MSE of training set and validation set, final MSE of test set

from sklearn.metrics import r2_score

# Note: r2_score expects (y_true, y_pred) in that order
out_r2 = r2_score(test_labels.values, out)

Raw MSE requires familiarity with the data scale to judge whether the model is good, so it is not very intuitive. The R2 score is more intuitive: the closer it is to 1, the better the model. The R2 score of this model on the test set is 0.88.
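The R2 score is easy to verify against its definition, R2 = 1 − SS_res / SS_tot (toy numbers):

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.3, 7.1, 8.6])

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot
print(np.isclose(r2_manual, r2_score(y_true, y_pred)))  # True
```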

Conclusion

The regression prediction of the car performance index MPG was carried out with a fully connected neural network. The main modeling process included data loading, data processing, data observation, and model creation; creating the model in turn covered model selection, model training, and model evaluation.

The problem is clearly a regression problem, the business context is clear, and the model type was selected in advance. The plots visualize the training process and yield the evaluation metrics, including MSE and the R2 score; the final R2 score is 0.88.

The fully connected network is the most basic neural network model. Neural networks are generally considered end-to-end models that do not require feature engineering; this holds for images and audio, where the network extracts features automatically. For some problems, however, doing some feature processing can make training faster and reduce model complexity.

References

1. TensorFlow Deep Learning, 2019.


Posted by kevinc on Sun, 22 May 2022 19:00:03 +0300