3. Evaluating machine learning models

In short, evaluating a model comes down to splitting the data into three sets: training, validation, and test.

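  1. Simple hold-out validation
    Set aside part of the data as the test set, train on the remaining data, and evaluate on the test set at the end. Because the model is tuned between training runs, a validation set is also held out from the remaining data, so that the test set is used only once, for the final score.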

import numpy as np

# `data`, `test_data` and `get_model()` are placeholders for your own dataset and model
num_validation_samples = 10000
# Shuffling the data is usually appropriate
np.random.shuffle(data)
# Define the validation set
validation_data = data[:num_validation_samples]
data = data[num_validation_samples:]
# Define the training set
training_data = data[:]
# Train a model on the training data
# and evaluate it on the validation data
model = get_model()
model.train(training_data)
validation_score = model.evaluate(validation_data)
# At this point you can tune your model,
# retrain it, evaluate it, tune it again...
# Once you have tuned your hyperparameters,
# it is common to train your final model from scratch
# on all non-test data available.
model = get_model()
model.train(np.concatenate([training_data, validation_data]))
test_score = model.evaluate(test_data)

np.random.shuffle shuffles the data in place so that the order in which the samples were stored does not affect training. The next two statements take the first num_validation_samples samples as the validation set, then reassign data to the remaining samples, which become the training set.
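When features and labels live in separate arrays, shuffling each one on its own would break the pairing between a sample and its label. The sketch below is one way to handle that, assuming NumPy arrays: it draws a single permutation and applies it to both before splitting. The function name holdout_split, the seed argument, and the toy data are illustrative assumptions, not part of the original listing.

```python
import numpy as np

def holdout_split(features, labels, num_validation_samples, seed=0):
    """Shuffle features and labels together, then split off a validation set."""
    rng = np.random.default_rng(seed)
    # One shared permutation keeps every sample paired with its own label.
    order = rng.permutation(len(features))
    features, labels = features[order], labels[order]
    # The first num_validation_samples samples become the validation set...
    x_val = features[:num_validation_samples]
    y_val = labels[:num_validation_samples]
    # ...and everything after them becomes the training set.
    x_train = features[num_validation_samples:]
    y_train = labels[num_validation_samples:]
    return (x_train, y_train), (x_val, y_val)

# Toy data (hypothetical): 10 samples whose label is twice the feature value.
features = np.arange(10, dtype=float).reshape(10, 1)
labels = 2.0 * features[:, 0]

(x_train, y_train), (x_val, y_val) = holdout_split(
    features, labels, num_validation_samples=3
)
print(x_train.shape, x_val.shape)  # (7, 1) (3, 1)
```

Fixing the seed makes the split reproducible, which helps when comparing hyperparameter settings against the same validation set.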
This is only the simplest approach. When few samples are available, the validation score can change a lot depending on which samples happen to land in the validation set, so K-fold validation is the better choice.

  2. K-fold validation
    Split the data into k partitions of equal size. For each partition i, train a model on the remaining k - 1 partitions and evaluate it on partition i. The final score is the average of the k scores. This helps when the measured performance of your model depends heavily on how the data happened to be split.
    The K-fold method looks like this:
k = 4
num_validation_samples = len(data) // k
# Assumes `import numpy as np`; `data`, `test_data` and `get_model()` are placeholders
np.random.shuffle(data)
validation_scores = []
for fold in range(k):
    # Select the validation data partition
    validation_data = data[num_validation_samples * fold: num_validation_samples * (fold + 1)]
    # The remainder of the data is used as training data.
    # np.concatenate joins the two slices (on NumPy arrays, "+" would add element-wise)
    training_data = np.concatenate([data[:num_validation_samples * fold],
                                    data[num_validation_samples * (fold + 1):]])
    # Create a brand new instance of our model (untrained)
    model = get_model()
    model.train(training_data)
    validation_score = model.evaluate(validation_data)
    # Save this fold's score so it can be averaged later
    validation_scores.append(validation_score)
# This is our validation score:
# the average of the validation scores of our k folds
validation_score = np.average(validation_scores)
# We train our final model on all non-test data available
model = get_model()
model.train(data)
test_score = model.evaluate(test_data)

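First, k is the number of partitions. Because k rarely divides the total number of samples exactly, the line that computes num_validation_samples uses // (integer division) to keep only the integer part of the result. Any samples left over at the end of the data are never used for validation and always end up in the training partition.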
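The effect of integer division on the fold boundaries is easiest to see with small numbers. The following sketch assumes a hypothetical dataset of 10 samples and k = 4; the variable names fold_bounds and never_validated are made up for the illustration.

```python
num_samples = 10  # hypothetical dataset size
k = 4

# Integer division drops the fractional part: 10 // 4 is 2, not 2.5.
fold_size = num_samples // k

# (start, stop) slice bounds of each validation fold.
fold_bounds = [(fold_size * fold, fold_size * (fold + 1)) for fold in range(k)]

# Indices past the last fold are never selected for validation.
never_validated = list(range(fold_size * k, num_samples))

print(fold_size)        # 2
print(fold_bounds)      # [(0, 2), (2, 4), (4, 6), (6, 8)]
print(never_validated)  # [8, 9]
```

With a large dataset the leftover is negligible, but on a very small one it can be worth choosing k so that it divides the sample count evenly.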
The split itself is then straightforward. Note that the validation score of each fold must be saved, and the scores are averaged only after all k iterations have finished.
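To see the whole loop run end to end, here is a minimal self-contained sketch. It assumes a stand-in model, MeanModel, that exposes the same train/evaluate interface as the listing above: it simply predicts the mean of its training data and reports mean squared error. Both MeanModel and the helper k_fold_scores are hypothetical names introduced for this example, and shuffling is left out so the folds are easy to follow.

```python
import numpy as np

class MeanModel:
    """Hypothetical stand-in model: always predicts the training mean."""

    def train(self, targets):
        self.mean = float(np.mean(targets))

    def evaluate(self, targets):
        # Mean squared error of the constant prediction (lower is better).
        return float(np.mean((np.asarray(targets) - self.mean) ** 2))

def k_fold_scores(data, k, make_model):
    """Return one validation score per fold."""
    fold_size = len(data) // k
    scores = []
    for fold in range(k):
        start, stop = fold_size * fold, fold_size * (fold + 1)
        validation_data = data[start:stop]
        # Everything outside [start, stop) is used for training.
        training_data = np.concatenate([data[:start], data[stop:]])
        model = make_model()  # fresh, untrained model for every fold
        model.train(training_data)
        scores.append(model.evaluate(validation_data))  # keep every score
    return scores

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
scores = k_fold_scores(data, k=4, make_model=MeanModel)
validation_score = np.average(scores)  # averaged only after the loop ends
print(len(scores), validation_score)
```

Passing a factory (make_model) instead of a model instance guarantees that no fold starts from weights trained on another fold's data.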

Posted by rhodesa on Sat, 14 May 2022 09:37:17 +0300