More code: Gitee homepage: https://gitee.com/GZHzzz
Blog homepage: CSDN: https://blog.csdn.net/gzhzzaa
0 Written up front
This article does not dig deeply into the principles behind each algorithm; the goal is to understand these model fusion methods from a macro perspective
1 Voting
Let's start from the simplest method, Voting, which is arguably the most intuitive form of model fusion. Suppose we have a two-class problem and 3 base models: with voting, the class that receives the most votes becomes the final prediction 🤔
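As a concrete illustration, hard voting can be sketched with scikit-learn's `VotingClassifier`; the three base models and the synthetic data below are just placeholders, not a recommendation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# A toy two-class problem for 3 base models
X, y = make_classification(n_samples=200, random_state=0)

# voting='hard': each model casts one vote, the majority class wins
vote = VotingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('knn', KNeighborsClassifier()),
                ('dt', DecisionTreeClassifier(random_state=0))],
    voting='hard')
vote.fit(X, y)
print(vote.predict(X[:5]))
```

With `voting='soft'`, the predicted class probabilities are averaged instead of counting discrete votes.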
2 Averaging
For regression problems, a simple and direct idea is to average the predictions. A slightly better approach is a weighted average, with the weights determined by ranking the models' performance. For example, with three base models A, B, and C ranked 1, 2, and 3 by performance, the weights assigned to them would be 3/6, 2/6, and 1/6 respectively
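A minimal sketch of the rank-based weighted average; the prediction arrays here are made-up numbers:

```python
import numpy as np

# Made-up predictions from three regression models A, B, C
pred_A = np.array([3.2, 1.1, 5.0])
pred_B = np.array([3.0, 1.4, 4.6])
pred_C = np.array([2.8, 1.0, 4.9])

# Ranked 1, 2, 3 by performance -> weights 3/6, 2/6, 1/6
weights = np.array([3, 2, 1]) / 6
blend = weights[0] * pred_A + weights[1] * pred_B + weights[2] * pred_C
print(blend)  # plain averaging would instead use weights 1/3, 1/3, 1/3
```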
These two methods look simple, but the more advanced algorithms that follow can be said to build on them: Bagging and Boosting are both ways of combining many weak classifiers into a strong one 😁
3 Bagging
Bagging samples with replacement, builds a sub-model on each bootstrap sample, and trains it. This process is repeated many times and the results are fused at the end. It can be roughly divided into two steps:

Repeat K times: sample with replacement, build a sub-model on the sample, and train it

Fuse the sub-models. Classification: voting; regression: averaging
Random forest is a classic Bagging-based algorithm; its base classifier is the decision tree
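The two steps above can be sketched with scikit-learn. `BaggingClassifier` uses a decision tree as its default base estimator, and `RandomForestClassifier` is bagging of trees plus random feature subsets; the data here is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=300, random_state=1)

# Steps 1 and 2 in one estimator: 10 bootstrap samples (with replacement),
# one decision tree per sample, fused by majority vote at predict time
bag = BaggingClassifier(n_estimators=10, random_state=1)
bag.fit(X, y)

# Random forest = bagged decision trees + a random feature subset per split
rf = RandomForestClassifier(n_estimators=10, random_state=1)
rf.fit(X, y)
print(bag.score(X, y), rf.score(X, y))
```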
4 Boosting
The Bagging algorithm can be run in parallel, while Boosting is an iterative idea: each training round pays more attention to the examples misclassified in the previous round by giving them larger weights, so that the next round finds it easier to get those examples right. Finally, the weak classifiers are combined by a weighted sum
Similarly, AdaBoost, GBDT, and others are built on the Boosting idea
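A quick sketch of both, again with scikit-learn on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=300, random_state=2)

# AdaBoost: each round up-weights the samples the previous round misclassified,
# then the weak learners are combined by a weighted sum
ada = AdaBoostClassifier(n_estimators=50, random_state=2).fit(X, y)

# GBDT: each round fits a new tree to the current ensemble's residual errors
gbdt = GradientBoostingClassifier(n_estimators=50, random_state=2).fit(X, y)
print(ada.score(X, y), gbdt.score(X, y))
```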
5 Stacking
Stacking is an ensemble idea, and many ensemble algorithms are variants of it. To be precise, stacking fuses models by "learning" (by contrast, weighted fusion and average fusion combine the predictions of several models with a fixed rule or formula): the predictions of the models to be fused are themselves fused by another learning model. The model that does the fusing is called the meta-learner, and the individual models are the primary learners
Stacking is essentially just this direct idea, but it cannot be applied naively. The problem lies in how the primary learners' predictions are obtained: training a model on the entire training set and then predicting the labels of that same training set is undoubtedly very severe overfitting. So the question becomes how to obtain primary-learner predictions while keeping overfitting under control, which brings us to a familiar tune: K-fold cross-validation
Steps of stacking fusion:
step1: Train T primary learners with cross-validation on the Train Set (the second-stage meta-learner is built on the primary learners' outputs, so if the primary learners generalize poorly, the meta-learner will also overfit)
step2: The out-of-fold predictions of the T primary learners on the Train Set become the meta-learner's training data D, so D has T features. The labels of D are the same labels used to train the primary learners
step3: The predictions of the T primary learners on the Test Set become the meta-learner's test set, again with T features, one per model
step4: Train the meta-learner; the labels of its training set D are the same as those used to train the primary learners
In fact, when training the second-layer meta-learner, cross-validation can again be used to improve the meta-learner's predictive power on the test set
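Before fusing, each primary learner's out-of-fold and test-set predictions have to be produced by the K-fold scheme of step1. A minimal sketch of how one such pair of arrays might be generated; the helper name `get_oof` and the 5-fold setup are illustrative, not part of the original code:

```python
import numpy as np
from sklearn.model_selection import KFold

def get_oof(model, X_train, y_train, X_test, n_splits=5):
    """Out-of-fold predictions for one primary learner.

    Every training sample is predicted by a model that never saw it,
    which is what keeps the meta-learner's inputs from overfitting.
    """
    oof = np.zeros(len(X_train))       # OOF predictions on the Train Set
    test_pred = np.zeros(len(X_test))  # fold-averaged predictions on the Test Set
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for trn_idx, val_idx in kf.split(X_train):
        model.fit(X_train[trn_idx], y_train[trn_idx])
        oof[val_idx] = model.predict(X_train[val_idx])
        test_pred += model.predict(X_test) / n_splits
    return oof, test_pred
```

Each primary learner contributes one `oof` column to the meta-learner's training data D and one `test_pred` column to its test set.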
Show me the code, no more talk
Below is the stacking-fusion code; it can be saved as a .py file and called directly 😎
```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import RepeatedKFold
from sklearn.metrics import mean_squared_error, log_loss


def stack_model(oof_1, oof_2, oof_3,
                predictions_1, predictions_2, predictions_3,
                y, eval_type='regression'):
    # Part 1. Data preparation
    # Stack each model's validation (OOF) predictions as columns:
    # train_stack is the meta-learner's training data
    train_stack = np.vstack([oof_1, oof_2, oof_3]).transpose()
    # Stack each model's test-set predictions as columns:
    # test_stack is the meta-learner's test data
    test_stack = np.vstack([predictions_1, predictions_2, predictions_3]).transpose()
    # All-zero array with as many rows as the train set (meta OOF predictions)
    oof = np.zeros(train_stack.shape[0])
    # All-zero array with as many rows as the test set
    predictions = np.zeros(test_stack.shape[0])

    # Part 2. Repeated cross-validation (the CV for stacking's second stage):
    # 5 folds repeated twice; Bayesian ridge regression is trained on each
    # fold's training part and evaluated on its validation part
    folds = RepeatedKFold(n_splits=5, n_repeats=2, random_state=2020)
    # fold_ is the fold counter; trn_idx / val_idx index each fold's
    # training and validation rows
    for fold_, (trn_idx, val_idx) in enumerate(folds.split(train_stack, y)):
        print("fold n°{}".format(fold_ + 1))
        # Samples and labels of this fold's training part
        trn_data, trn_y = train_stack[trn_idx], y[trn_idx]
        # Samples and labels of this fold's validation part
        val_data, val_y = train_stack[val_idx], y[val_idx]
        print("-" * 10 + "Stacking " + str(fold_ + 1) + "-" * 10)
        # Bayesian ridge regression as the final model that fuses the results
        clf = BayesianRidge()
        clf.fit(trn_data, trn_y)
        # Predict the held-out rows; oof is later used for the evaluation metric
        oof[val_idx] = clf.predict(val_data)
        # Each of the 5 * 2 = 10 CV rounds contributes 1/10 of the test prediction
        predictions += clf.predict(test_stack) / (5 * 2)

    if eval_type == 'regression':
        print('mean: ', np.sqrt(mean_squared_error(y, oof)))
    if eval_type == 'binary':
        print('mean: ', log_loss(y, oof))
    # Return the OOF predictions and the fused test-set predictions
    return oof, predictions
```
Written at the end
Ten years spent sharpening one sword; let's keep encouraging each other!
 Fighting!😎
Classic models based on PyTorch: typical agent models implemented with PyTorch
Reinforcement learning classic papers
while True: Go life