Ensemble learning¶
- Random Forests was an example of ensemble learning
    - It just means we use multiple models to try and solve the same problem, and let them vote on the results.
- Random Forests uses bagging (bootstrap aggregating) to implement ensemble learning.
    - Many models are built by training on randomly-drawn subsets of the data.
- Boosting is an alternate technique where each subsequent model in the ensemble gives extra weight to the training samples mis-classified by the previous model (bagging and boosting are sketched in code after this list)
- A bucket of models trains several different models using training data and picks the one that works best with the test data
- Stacking runs multiple models at once on the data and trains a final model to combine their results (see the bucket-of-models/stacking sketch after this list)
- This is how the Netflix Prize was won.
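
Here's a minimal sketch (not from the course) of bagging vs. boosting using scikit-learn's `BaggingClassifier` and `AdaBoostClassifier`; the synthetic dataset and parameter choices are just illustrative assumptions.

```python
# Illustrative sketch: bagging vs. boosting (dataset and parameters are assumptions)
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Bagging: each tree is trained on a bootstrap sample randomly drawn from the data
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=42)

# Boosting: each new shallow tree up-weights the examples the previous trees got wrong
boosting = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=100, random_state=42)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(name, scores.mean())
```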
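
And a sketch (again, not from the course) contrasting a "bucket of models" with stacking: the bucket keeps whichever single model scores best on held-out data, while `StackingClassifier` trains a final model to combine all of them. The candidate models are assumptions chosen for illustration.

```python
# Illustrative sketch: bucket of models vs. stacking (model choices are assumptions)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

candidates = [("logreg", LogisticRegression(max_iter=1000)),
              ("forest", RandomForestClassifier(random_state=42)),
              ("svm", SVC(random_state=42))]

# Bucket of models: train each candidate and keep whichever scores best on the test data
best_name, best_score = max(
    ((name, model.fit(X_train, y_train).score(X_test, y_test)) for name, model in candidates),
    key=lambda pair: pair[1])
print("best single model:", best_name, best_score)

# Stacking: run all candidates and let a final model learn how to combine their outputs
stack = StackingClassifier(estimators=candidates, final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_train, y_train)
print("stacking:", stack.score(X_test, y_test))
```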
Advanced Ensemble learning¶
- Bayes Optimal Classifier
    - Theoretically the best - but always impractical
- Bayesian Parameter Averaging
    - Attempts to make BOC practical - but it's still misunderstood, susceptible to overfitting, and often outperformed by the simpler bagging approach
- Bayesian Model Combination
    - Tries to address all of those problems
    - But in the end, it's about the same as using cross-validation to find the best combination of models (a cross-validation sketch follows below)
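
To make that last point concrete, here's a hedged sketch (not from the course) of "finding the best combination of models with cross-validation": a soft `VotingClassifier` whose model weights are searched with `GridSearchCV`. The weight grid and the three models are illustrative assumptions.

```python
# Illustrative sketch: cross-validation over weighted combinations of models
# (weight grid and models are assumptions)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

vote = VotingClassifier(
    estimators=[("logreg", LogisticRegression(max_iter=1000)),
                ("forest", RandomForestClassifier(random_state=42)),
                ("nb", GaussianNB())],
    voting="soft")

# Try several weightings of the three models; cross-validation picks the
# combination that generalizes best
search = GridSearchCV(vote,
                      param_grid={"weights": [(1, 1, 1), (2, 1, 1), (1, 2, 1), (1, 1, 2)]},
                      cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```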