3. Model selection and evaluation
- 3.1. Cross-validation: evaluating estimator performance
- 3.2. Grid Search: searching for estimator parameters
  - 3.2.1. Exhaustive Grid Search
  - 3.2.2. Randomized Parameter Optimization
  - 3.2.3. Alternatives to brute-force parameter search
    - 3.2.3.1. Model-specific cross-validation
    - 3.2.3.2. Information Criterion
    - 3.2.3.3. Out of Bag Estimates
      - 3.2.3.3.1. sklearn.ensemble.RandomForestClassifier
      - 3.2.3.3.2. sklearn.ensemble.RandomForestRegressor
      - 3.2.3.3.3. sklearn.ensemble.ExtraTreesClassifier
      - 3.2.3.3.4. sklearn.ensemble.ExtraTreesRegressor
      - 3.2.3.3.5. sklearn.ensemble.GradientBoostingClassifier
      - 3.2.3.3.6. sklearn.ensemble.GradientBoostingRegressor
- 3.3. Pipeline: chaining estimators
- 3.4. FeatureUnion: combining feature extractors
- 3.5. Model evaluation: quantifying the quality of predictions
  - 3.5.1. The scoring parameter: defining model evaluation rules
  - 3.5.2. Functions for prediction-error metrics
    - 3.5.2.1. Classification metrics
      - 3.5.2.1.1. Accuracy score
      - 3.5.2.1.2. Confusion matrix
      - 3.5.2.1.3. Classification report
      - 3.5.2.1.4. Hamming loss
      - 3.5.2.1.5. Jaccard similarity coefficient score
      - 3.5.2.1.6. Precision, recall and F-measures
      - 3.5.2.1.7. Hinge loss
      - 3.5.2.1.8. Log loss
      - 3.5.2.1.9. Matthews correlation coefficient
      - 3.5.2.1.10. Receiver operating characteristic (ROC)
      - 3.5.2.1.11. Zero one loss
    - 3.5.2.2. Regression metrics
  - 3.5.3. Clustering metrics
  - 3.5.4. Biclustering metrics
  - 3.5.5. Dummy estimators
- 3.6. Model persistence
- 3.7. Validation curves: plotting scores to evaluate models
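
The sections above cover cross-validation, parameter search, pipelines, and scoring metrics. Below is a minimal sketch tying a few of these together, assuming a recent scikit-learn where these utilities live in `sklearn.model_selection` and `sklearn.pipeline` (older releases placed them in `sklearn.grid_search` and `sklearn.cross_validation`); the dataset and parameter values are illustrative choices, not prescriptions.

```python
# Minimal sketch: a Pipeline (3.3), an exhaustive grid search (3.2.1),
# and cross-validated scoring (3.1, 3.5). Assumes scikit-learn >= 0.18
# so that GridSearchCV and cross_val_score live in sklearn.model_selection.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Chain a scaler and a classifier into a single estimator.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Exhaustive search over the classifier's parameters; the grid values
# here are arbitrary examples.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}
search = GridSearchCV(pipe, param_grid, cv=5)

# Evaluate the tuned pipeline with an outer cross-validation loop,
# using accuracy as the scoring rule (3.5.1).
scores = cross_val_score(search, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```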