- 1. Supervised learning
- 1.1. Generalized Linear Models
- 1.1.1. Ordinary Least Squares
- 1.1.2. Ridge Regression
- 1.1.3. Lasso
- 1.1.4. Elastic Net
- 1.1.5. Multi-task Lasso
- 1.1.6. Least Angle Regression
- 1.1.7. LARS Lasso
- 1.1.8. Orthogonal Matching Pursuit (OMP)
- 1.1.9. Bayesian Regression
- 1.1.10. Logistic regression
- 1.1.11. Stochastic Gradient Descent - SGD
- 1.1.12. Perceptron
- 1.1.13. Passive Aggressive Algorithms
- 1.1.14. Robustness to outliers: RANSAC
- 1.1.15. Polynomial regression: extending linear models with basis functions
- 1.2. Support Vector Machines
- 1.3. Stochastic Gradient Descent
- 1.4. Nearest Neighbors
- 1.5. Gaussian Processes
- 1.6. Cross decomposition
- 1.7. Naive Bayes
- 1.8. Decision Trees
- 1.9. Ensemble methods
- 1.9.1. Bagging meta-estimator
- 1.9.2. Forests of randomized trees
- 1.9.3. AdaBoost
- 1.9.4. Gradient Tree Boosting
- 1.10. Multiclass and multilabel algorithms
- 1.11. Feature selection
- 1.12. Semi-Supervised
- 1.13. Linear and quadratic discriminant analysis
- 1.14. Isotonic regression
- 2. Unsupervised learning
- 2.1. Gaussian mixture models
- 2.2. Manifold learning
- 2.2.1. Introduction
- 2.2.2. Isomap
- 2.2.3. Locally Linear Embedding
- 2.2.4. Modified Locally Linear Embedding
- 2.2.5. Hessian Eigenmapping
- 2.2.6. Spectral Embedding
- 2.2.7. Local Tangent Space Alignment
- 2.2.8. Multi-dimensional Scaling (MDS)
- 2.2.9. t-distributed Stochastic Neighbor Embedding (t-SNE)
- 2.2.10. Tips on practical use
- 2.3. Clustering
- 2.3.1. Overview of clustering methods
- 2.3.2. K-means
- 2.3.3. Affinity Propagation
- 2.3.4. Mean Shift
- 2.3.5. Spectral clustering
- 2.3.6. Hierarchical clustering
- 2.3.7. DBSCAN
- 2.3.8. Clustering performance evaluation
- 2.4. Biclustering
- 2.5. Decomposing signals in components (matrix factorization problems)
- 2.6. Covariance estimation
- 2.7. Novelty and Outlier Detection
- 2.8. Density Estimation
- 2.9. Neural network models (unsupervised)
- 3. Model selection and evaluation
- 3.1. Cross-validation: evaluating estimator performance
- 3.2. Grid Search: Searching for estimator parameters
- 3.2.1. Exhaustive Grid Search
- 3.2.2. Randomized Parameter Optimization
- 3.2.3. Alternatives to brute force parameter search
- 3.2.3.1. Model specific cross-validation
- 3.2.3.2. Information Criterion
- 3.2.3.3. Out of Bag Estimates
- 3.2.3.3.1. sklearn.ensemble.RandomForestClassifier
- 3.2.3.3.2. sklearn.ensemble.RandomForestRegressor
- 3.2.3.3.3. sklearn.ensemble.ExtraTreesClassifier
- 3.2.3.3.4. sklearn.ensemble.ExtraTreesRegressor
- 3.2.3.3.5. sklearn.ensemble.GradientBoostingClassifier
- 3.2.3.3.6. sklearn.ensemble.GradientBoostingRegressor
- 3.3. Pipeline: chaining estimators
- 3.4. FeatureUnion: Combining feature extractors
- 3.5. Model evaluation: quantifying the quality of predictions
- 3.5.1. The scoring parameter: defining model evaluation rules
- 3.5.2. Function for prediction-error metrics
- 3.5.2.1. Classification metrics
- 3.5.2.1.1. Accuracy score
- 3.5.2.1.2. Confusion matrix
- 3.5.2.1.3. Classification report
- 3.5.2.1.4. Hamming loss
- 3.5.2.1.5. Jaccard similarity coefficient score
- 3.5.2.1.6. Precision, recall and F-measures
- 3.5.2.1.7. Hinge loss
- 3.5.2.1.8. Log loss
- 3.5.2.1.9. Matthews correlation coefficient
- 3.5.2.1.10. Receiver operating characteristic (ROC)
- 3.5.2.1.11. Zero one loss
- 3.5.2.2. Regression metrics
- 3.5.3. Clustering metrics
- 3.5.4. Biclustering metrics
- 3.5.5. Dummy estimators
- 3.6. Model persistence
- 3.7. Validation curves: plotting scores to evaluate models
- 4. Dataset transformations
- 4.1. Feature extraction
- 4.1.1. Loading features from dicts
- 4.1.2. Feature hashing
- 4.1.3. Text feature extraction
- 4.1.3.1. The Bag of Words representation
- 4.1.3.2. Sparsity
- 4.1.3.3. Common Vectorizer usage
- 4.1.3.4. Tf–idf term weighting
- 4.1.3.5. Decoding text files
- 4.1.3.6. Applications and examples
- 4.1.3.7. Limitations of the Bag of Words representation
- 4.1.3.8. Vectorizing a large text corpus with the hashing trick
- 4.1.3.9. Performing out-of-core scaling with HashingVectorizer
- 4.1.3.10. Customizing the vectorizer classes
- 4.1.4. Image feature extraction
- 4.2. Preprocessing data
- 4.3. Kernel Approximation
- 4.4. Random Projection
- 4.5. Pairwise metrics, Affinities and Kernels
- 5. Dataset loading utilities
- 5.1. General dataset API
- 5.2. Toy datasets
- 5.3. Sample images
- 5.4. Sample generators
- 5.5. Datasets in svmlight / libsvm format
- 5.6. The Olivetti faces dataset
- 5.7. The 20 newsgroups text dataset
- 5.8. Downloading datasets from the mldata.org repository
- 5.9. The Labeled Faces in the Wild face recognition dataset
- 5.10. Forest covertypes
- 6. Strategies to scale computationally: bigger data
- 7. Computational Performance