scikit-learn v0.20.dev0

General-purpose and introductory examples for scikit-learn.

Plotting Cross-Validated Predictions

Concatenating multiple feature extraction methods

Pipelining: chaining a PCA and a logistic regression

Isotonic Regression

Imputing missing values before building an estimator

Face completion with multi-output estimators

Selecting dimensionality reduction with Pipeline and GridSearchCV

Multilabel classification

Comparing anomaly detection algorithms for outlier detection on toy datasets

The Johnson-Lindenstrauss bound for embedding with random projections

Comparison of kernel ridge regression and SVR

Feature Union with Heterogeneous Data Sources

Explicit feature map approximation for RBF kernels

Applications to real-world problems with medium-sized datasets or an interactive user interface.

Outlier detection on a real data set

Compressive sensing: tomography reconstruction with L1 prior (Lasso)

Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation

Face recognition example using eigenfaces and SVMs

Model Complexity Influence

Species distribution modeling

Wikipedia principal eigenvector

Visualizing the stock market structure

Libsvm GUI

Prediction Latency

Out-of-core classification of text documents

Examples concerning the sklearn.cluster.bicluster module.

sklearn.cluster.bicluster

A demo of the Spectral Co-Clustering algorithm

A demo of the Spectral Biclustering algorithm

Biclustering documents with the Spectral Co-clustering algorithm

Examples illustrating the calibration of predicted probabilities of classifiers.

Comparison of Calibration of Classifiers

Probability Calibration curves

Probability calibration of classifiers

Probability Calibration for 3-class classification

General examples about classification algorithms.

Recognizing hand-written digits

Normal and Shrinkage Linear Discriminant Analysis for classification

Plot classification probability

Classifier comparison

Linear and Quadratic Discriminant Analysis with covariance ellipsoid

Examples concerning the sklearn.cluster module.

sklearn.cluster

Feature agglomeration

A demo of the mean-shift clustering algorithm

Demonstration of k-means assumptions

Segmenting the picture of a raccoon face in regions

A demo of structured Ward hierarchical clustering on a raccoon face image

Online learning of a dictionary of parts of faces

Vector Quantization Example

Agglomerative clustering with and without structure

Demo of affinity propagation clustering algorithm

Various Agglomerative Clustering on a 2D embedding of digits

K-means Clustering

Spectral clustering for image segmentation

Demo of DBSCAN clustering algorithm

Color Quantization using K-Means

Hierarchical clustering: structured vs unstructured ward

Agglomerative clustering with different metrics

Compare BIRCH and MiniBatchKMeans

Empirical evaluation of the impact of k-means initialization

Adjustment for chance in clustering performance evaluation

A demo of K-Means clustering on the handwritten digits data

Feature agglomeration vs. univariate selection

Comparison of the K-Means and MiniBatchKMeans clustering algorithms

Selecting the number of clusters with silhouette analysis on KMeans clustering

Comparing different clustering algorithms on toy datasets

Examples concerning the sklearn.covariance module.

sklearn.covariance

Ledoit-Wolf vs OAS estimation

Sparse inverse covariance estimation

Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood

Robust covariance estimation and Mahalanobis distances relevance

Outlier detection with several methods

Robust vs Empirical covariance estimate

Examples concerning the sklearn.cross_decomposition module.

sklearn.cross_decomposition

Compare cross decomposition methods

Examples concerning the sklearn.datasets module.

sklearn.datasets

The Digit Dataset

The Iris Dataset

Plot randomly generated classification dataset

Plot randomly generated multilabel dataset

Examples concerning the sklearn.decomposition module.

sklearn.decomposition

Beta-divergence loss functions

PCA example with Iris Data-set

Incremental PCA

Comparison of LDA and PCA 2D projection of Iris dataset

Blind source separation using FastICA

Principal components analysis (PCA)

FastICA on 2D point clouds

Kernel PCA

Sparse coding with a precomputed dictionary

Model selection with Probabilistic PCA and Factor Analysis (FA)

Image denoising using dictionary learning

Faces dataset decompositions

Examples concerning the sklearn.ensemble module.

sklearn.ensemble

Decision Tree Regression with AdaBoost

Pixel importances with a parallel forest of trees

Feature importances with forests of trees

IsolationForest example

Plot the decision boundaries of a VotingClassifier

Comparing random forests and the multi-output meta estimator

Prediction Intervals for Gradient Boosting Regression

Gradient Boosting regularization

Plot class probabilities calculated by the VotingClassifier

Gradient Boosting regression

OOB Errors for Random Forests

Two-class AdaBoost

Hashing feature transformation using Totally Random Trees

Partial Dependence Plots

Discrete versus Real AdaBoost

Multi-class AdaBoosted Decision Trees

Early stopping of Gradient Boosting

Feature transformations with ensembles of trees

Gradient Boosting Out-of-Bag estimates

Single estimator versus bagging: bias-variance decomposition

Plot the decision surfaces of ensembles of trees on the iris dataset

Exercises for the tutorials

Digits Classification Exercise

Cross-validation on Digits Dataset Exercise

SVM Exercise

Cross-validation on diabetes Dataset Exercise

Examples concerning the sklearn.feature_selection module.

sklearn.feature_selection

Recursive feature elimination

Comparison of F-test and mutual information

Pipeline Anova SVM

Recursive feature elimination with cross-validation

Feature selection using SelectFromModel and LassoCV

Test with permutations the significance of a classification score

Univariate Feature Selection

Examples concerning the sklearn.gaussian_process module.

sklearn.gaussian_process

Illustration of Gaussian process classification (GPC) on the XOR dataset

Gaussian process classification (GPC) on iris dataset

Comparison of kernel ridge and Gaussian process regression

Gaussian process regression (GPR) on Mauna Loa CO2 data

Illustration of prior and posterior Gaussian process for different kernels

Iso-probability lines for Gaussian Processes classification (GPC)

Probabilistic predictions with Gaussian process classification (GPC)

Gaussian process regression (GPR) with noise-level estimation

Gaussian Processes regression: basic introductory example

Examples concerning the sklearn.linear_model module.

sklearn.linear_model

Lasso path using LARS

Plot Ridge coefficients as a function of the regularization

SGD: Maximum margin separating hyperplane

Path with L1-Logistic Regression

SGD: convex loss functions

Plot Ridge coefficients as a function of the L2 regularization

Ordinary Least Squares and Ridge Regression Variance

Logistic function

Polynomial interpolation

Logistic Regression 3-class Classifier

SGD: Weighted samples

Linear Regression Example

Robust linear model estimation using RANSAC

Sparsity Example: Fitting only features 1 and 2

Lasso on dense and sparse data

HuberRegressor vs Ridge on dataset with strong outliers

Comparing various online solvers

SGD: Penalties

Joint feature selection with multi-task Lasso

Lasso and Elastic Net for Sparse Signals

MNIST classification using multinomial logistic + L1

Orthogonal Matching Pursuit

Plot multi-class SGD on the iris dataset

L1 Penalty and Sparsity in Logistic Regression

Theil-Sen Regression

Plot multinomial and One-vs-Rest Logistic Regression

Robust linear estimator fitting

Lasso and Elastic Net

Automatic Relevance Determination Regression (ARD)

Bayesian Ridge Regression

Lasso model selection: Cross-Validation / AIC / BIC

Multiclass sparse logistic regression on the 20 newsgroups dataset

Examples concerning the sklearn.manifold module.

sklearn.manifold

Swiss Roll reduction with LLE

Multi-dimensional scaling

Comparison of Manifold Learning methods

t-SNE: The effect of various perplexity values on the shape

Manifold Learning methods on a severed sphere

Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…

Examples concerning the sklearn.mixture module.

sklearn.mixture

Density Estimation for a Gaussian mixture

Gaussian Mixture Model Ellipsoids

Gaussian Mixture Model Selection

GMM covariances

Gaussian Mixture Model Sine Curve

Concentration Prior Type Analysis of Variational Bayesian Gaussian Mixture

Examples related to the sklearn.model_selection module.

sklearn.model_selection

Plotting Validation Curves

Underfitting vs. Overfitting

Parameter estimation using grid search with cross-validation

Train error vs Test error

Receiver Operating Characteristic (ROC) with cross validation

Confusion matrix

Comparing randomized search and grid search for hyperparameter estimation

Nested versus non-nested cross-validation

Demonstration of multi-metric evaluation on cross_val_score and GridSearchCV

Sample pipeline for text feature extraction and evaluation

Receiver Operating Characteristic (ROC)

Plotting Learning Curves

Precision-Recall

Examples concerning the sklearn.multioutput module.

sklearn.multioutput

Classifier Chain

Examples concerning the sklearn.neighbors module.

sklearn.neighbors

Anomaly detection with Local Outlier Factor (LOF)

Nearest Neighbors regression

Nearest Neighbors Classification

Nearest Centroid Classification

Kernel Density Estimation

Kernel Density Estimate of Species Distributions

Simple 1D Kernel Density Estimation

Examples concerning the sklearn.neural_network module.

sklearn.neural_network

Visualization of MLP weights on MNIST

Compare Stochastic learning strategies for MLPClassifier

Restricted Boltzmann Machine features for digit classification

Varying regularization in Multi-layer Perceptron

Examples concerning the sklearn.preprocessing module.

sklearn.preprocessing

Using FunctionTransformer to select columns

Importance of Feature Scaling

Compare the effect of different scalers on data with outliers

Examples concerning the sklearn.semi_supervised module.

sklearn.semi_supervised

Decision boundary of label propagation versus SVM on the Iris dataset

Label Propagation learning a complex structure

Label Propagation digits: Demonstrating performance

Label Propagation digits active learning

Examples concerning the sklearn.svm module.

sklearn.svm

Non-linear SVM

SVM: Maximum margin separating hyperplane

Support Vector Regression (SVR) using linear and non-linear kernels

SVM with custom kernel

SVM: Weighted samples

SVM: Separating hyperplane for unbalanced classes

SVM-Kernels

SVM-Anova: SVM with univariate feature selection

SVM Margins Example

One-class SVM with non-linear kernel (RBF)

Plot different SVM classifiers in the iris dataset

Scaling the regularization parameter for SVCs

RBF SVM parameters

Examples concerning the sklearn.feature_extraction.text module.

sklearn.feature_extraction.text

FeatureHasher and DictVectorizer Comparison

Clustering text documents using k-means

Classification of text documents using sparse features

Examples concerning the sklearn.tree module.

sklearn.tree

Decision Tree Regression

Multi-output Decision Tree Regression

Plot the decision surface of a decision tree on the iris dataset

Understanding the decision tree structure

Gallery generated by Sphinx-Gallery