Note

Go to the end to download the full example code. or to run this example in your browser via JupyterLite or Binder

Displaying Pipelines#

The default configuration for displaying a pipeline in a Jupyter Notebook is 'diagram' where set_config(display='diagram'). To deactivate HTML representation, use set_config(display='text').

To see more detailed steps in the visualization of the pipeline, click on the steps in the pipeline.

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

Displaying a Pipeline with a Preprocessing Step and Classifier#

This section constructs a Pipeline with a preprocessing step, StandardScaler, and classifier, LogisticRegression, and displays its visual representation.

from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)

To visualize the diagram, the default is display='diagram'.

set_config(display="diagram")
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

To view the text pipeline, change to display='text'.

set_config(display="text")
pipe

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

Put back the default display

set_config(display="diagram")

Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier#

This section constructs a Pipeline with multiple preprocessing steps, PolynomialFeatures and StandardScaler, and a classifier step, LogisticRegression, and displays its visual representation.

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('standard_scaler', StandardScaler()),
                ('polynomial', PolynomialFeatures(degree=3)),
                ('classifier', LogisticRegression(C=2.0))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Displaying a Pipeline and Dimensionality Reduction and Classifier#

This section constructs a Pipeline with a dimensionality reduction step, PCA, a classifier, SVC, and displays its visual representation.

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('reduce_dim', PCA(n_components=4)),
                ('classifier', SVC(kernel='linear'))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Displaying a Complex Pipeline Chaining a Column Transformer#

This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('categorical',
                                                  Pipeline(steps=[('imputation_constant',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  ['state', 'gender']),
                                                 ('numerical',
                                                  Pipeline(steps=[('imputation_mean',
                                                                   SimpleImputer()),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['age', 'weight'])])),
                ('logisticregression', LogisticRegression(max_iter=500))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Pipeline

?Documentation for PipelineiNot fitted

Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('categorical',
                                                  Pipeline(steps=[('imputation_constant',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  ['state', 'gender']),
                                                 ('numerical',
                                                  Pipeline(steps=[('imputation_mean',
                                                                   SimpleImputer()),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['age', 'weight'])])),
                ('logisticregression', LogisticRegression(max_iter=500))])

columntransformer: ColumnTransformer

?Documentation for columntransformer: ColumnTransformer

ColumnTransformer(transformers=[('categorical',
                                 Pipeline(steps=[('imputation_constant',
                                                  SimpleImputer(fill_value='missing',
                                                                strategy='constant')),
                                                 ('onehot',
                                                  OneHotEncoder(handle_unknown='ignore'))]),
                                 ['state', 'gender']),
                                ('numerical',
                                 Pipeline(steps=[('imputation_mean',
                                                  SimpleImputer()),
                                                 ('scaler', StandardScaler())]),
                                 ['age', 'weight'])])

categorical

['state', 'gender']

SimpleImputer

?Documentation for SimpleImputer

SimpleImputer(fill_value='missing', strategy='constant')

OneHotEncoder

?Documentation for OneHotEncoder

OneHotEncoder(handle_unknown='ignore')

numerical

['age', 'weight']

SimpleImputer

?Documentation for SimpleImputer

SimpleImputer()

StandardScaler

?Documentation for StandardScaler

StandardScaler()

LogisticRegression

?Documentation for LogisticRegression

LogisticRegression(max_iter=500)

Displaying Pipelines#

Displaying a Pipeline with a Preprocessing Step and Classifier#

Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier#

Displaying a Pipeline and Dimensionality Reduction and Classifier#

Displaying a Complex Pipeline Chaining a Column Transformer#

Displaying a Grid Search over a Pipeline with a Classifier#