This is documentation for an old release of Scikit-learn (version 1.0). Try the latest stable release (version 1.6) or development (unstable) versions.
Displaying Pipelines
The default configuration for displaying a pipeline is 'text', which corresponds to set_config(display='text'). To visualize the pipeline as a diagram in a Jupyter Notebook, call set_config(display='diagram') and then output the pipeline object.
To see more detail in the visualization, click on the individual steps of the pipeline.
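As a minimal sketch of how the display setting can be inspected and switched, the snippet below uses get_config and config_context from sklearn alongside set_config. The variable names are illustrative only.

```python
from sklearn import config_context, get_config, set_config

# Select the plain-text representation globally.
set_config(display="text")
mode_global = get_config()["display"]

# Switch to the diagram representation only inside this block;
# the previous setting is restored when the block exits.
with config_context(display="diagram"):
    mode_inside = get_config()["display"]

mode_after = get_config()["display"]
print(mode_global, mode_inside, mode_after)
```

Using config_context keeps the change local, which can be convenient when only one cell or function should render diagrams.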
Displaying a Pipeline with a Preprocessing Step and Classifier
This section constructs a Pipeline with a preprocessing step, StandardScaler, and a classifier, LogisticRegression, and displays its visual representation.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import set_config
steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)
To view the text representation of the pipeline, use the default display='text'.
set_config(display="text")
pipe
Out:
Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])
To visualize the diagram, change to display='diagram'.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
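Outside a notebook, the same diagram can be obtained as an HTML string. The sketch below uses sklearn.utils.estimator_html_repr for this; the output file name pipeline_diagram.html is a hypothetical choice made for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# Render the diagram to a standalone HTML string.
html = estimator_html_repr(pipe)

# Hypothetical file name; open the file in a browser to view the diagram.
with open("pipeline_diagram.html", "w", encoding="utf-8") as f:
    f.write(html)
```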
Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier
This section constructs a Pipeline with multiple preprocessing steps, StandardScaler and PolynomialFeatures, and a classifier step, LogisticRegression, and displays its visual representation.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LogisticRegression
from sklearn import set_config
steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
To visualize the diagram, change to display='diagram'.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
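The step names and parameters that appear in the diagram can also be read programmatically. A minimal sketch, reusing the step names from this section (the variable names are illustrative):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

pipe = Pipeline(
    [
        ("standard_scaler", StandardScaler()),
        ("polynomial", PolynomialFeatures(degree=3)),
        ("classifier", LogisticRegression(C=2.0)),
    ]
)

# Step names in execution order.
step_names = [name for name, _ in pipe.steps]

# Look up a step by name and read one of its parameters.
degree = pipe.named_steps["polynomial"].degree

# Nested parameters use the "<step>__<parameter>" naming scheme.
c_value = pipe.get_params()["classifier__C"]

print(step_names, degree, c_value)
```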
Displaying a Pipeline with Dimensionality Reduction and a Classifier
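A minimal sketch of such a pipeline follows. It assumes PCA for the dimensionality reduction step and a linear-kernel SVC as the classifier; both estimators, their parameters, and the step names are illustrative assumptions and are not taken from the text of this section.

```python
from sklearn import set_config
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Assumed estimators: PCA as the reducer, SVC as the classifier.
steps = [
    ("reduce_dim", PCA(n_components=4)),
    ("classifier", SVC(kernel="linear")),
]
pipe = Pipeline(steps)
```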
To visualize the diagram, change to display='diagram'.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
Displaying a Complex Pipeline Chaining a Column Transformer
This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import set_config
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
To visualize the diagram, change to display='diagram'.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
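Because the ColumnTransformer selects columns by name, this pipeline expects a pandas DataFrame with the columns state, gender, age and weight. The sketch below fits it on a tiny made-up DataFrame; every value and the target y are hypothetical and serve only to show the expected input shape.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        ("imputation_constant", SimpleImputer(fill_value="missing", strategy="constant")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))

# Hypothetical toy data with a missing entry in every column.
X = pd.DataFrame(
    {
        "state": ["NY", "CA", np.nan, "NY", "CA", "TX"],
        "gender": ["F", "M", "F", np.nan, "M", "F"],
        "age": [25.0, 32.0, np.nan, 47.0, 51.0, 38.0],
        "weight": [60.0, 82.5, 70.0, np.nan, 90.0, 65.0],
    }
)
y = np.array([0, 1, 0, 1, 1, 0])  # hypothetical binary target

pipe.fit(X, y)
predictions = pipe.predict(X)

# Inspect the feature matrix that reaches the classifier.
features = pipe.named_steps["columntransformer"].transform(X)
print(features.shape, predictions)
```

make_pipeline names its steps after the lowercased class names, which is why the preprocessor is reachable as "columntransformer".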
Displaying a Grid Search over a Pipeline with a Classifier
This section constructs a GridSearchCV over a Pipeline with a RandomForestClassifier and displays its visual representation.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn import set_config
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())]
)
param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["auto", "sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}
grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
To visualize the diagram, change to display='diagram'.
set_config(display="diagram")
grid_search # click on the diagram below to see the details of each step
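Before fitting a search of this size, it can help to check that every key in the grid names a real pipeline parameter and to count the candidates. The sketch below does both with ParameterGrid. To stay short it uses a simplified stand-in pipeline in which a bare StandardScaler replaces the ColumnTransformer; that substitution is an assumption made for illustration and does not affect the classifier parameter names.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simplified stand-in (assumption): StandardScaler instead of the ColumnTransformer.
pipe = Pipeline(
    steps=[("preprocessor", StandardScaler()), ("classifier", RandomForestClassifier())]
)

param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["auto", "sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}

# Keys must follow the "<step>__<parameter>" scheme and exist on the pipeline.
valid_names = set(pipe.get_params())
unknown = [name for name in param_grid if name not in valid_names]

# Number of parameter combinations the search would evaluate.
n_candidates = len(ParameterGrid(param_grid))

# GridSearchCV defaults to 5-fold cross-validation, so each candidate is fit 5 times.
n_fits = n_candidates * 5
print(unknown, n_candidates, n_fits)
```

ParameterGrid only enumerates combinations and does not check that the values are accepted by the estimator, so invalid values would surface only when the search is fit.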