sklearn.compose.make_column_selector

sklearn.compose.make_column_selector(pattern=None, *, dtype_include=None, dtype_exclude=None)

Create a callable to select columns to be used with ColumnTransformer.

make_column_selector can select columns based on their data type, on their name matched against a regex, or both. When multiple selection criteria are given, a column is selected only if it satisfies all of them.
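As a minimal sketch of how the criteria combine, the snippet below builds a small DataFrame (the column names and values are made up for illustration) and asks for columns whose name contains "price" and whose dtype is numeric. Only the column meeting both conditions should come back.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector

# Illustrative data: two columns mention "price", but only one is numeric.
X = pd.DataFrame({
    "price_usd": [1.5, 2.0, 3.25],
    "price_label": ["low", "low", "high"],
    "rating": [5, 3, 4],
})

# Both criteria must hold: name contains "price" AND dtype is numeric.
selector = make_column_selector(pattern="price", dtype_include=np.number)
selected = selector(X)
print(selected)
```

Here "price_label" is rejected by the dtype criterion and "rating" by the name criterion.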

For an example of how to use make_column_selector within a ColumnTransformer to select columns based on data type (i.e. dtype), refer to Column Transformer with Mixed Types.

Parameters:
pattern : str, default=None

Columns whose name contains this regex pattern will be included. If None, columns are not filtered by name.

dtype_include : column dtype or list of column dtypes, default=None

A selection of dtypes to include. For more details, see pandas.DataFrame.select_dtypes.

dtype_exclude : column dtype or list of column dtypes, default=None

A selection of dtypes to exclude. For more details, see pandas.DataFrame.select_dtypes.
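The following sketch exercises each parameter on its own. The DataFrame and its column names are invented for illustration. It assumes the pattern is searched for anywhere in the column name, consistent with the "contains" wording above, so an anchored regex is needed to match a whole name.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector

# Illustrative data with two numeric columns and one string column.
X = pd.DataFrame({
    "age": [31, 45],
    "age_group": ["30s", "40s"],
    "income": [52000.0, 61000.0],
})

# An unanchored pattern matches any name containing "age".
contains_age = make_column_selector(pattern="age")(X)
# Anchoring with ^ and $ restricts the match to the exact name.
exactly_age = make_column_selector(pattern="^age$")(X)

# dtype_include keeps matching dtypes; dtype_exclude drops them.
numeric = make_column_selector(dtype_include=np.number)(X)
non_numeric = make_column_selector(dtype_exclude=np.number)(X)

print(contains_age, exactly_age, numeric, non_numeric)
```

Using dtype_exclude=np.number is one way to pick up every non-numeric column without naming a specific string dtype.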

Returns:
selector : callable

Callable for column selection to be used by a ColumnTransformer.
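To make the return value concrete, the sketch below calls the selector directly on a DataFrame and then compares passing the callable to a column transformer against passing the resolved column names explicitly. The data mirrors the example in this page; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector, make_column_transformer
from sklearn.preprocessing import StandardScaler

X = pd.DataFrame({"city": ["London", "London", "Paris", "Sallisaw"],
                  "rating": [5, 3, 4, 5]})

selector = make_column_selector(dtype_include=np.number)

# Calling the selector on a DataFrame yields the matching column names.
columns = selector(X)

# Passing the callable, or the names it resolves to, should give the
# same transformed output for this DataFrame.
ct_callable = make_column_transformer((StandardScaler(), selector))
ct_explicit = make_column_transformer((StandardScaler(), columns))
out_callable = ct_callable.fit_transform(X)
out_explicit = ct_explicit.fit_transform(X)

print(columns)
print(np.allclose(out_callable, out_explicit))
```

The practical difference is that the callable is evaluated against whatever DataFrame is passed at fit time, so the same transformer definition can be reused on data with different column names.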

See also

ColumnTransformer

Class that allows combining the outputs of multiple transformer objects used on column subsets of the data into a single feature space.

Examples

>>> from sklearn.preprocessing import StandardScaler, OneHotEncoder
>>> from sklearn.compose import make_column_transformer
>>> from sklearn.compose import make_column_selector
>>> import numpy as np
>>> import pandas as pd  
>>> X = pd.DataFrame({'city': ['London', 'London', 'Paris', 'Sallisaw'],
...                   'rating': [5, 3, 4, 5]})  
>>> ct = make_column_transformer(
...       (StandardScaler(),
...        make_column_selector(dtype_include=np.number)),  # rating
...       (OneHotEncoder(),
...        make_column_selector(dtype_include=object)))  # city
>>> ct.fit_transform(X)  
array([[ 0.90453403,  1.        ,  0.        ,  0.        ],
       [-1.50755672,  1.        ,  0.        ,  0.        ],
       [-0.30151134,  0.        ,  1.        ,  0.        ],
       [ 0.90453403,  0.        ,  0.        ,  1.        ]])

Examples using sklearn.compose.make_column_selector

Categorical Feature Support in Gradient Boosting

Combine predictors using stacking

Evaluation of outlier detection estimators

Column Transformer with Mixed Types
