`sklearn.ensemble`.StackingClassifier

class sklearn.ensemble.StackingClassifier(estimators, final_estimator=None, *, cv=None, stack_method='auto', n_jobs=None, passthrough=False, verbose=0)

Stack of estimators with a final classifier.

Stacked generalization consists of stacking the outputs of the individual estimators and using a classifier to compute the final prediction. Stacking exploits the strengths of each individual estimator by using their outputs as the input of a final estimator.

Note that estimators_ are fitted on the full X, while final_estimator_ is trained on cross-validated predictions of the base estimators obtained with cross_val_predict.
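The training scheme described above can be sketched by hand. This is an illustrative approximation under stated assumptions, not the library's internal code: the choice of base estimators, `cv=5`, and the helper name `stacked_predict` are all hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
base_estimators = [
    RandomForestClassifier(n_estimators=10, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

# Out-of-fold probabilities: every row is predicted by a model
# that did not see that row during training.
meta_features = np.hstack([
    cross_val_predict(clone(est), X, y, cv=5, method="predict_proba")
    for est in base_estimators
])

# The final estimator is trained only on the out-of-fold predictions.
final_estimator = LogisticRegression(max_iter=1000).fit(meta_features, y)

# The base estimators used at prediction time are refitted on the full X.
fitted_base = [clone(est).fit(X, y) for est in base_estimators]


def stacked_predict(X_new):
    """Hypothetical helper: stack base outputs, then apply the final estimator."""
    features = np.hstack([est.predict_proba(X_new) for est in fitted_base])
    return final_estimator.predict(features)


print(meta_features.shape)
print(stacked_predict(X[:5]))
```

Training the final estimator on out-of-fold predictions keeps it from learning on outputs that the base estimators produced for samples they had already memorized.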

See also

StackingRegressor: Stack of estimators with a final regressor.

Notes

When predict_proba is used by each estimator (i.e. most of the time for stack_method='auto', or specifically for stack_method='predict_proba'), the first column predicted by each estimator is dropped in the case of a binary classification problem, because the two columns would be perfectly collinear.
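The effect of dropping the first column can be checked on a small binary problem. The sketch below assumes a synthetic dataset from `make_classification` and two arbitrarily chosen base estimators; with two estimators, the stacked features should have one column per estimator rather than two.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Binary problem (make_classification defaults to two classes).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
)
clf.fit(X, y)

meta = clf.transform(X)
print(meta.shape)  # one column per base estimator

# The retained column is the probability of the second class.
rf_proba = clf.estimators_[0].predict_proba(X)
print(np.allclose(meta[:, 0], rf_proba[:, 1]))
```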

References

[1] Wolpert, David H. “Stacked generalization.” Neural Networks 5.2 (1992): 241-259.

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.svm import LinearSVC
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.ensemble import StackingClassifier
>>> X, y = load_iris(return_X_y=True)
>>> estimators = [
...     ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
...     ('svr', make_pipeline(StandardScaler(),
...                           LinearSVC(random_state=42)))
... ]
>>> clf = StackingClassifier(
...     estimators=estimators, final_estimator=LogisticRegression()
... )
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, stratify=y, random_state=42
... )
>>> clf.fit(X_train, y_train).score(X_test, y_test)
0.9...

Methods

`decision_function`(X)	Decision function for samples in `X` using the final estimator.
`fit`(X, y[, sample_weight])	Fit the estimators.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_feature_names_out`([input_features])	Get output feature names for transformation.
`get_params`([deep])	Get the parameters of an estimator from the ensemble.
`predict`(X, **predict_params)	Predict target for X.
`predict_proba`(X)	Predict class probabilities for `X` using the final estimator.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of an estimator from the ensemble.
`transform`(X)	Return class labels or probabilities for X for each estimator.

decision_function(X)

Decision function for samples in X using the final estimator.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns:

decisions : ndarray of shape (n_samples,), (n_samples, n_classes), or (n_samples, n_classes * (n_classes-1) / 2)
    The decision function computed by the final estimator.
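A minimal usage sketch follows, assuming a final estimator that itself exposes decision_function (LogisticRegression here) and a three-class dataset, so the result is expected to have one column per class. The base estimators are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X, y)

# Scores come from the final estimator applied to the stacked features.
scores = clf.decision_function(X[:5])
print(scores.shape)
```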

fit(X, y, sample_weight=None)

Fit the estimators.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.
y : array-like of shape (n_samples,)
    Target values.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights. If None, then samples are equally weighted. Note that this is supported only if all underlying estimators support sample weights.

Returns:

self : object
    Returns a fitted instance of estimator.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X : array-like of shape (n_samples, n_features)
    Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None
    Target values (None for unsupervised transformations).
**fit_params : dict
    Additional fit parameters.

Returns:

X_new : ndarray of shape (n_samples, n_features_new)
    Transformed array.

get_feature_names_out(input_features=None)

Get output feature names for transformation.

Parameters:

input_features : array-like of str or None, default=None
    Input features. The input feature names are only used when passthrough is True.

    If input_features is None, then feature_names_in_ is used as the input feature names. If feature_names_in_ is not defined, then names are generated: [x0, x1, ..., x(n_features_in_ - 1)].
    If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.

    If passthrough is False, then only the names of estimators are used to generate the output feature names.

Returns:

feature_names_out : ndarray of str objects
    Transformed feature names.

get_params(deep=True)

Get the parameters of an estimator from the ensemble.

Returns the parameters given in the constructor as well as the estimators contained within the estimators parameter.

Parameters:

deep : bool, default=True
    Setting it to True gets the various estimators and the parameters of the estimators as well.

Returns:

params : dict
    Parameter and estimator names mapped to their values or parameter names mapped to their values.
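The difference between deep=True and deep=False can be seen by inspecting the returned keys. In this sketch the estimator names "rf" and "knn" are illustrative choices; no fitting is required.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),
)

shallow = clf.get_params(deep=False)
deep = clf.get_params(deep=True)

# Constructor parameters only.
print(sorted(shallow))

# Named estimators and their nested parameters, using the name__param form.
print(deep["rf__n_estimators"])
print("final_estimator__C" in deep)
```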

property n_features_in_: Number of features seen during fit.

property named_estimators

Dictionary to access any fitted sub-estimators by name.

Returns:

Bunch
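A short access sketch follows. It relies on an assumption about current scikit-learn behavior that is worth verifying against the installed version: named_estimators appears to expose the estimators as they were passed in estimators, while the fitted clones are available under the named_estimators_ attribute after fit.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

rf = RandomForestClassifier(n_estimators=10, random_state=0)
clf = StackingClassifier(
    estimators=[("rf", rf), ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X, y)

# Access by name; this returns the object supplied in the constructor.
print(clf.named_estimators["rf"] is rf)

# The fitted counterpart, with learned attributes such as classes_.
fitted_rf = clf.named_estimators_["rf"]
print(fitted_rf.classes_)
```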

predict(X, **predict_params)

Predict target for X.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.
**predict_params : dict of str -> obj
    Parameters passed to the predict method called by the final_estimator. Note that this may be used to return uncertainties from some estimators with return_std or return_cov. Be aware that it only accounts for uncertainty in the final estimator.

Returns:

y_pred : ndarray of shape (n_samples,) or (n_samples, n_output)
    Predicted targets.

predict_proba(X)

Predict class probabilities for X using the final estimator.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns:

probabilities : ndarray of shape (n_samples, n_classes) or list of ndarray of shape (n_output,)
    The class probabilities of the input samples.
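The sketch below calls predict and predict_proba side by side on a three-class problem. The base estimators and the LogisticRegression final estimator are illustrative choices; any final estimator that implements predict_proba should behave similarly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X, y)

labels = clf.predict(X[:5])        # one class label per sample
proba = clf.predict_proba(X[:5])   # one column per class

print(labels)
print(proba.shape)
print(np.allclose(proba.sum(axis=1), 1.0))  # each row is a distribution
```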

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)
    Test samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
    True labels for X.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

Returns:

score : float
    Mean accuracy of self.predict(X) with respect to y.
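As a quick check of what score computes, the sketch below compares it with accuracy_score applied to the output of predict. The split parameters and estimators are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X_train, y_train)

score = clf.score(X_test, y_test)
manual = accuracy_score(y_test, clf.predict(X_test))
print(score == manual)
```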

set_params(**params)

Set the parameters of an estimator from the ensemble.

Valid parameter keys can be listed with get_params(). Note that you can directly set the parameters of the estimators contained in estimators.

Parameters:

**params : keyword arguments
    Specific parameters using e.g. set_params(parameter_name=new_value). In addition to setting the parameters of the estimator, the individual estimators in estimators can also be set, or removed by setting them to 'drop'.

Returns:

self : object
    Estimator instance.
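The sketch below exercises both uses described above: setting a nested parameter with the name__param form, and removing a base estimator by setting it to 'drop'. The estimator names "rf" and "knn" are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)

# Nested parameters of a base estimator and of the final estimator.
clf.set_params(rf__n_estimators=25, final_estimator__C=0.5)

# Remove a base estimator from the stack by name.
clf.set_params(knn="drop")

clf.fit(X, y)
print(clf.get_params()["rf__n_estimators"])
print(len(clf.estimators_))    # dropped estimators are not fitted
print(clf.transform(X).shape)  # only the remaining estimator contributes
```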

transform(X)

Return class labels or probabilities for X for each estimator.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns:

y_preds : ndarray of shape (n_samples, n_estimators) or (n_samples, n_classes * n_estimators)
    Prediction outputs for each estimator.
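To illustrate the output shape, the sketch below transforms a three-class dataset with two base estimators that both provide predict_proba, which should give n_classes * n_estimators columns. It also shows the passthrough=True variant, where the original features are appended. The helper name `build` is hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes


def build(passthrough):
    """Hypothetical helper returning an unfitted stacking classifier."""
    return StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        passthrough=passthrough,
    )


stacked = build(passthrough=False).fit(X, y).transform(X)
stacked_pt = build(passthrough=True).fit(X, y).transform(X)

print(stacked.shape)     # 3 classes * 2 estimators
print(stacked_pt.shape)  # stacked predictions plus the 4 original features
```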

Examples using `sklearn.ensemble.StackingClassifier`

Release Highlights for scikit-learn 0.22
