sklearn.inspection.plot_partial_dependence
sklearn.inspection.plot_partial_dependence(estimator, X, features, feature_names=None, target=None, response_method='auto', n_cols=3, grid_resolution=100, percentiles=(0.05, 0.95), method='auto', n_jobs=None, verbose=0, fig=None, line_kw=None, contour_kw=None)
Partial dependence plots.
The len(features) plots are arranged in a grid with n_cols columns. Two-way partial dependence plots are plotted as contour plots.
Read more in the User Guide.
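As a rough illustration of the grid arrangement, the sketch below computes how many rows and columns a given number of plots would occupy. The helper name grid_shape is hypothetical, and the rule shown (cap the column count at the number of plots, then round the row count up) is an assumption about the layout, not a statement of the library's internals.

```python
import math


def grid_shape(n_plots, n_cols=3):
    """Return (n_rows, n_cols) for n_plots axes laid out row by row.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    n_cols = min(n_cols, n_plots)         # never more columns than plots
    n_rows = math.ceil(n_plots / n_cols)  # enough rows to hold every plot
    return n_rows, n_cols


# Three one-way PDPs and one two-way PDP give four plots in total.
features = [0, 1, (0, 1), 2]
print(grid_shape(len(features)))
```

With the default n_cols=3, four plots would fill one full row of three and leave a second row with a single plot.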
Parameters:
- estimator : BaseEstimator
A fitted estimator object implementing predict, predict_proba, or decision_function. Multioutput-multiclass classifiers are not supported.
- X : array-like, shape (n_samples, n_features)
The data to use to build the grid of values on which the dependence will be evaluated. This is usually the training data.
- features : list of {int, str, pair of int, pair of str}
The target features for which to create the PDPs. If features[i] is an int or a string, a one-way PDP is created; if features[i] is a tuple, a two-way PDP is created. Each tuple must be of size 2. If any entry is a string, then it must be in feature_names.
- feature_names : seq of str, shape (n_features,), optional
Name of each feature; feature_names[i] holds the name of the feature with index i. By default, the name of each feature corresponds to its numerical index.
- target : int, optional (default=None)
  - In a multiclass setting, specifies the class for which the PDPs should be computed. Note that for binary classification, the positive class (index 1) is always used.
  - In a multioutput setting, specifies the task for which the PDPs should be computed.
Ignored in binary classification or classical regression settings.
- response_method : 'auto', 'predict_proba' or 'decision_function', optional (default='auto')
Specifies whether to use predict_proba or decision_function as the target response. For regressors this parameter is ignored and the response is always the output of predict. By default, predict_proba is tried first, falling back to decision_function if it does not exist. If method is 'recursion', the response is always the output of decision_function.
- n_cols : int, optional (default=3)
The maximum number of columns in the grid plot.
- grid_resolution : int, optional (default=100)
The number of equally spaced points on the axes of the plots, for each target feature.
- percentiles : tuple of float, optional (default=(0.05, 0.95))
The lower and upper percentile used to create the extreme values for the PDP axes. Must be in [0, 1].
- method : str, optional (default='auto')
The method to use to calculate the partial dependence predictions:
  - 'recursion' is only supported for objects inheriting from BaseGradientBoosting, but is more efficient in terms of speed. With this method, X is optional and is only used to build the grid; the partial dependences are computed using the training data. This method does not account for the init predictor of the boosting process, which may lead to incorrect values (see warning below). With this method, the target response of a classifier is always the decision function, not the predicted probabilities.
  - 'brute' is supported for any estimator, but is more computationally intensive.
  - If 'auto', then 'recursion' will be used for BaseGradientBoosting estimators with init=None, and 'brute' for all others.
Unlike the 'brute' method, 'recursion' does not account for the init predictor of the boosting process. In practice this still produces the same plots, up to a constant offset in the target response.
- n_jobs : int, optional (default=None)
The number of CPUs to use to compute the partial dependences. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
- verbose : int, optional (default=0)
Verbose output during PD computations.
- fig : Matplotlib figure object, optional (default=None)
A figure object onto which the plots will be drawn, after the figure has been cleared. By default, a new one is created.
- line_kw : dict, optional
Dict with keywords passed to the matplotlib.pyplot.plot call. For one-way partial dependence plots.
- contour_kw : dict, optional
Dict with keywords passed to the matplotlib.pyplot.contourf call. For two-way partial dependence plots.
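To make the roles of percentiles, grid_resolution and the 'brute' method concrete, here is a minimal pure-Python sketch of a one-way partial dependence computation. The names percentile, make_grid, brute_partial_dependence and toy_predict are hypothetical, the quantile rule is plain linear interpolation (the library may use a different estimator), and toy_predict merely stands in for a fitted estimator's predict method.

```python
def percentile(values, q):
    """Linear-interpolation quantile for q in [0, 1] (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def make_grid(column, percentiles=(0.05, 0.95), grid_resolution=100):
    """Equally spaced points between the lower and upper percentile."""
    if grid_resolution < 2:
        raise ValueError("grid_resolution must be at least 2 in this sketch")
    low, high = (percentile(column, q) for q in percentiles)
    step = (high - low) / (grid_resolution - 1)
    return [low + i * step for i in range(grid_resolution)]


def brute_partial_dependence(predict, X, feature, grid):
    """Average prediction over X with one feature forced to each grid value."""
    averaged = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so X itself is left untouched
            modified[feature] = value  # overwrite the target feature
            total += predict(modified)
        averaged.append(total / len(X))
    return averaged


def toy_predict(row):
    # Hypothetical model: linear in feature 0, quadratic in feature 1.
    return 2.0 * row[0] + row[1] ** 2


X = [[float(i), float(i % 3)] for i in range(21)]  # 21 samples, 2 features
grid = make_grid([row[0] for row in X], grid_resolution=5)
pd_values = brute_partial_dependence(toy_predict, X, feature=0, grid=grid)
for g, p in zip(grid, pd_values):
    print(f"x0={g:.1f}  partial dependence={p:.3f}")
```

Each grid value costs one full pass over X, which is why a brute-force approach of this kind becomes expensive as grid_resolution or the number of samples grows.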
Warning
The 'recursion' method only works for gradient boosting estimators, and unlike the 'brute' method, it does not account for the init predictor of the boosting process. In practice this will produce the same values as 'brute' up to a constant offset in the target response, provided that init is a constant estimator (which is the default). However, as soon as init is not a constant estimator, the partial dependence values are incorrect for 'recursion'.
See also
sklearn.inspection.partial_dependence : Return raw partial dependence values
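The constant-offset claim in the warning above can be checked on a toy additive model. In the sketch below, the prediction is a constant init term plus a boosted term, and the response is averaged once with and once without the init term. All names (INIT_CONSTANT, boosted_term, averaged_response) are hypothetical, and dropping the init term only mimics what the text says about 'recursion'; it is not the recursion algorithm itself.

```python
INIT_CONSTANT = 3.5  # hypothetical constant init prediction


def boosted_term(row):
    # Hypothetical stand-in for the summed output of the boosting stages.
    return row[0] * row[1] + 0.5 * row[0]


def averaged_response(predict, X, feature, grid):
    """Average predict over X with one feature fixed at each grid value."""
    result = []
    for value in grid:
        rows = [row[:feature] + [value] + row[feature + 1:] for row in X]
        result.append(sum(predict(r) for r in rows) / len(rows))
    return result


X = [[0.0, 1.0], [1.0, 2.0], [2.0, 4.0], [3.0, 1.0]]
grid = [0.0, 1.0, 2.0, 3.0]

# Response that includes the init term, as a full prediction would.
with_init = averaged_response(
    lambda r: INIT_CONSTANT + boosted_term(r), X, 0, grid
)
# Response that ignores the init term.
without_init = averaged_response(boosted_term, X, 0, grid)

# The difference is the same at every grid point, up to rounding.
offsets = [a - b for a, b in zip(with_init, without_init)]
print(offsets)
```

If the init term depended on the features instead of being constant, the differences would vary along the grid, which is the situation the warning flags as incorrect.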
Examples
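>>> from sklearn.datasets import make_friedman1
>>> from sklearn.ensemble import GradientBoostingRegressor
>>> from sklearn.inspection import plot_partial_dependence
>>> X, y = make_friedman1()
>>> clf = GradientBoostingRegressor(n_estimators=10).fit(X, y)
>>> # One-way PDP for feature 0 and two-way PDP for features 0 and 1
>>> plot_partial_dependence(clf, X, [0, (0, 1)])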