This is documentation for an old release of Scikit-learn (version 0.21). Try the latest stable release (version 1.6) or development (unstable) versions.

`sklearn.dummy`.DummyClassifier

class sklearn.dummy.DummyClassifier(strategy='stratified', random_state=None, constant=None)

DummyClassifier is a classifier that makes predictions using simple rules.

This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems.

Read more in the User Guide.

Parameters:

strategy : str, default="stratified"

Strategy to use to generate predictions.

"stratified": generates predictions by respecting the training set's class distribution.
"most_frequent": always predicts the most frequent label in the training set.
"prior": always predicts the class that maximizes the class prior (like "most_frequent"); predict_proba returns the class prior.
"uniform": generates predictions uniformly at random.
"constant": always predicts a constant label that is provided by the user. This is useful for metrics that evaluate a non-majority class.

New in version 0.17: DummyClassifier now supports the "prior" fitting strategy, selected with strategy="prior".
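As a rough illustration of how the strategies differ, the sketch below fits one classifier per strategy on a small label vector. The toy data and variable names are made up for this example and are not part of the reference.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up data: four samples of class 0 and two of class 1.
# DummyClassifier never inspects the feature values.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 0, 1, 1])

predictions = {}
for strategy in ("stratified", "most_frequent", "prior", "uniform"):
    clf = DummyClassifier(strategy=strategy, random_state=0)
    predictions[strategy] = clf.fit(X, y).predict(X)

# "most_frequent" and "prior" always predict the majority class (0 here);
# "stratified" and "uniform" draw random labels from the classes seen in y.
for strategy, y_pred in predictions.items():
    print(strategy, y_pred)
```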

random_state : int, RandomState instance or None, optional, default=None

If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
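The effect of an integer seed can be checked with a short sketch like the one below: two "stratified" classifiers built with the same random_state should produce identical predictions. The data and the seed value 42 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up labels: 15 samples of class 0 and 5 of class 1.
X = np.zeros((20, 1))
y = np.array([0] * 15 + [1] * 5)

# Two independent estimators sharing the same integer seed.
first = DummyClassifier(strategy="stratified", random_state=42).fit(X, y)
second = DummyClassifier(strategy="stratified", random_state=42).fit(X, y)

# With an int seed, the random draws are reproducible across estimators.
same_seed_agrees = np.array_equal(first.predict(X), second.predict(X))
print(same_seed_agrees)
```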

constant : int or str or array of shape = [n_outputs]

The explicit constant as predicted by the "constant" strategy. This parameter is useful only for the "constant" strategy.
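One possible use of the "constant" strategy is to obtain a baseline for a metric that targets the minority class. The sketch below compares recall on the positive class for a constant predictor and a majority-class predictor; the data and the choice of recall_score are assumptions made for this example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import recall_score

# Made-up imbalanced labels: class 1 is the minority class.
X = np.zeros((5, 1))
y = np.array([0, 0, 0, 0, 1])

# Always predict the minority label; the constant must appear in y.
always_one = DummyClassifier(strategy="constant", constant=1).fit(X, y)
majority = DummyClassifier(strategy="most_frequent").fit(X, y)

# Recall on class 1: the constant predictor finds every positive,
# the majority predictor finds none.
recall_constant = recall_score(y, always_one.predict(X), pos_label=1)
recall_majority = recall_score(y, majority.predict(X), pos_label=1)
print(recall_constant, recall_majority)
```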

Attributes:

classes_ : array or list of arrays of shape = [n_classes]: Class labels for each output.
n_classes_ : int or list of int: Number of labels for each output.
class_prior_ : array or list of arrays of shape = [n_classes]: Probability of each class for each output.
n_outputs_ : int: Number of outputs.
sparse_output_ : bool: True if the array returned from predict is to be in sparse CSC format. Automatically set to True if the input y is passed in sparse format.
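The fitted attributes can be inspected as in the sketch below, which uses made-up single-output and two-output targets. For a single output the attributes are plain arrays or scalars; for several outputs they are lists with one entry per output.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((4, 1))

# Single output: three classes with frequencies 2, 1 and 1.
y_single = np.array([0, 0, 1, 2])
clf = DummyClassifier(strategy="prior").fit(X, y_single)
print(clf.classes_, clf.n_classes_, clf.class_prior_)
print(clf.n_outputs_, clf.sparse_output_)

# Two outputs: each column of y is treated as a separate target.
y_multi = np.array([[0, 1], [0, 1], [1, 0], [2, 1]])
clf_multi = DummyClassifier(strategy="prior").fit(X, y_multi)
print(clf_multi.n_outputs_, len(clf_multi.classes_))
```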

Methods

`fit`(self, X, y[, sample_weight])	Fit the random classifier.
`get_params`(self[, deep])	Get parameters for this estimator.
`predict`(self, X)	Perform classification on test vectors X.
`predict_log_proba`(self, X)	Return log probability estimates for the test vectors X.
`predict_proba`(self, X)	Return probability estimates for the test vectors X.
`score`(self, X, y[, sample_weight])	Returns the mean accuracy on the given test data and labels.
`set_params`(self, **params)	Set the parameters of this estimator.

__init__(self, strategy='stratified', random_state=None, constant=None)

fit(self, X, y, sample_weight=None)

Fit the random classifier.

Parameters:

X : {array-like, object with finite length or shape}: Training data, requires length = n_samples.
y : array-like, shape = [n_samples] or [n_samples, n_outputs]: Target values.
sample_weight : array-like of shape = [n_samples], optional: Sample weights.

Returns:

self : object
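The sketch below suggests how sample_weight can change the fitted class prior, and with it the label returned by "most_frequent". The labels and weights are invented for the example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((4, 1))
y = np.array([0, 0, 0, 1])

# Unweighted: class 0 is the most frequent label.
unweighted = DummyClassifier(strategy="most_frequent").fit(X, y)

# Weighted: the single class-1 sample outweighs the three class-0 samples
# (total weight 5 versus 3), so class 1 becomes the "most frequent" label.
weights = np.array([1.0, 1.0, 1.0, 5.0])
weighted = DummyClassifier(strategy="most_frequent").fit(X, y, sample_weight=weights)

print(unweighted.class_prior_, unweighted.predict(X))
print(weighted.class_prior_, weighted.predict(X))
```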

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:

deep : boolean, optional: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any: Parameter names mapped to their values.

predict(self, X)

Perform classification on test vectors X.

Parameters:

X : {array-like, object with finite length or shape}: Test data, requires length = n_samples.

Returns:

y : array, shape = [n_samples] or [n_samples, n_outputs]: Predicted target values for X.

predict_log_proba(self, X)

Return log probability estimates for the test vectors X.

Parameters:

X : {array-like, object with finite length or shape}: Test data, requires length = n_samples.

Returns:

P : array-like or list of array-like of shape = [n_samples, n_classes]: Returns the log probability of the sample for each class in the model, where classes are ordered arithmetically, for each output.

predict_proba(self, X)

Return probability estimates for the test vectors X.

Parameters:

X : {array-like, object with finite length or shape}: Test data, requires length = n_samples.

Returns:

P : array-like or list of array-like of shape = [n_samples, n_classes]: Returns the probability of the sample for each class in the model, where classes are ordered arithmetically, for each output.
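The difference between "most_frequent" and "prior" shows up in predict_proba rather than in predict. The sketch below, on made-up labels, prints both probability outputs along with the log probabilities for the "prior" strategy.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up labels: class 0 has prior 2/3, class 1 has prior 1/3.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 0, 1, 1])

most_frequent = DummyClassifier(strategy="most_frequent").fit(X, y)
prior = DummyClassifier(strategy="prior").fit(X, y)

# "most_frequent" puts all probability mass on the majority class;
# "prior" returns the empirical class distribution for every sample.
proba_most_frequent = most_frequent.predict_proba(X)
proba_prior = prior.predict_proba(X)
log_proba_prior = prior.predict_log_proba(X)

print(proba_most_frequent[0])
print(proba_prior[0])
print(log_proba_prior[0])
```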

score(self, X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X : {array-like, None}: Test samples with shape = (n_samples, n_features) or None. Passing None as test samples gives the same result as passing real test samples, since DummyClassifier operates independently of the sampled observations.
y : array-like, shape = (n_samples,) or (n_samples, n_outputs): True labels for X.
sample_weight : array-like, shape = [n_samples], optional: Sample weights.

Returns:

score : float: Mean accuracy of self.predict(X) with respect to y.
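Because the predictions do not depend on the features, passing None for X should give the same accuracy as passing real test samples. The sketch below checks this on invented train and test labels.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Made-up training labels: class 0 is the majority class.
X_train = np.zeros((4, 1))
y_train = np.array([0, 0, 0, 1])

# Made-up test labels: three of the five samples belong to class 0.
X_test = np.ones((5, 1))
y_test = np.array([0, 1, 0, 1, 0])

clf = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# The classifier always predicts 0, so accuracy is 3 / 5 in both cases.
score_with_X = clf.score(X_test, y_test)
score_with_none = clf.score(None, y_test)
print(score_with_X, score_with_none)
```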

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:	self
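A minimal sketch of both usages follows: setting parameters directly on the estimator, and through a pipeline with the <component>__<parameter> form. The step names "scale" and "dummy" and the choice of StandardScaler are arbitrary and only serve the illustration.

```python
from sklearn.dummy import DummyClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: set_params updates the parameters and returns the estimator itself.
clf = DummyClassifier(strategy="most_frequent")
returned = clf.set_params(strategy="constant", constant=1)
print(clf.get_params())

# Nested object: address the step named "dummy" with the double-underscore form.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("dummy", DummyClassifier(strategy="most_frequent")),
])
pipe.set_params(dummy__strategy="uniform", dummy__random_state=0)
print(pipe.get_params()["dummy__strategy"])
```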
