sklearn.multiclass.OneVsOneClassifier

class sklearn.multiclass.OneVsOneClassifier(estimator, n_jobs=1)[source]

One-vs-one multiclass strategy

This strategy fits one classifier per pair of classes. At prediction time, the class that receives the most votes is selected. Because it requires fitting n_classes * (n_classes - 1) / 2 classifiers, this method is usually slower than one-vs-the-rest, owing to its O(n_classes^2) complexity. However, it may be advantageous for algorithms, such as kernel algorithms, that do not scale well with n_samples: each individual learning problem involves only a small subset of the data, whereas one-vs-the-rest uses the complete dataset n_classes times.
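The following is a minimal sketch of the strategy, assuming a recent scikit-learn installation. The iris dataset and the SVC base estimator are illustrative choices and are not prescribed by this page. With three classes, one classifier is fitted for each of the three class pairs.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

# Illustrative data: iris has 150 samples and 3 classes.
X, y = load_iris(return_X_y=True)

# Wrap a binary-capable estimator; one copy is fitted per class pair.
ovo = OneVsOneClassifier(SVC(kernel="linear"))
ovo.fit(X, y)

n_classes = len(ovo.classes_)
n_pairs = n_classes * (n_classes - 1) // 2
print(len(ovo.estimators_), n_pairs)
```

A kernel estimator is used here only because the paragraph above names kernel algorithms as a case where the pairwise subsets can help; any estimator meeting the requirements listed under Parameters should work the same way.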

Parameters:

estimator : estimator object

An estimator object implementing fit and one of decision_function or predict_proba.

n_jobs : int, optional, default: 1

The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
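The n_jobs rule above can be written out as a small function. This is only a sketch: workers_for is a hypothetical helper introduced here for illustration and is not part of scikit-learn. Clamping the result to at least 1 and rejecting 0 are assumptions that go beyond the text above.

```python
import os


def workers_for(n_jobs, n_cpus=None):
    """Hypothetical helper: number of workers implied by an n_jobs value."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        # Assumption: zero workers is treated as an invalid request.
        raise ValueError("n_jobs must not be 0")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all CPUs but one, and so on.
        # Assumption: never go below a single worker.
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are used as given.
    return n_jobs


for value in (1, -1, -2):
    print(value, workers_for(value, n_cpus=8))
```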

Attributes:

estimators_ : list of n_classes * (n_classes - 1) / 2 estimators

Estimators used for predictions.

classes_ : numpy array of shape [n_classes]

Array containing labels.
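As a rough illustration of how the length of estimators_ relates to classes_, the class pairs can be enumerated with the standard library. The label names below are made up for the example, and the enumeration order shown is not guaranteed to match the internal order of estimators_.

```python
from itertools import combinations

# Hypothetical class labels, standing in for the contents of classes_.
classes = ["a", "b", "c", "d"]

# Every unordered pair of distinct classes gets its own binary estimator.
pairs = list(combinations(classes, 2))

n_classes = len(classes)
expected = n_classes * (n_classes - 1) // 2
print(len(pairs), expected)
```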

Methods

decision_function(X)    Decision function for the OneVsOneClassifier.
fit(X, y)               Fit underlying estimators.
predict(X)              Estimate the best class label for each sample in X.
__init__(estimator, n_jobs=1)[source]
decision_function(X)[source]

Decision function for the OneVsOneClassifier.

The decision values for the samples are computed by adding the normalized sum of pairwise classification confidence levels to the votes. This disambiguates between decision values when the votes for the classes are equal, which would otherwise lead to a tie.

Parameters:

X : array-like, shape = [n_samples, n_features]

Returns:

Y : array-like, shape = [n_samples, n_classes]
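A short sketch of the returned shape, assuming a recent scikit-learn and using the three-class iris dataset with an SVC base estimator as illustrative choices. Each row holds one decision value per class.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

# One row per sample, one column per class (3 classes in iris).
scores = ovo.decision_function(X)
print(scores.shape)
```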
fit(X, y)[source]

Fit underlying estimators.

Parameters:

X : (sparse) array-like, shape = [n_samples, n_features]

Data.

y : array-like, shape = [n_samples]

Multi-class targets.

Returns:

self :

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.
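The effect of deep can be seen in a small sketch, assuming a recent scikit-learn; the SVC base estimator and its C value are illustrative. With deep=True, parameters of the wrapped estimator appear under names prefixed with estimator__.

```python
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

ovo = OneVsOneClassifier(SVC(C=2.0))

# deep=True also lists the parameters of the wrapped estimator.
deep_params = ovo.get_params(deep=True)

# deep=False lists only the parameters of the wrapper itself.
shallow_params = ovo.get_params(deep=False)

print(deep_params["estimator__C"])
print(sorted(shallow_params))
```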

predict(X)[source]

Estimate the best class label for each sample in X.

This is implemented as argmax(decision_function(X), axis=1), which returns the label of the class with the most votes from the estimators, each of which predicts the outcome of the decision for one possible class pair.

Parameters:

X : (sparse) array-like, shape = [n_samples, n_features]

Data.

Returns:

y : numpy array of shape [n_samples]

Predicted multi-class targets.
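The relationship between predict and decision_function can be checked directly. This is a sketch assuming a recent scikit-learn and a problem with more than two classes; the dataset and base estimator are illustrative. The argmax yields column indices, which are mapped back to labels through classes_.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

# Column index of the largest decision value for each sample.
best_column = np.argmax(ovo.decision_function(X), axis=1)

# Map column indices back to the original class labels.
manual_labels = ovo.classes_[best_column]

print(np.array_equal(manual_labels, ovo.predict(X)))
```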

score(X, y, sample_weight=None)[source]

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric because it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) with respect to y.
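As a sketch of what the returned value means, the score can be compared with the fraction of correct predictions computed by hand. This assumes a recent scikit-learn that provides sklearn.model_selection; the dataset, the split, and the base estimator are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X_train, y_train)

# Mean accuracy as reported by score().
accuracy = ovo.score(X_test, y_test)

# The same quantity computed by hand from the predictions.
manual = float(np.mean(ovo.predict(X_test) == y_test))

print(accuracy, manual)
```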

set_params(**params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:

self :
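A brief sketch of the <component>__<parameter> form applied to this class, assuming a recent scikit-learn. Here the component is the estimator argument, and the SVC base estimator with its C parameter is an illustrative choice.

```python
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

ovo = OneVsOneClassifier(SVC(C=1.0))

# "estimator__C" reaches into the wrapped SVC and updates its C parameter;
# "n_jobs" is a parameter of the wrapper itself.
returned = ovo.set_params(estimator__C=10.0, n_jobs=1)

print(ovo.estimator.C, ovo.n_jobs)
print(returned is ovo)
```

Because set_params returns the estimator itself, calls can be chained, for example with fit.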