sklearn.cross_validation.permutation_test_score

Warning

DEPRECATED

sklearn.cross_validation.permutation_test_score(estimator, X, y, cv=None, n_permutations=100, n_jobs=1, labels=None, random_state=0, verbose=0, scoring=None)[source]

Evaluate the significance of a cross-validated score with permutations

Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.permutation_test_score instead.

Read more in the User Guide.

Parameters:

estimator : estimator object implementing ‘fit’

The object to use to fit the data.

X : array-like of shape at least 2D

The data to fit.

y : array-like

The target variable to try to predict in the case of supervised learning.

scoring : string, callable or None, optional, default: None

A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y).

cv : int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

  • None, to use the default 3-fold cross-validation,
  • integer, to specify the number of folds.
  • An object to be used as a cross-validation generator.
  • An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.

Refer User Guide for the various cross-validation strategies that can be used here.

n_permutations : integer, optional

Number of times to permute y.

n_jobs : integer, optional

The number of CPUs to use to do the computation. -1 means ‘all CPUs’.

labels : array-like of shape [n_samples] (optional)

Labels constrain the permutation among groups of samples with a same label.

random_state : int, RandomState instance or None, optional (default=0)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

verbose : integer, optional

The verbosity level.

Returns:

score : float

The true score without permuting targets.

permutation_scores : array, shape (n_permutations,)

The scores obtained for each permutations.

pvalue : float

The p-value, which approximates the probability that the score would be obtained by chance. This is calculated as:

(C + 1) / (n_permutations + 1)

Where C is the number of permutations whose score >= the true score.

The best possible p-value is 1/(n_permutations + 1), the worst is 1.0.

Notes

This function implements Test 1 in:

Ojala and Garriga. Permutation Tests for Studying Classifier Performance. The Journal of Machine Learning Research (2010) vol. 11