matthews_corrcoef#

sklearn.metrics.matthews_corrcoef(y_true, y_pred, *, sample_weight=None)[source]#

Compute the Matthews correlation coefficient (MCC).

The Matthews correlation coefficient is used in machine learning as a measure of the quality of binary and multiclass classifications. It takes into account true and false positives and negatives and is generally regarded as a balanced measure which can be used even if the classes are of very different sizes. The MCC is in essence a correlation coefficient value between -1 and +1. A coefficient of +1 represents a perfect prediction, 0 an average random prediction and -1 an inverse prediction. The statistic is also known as the phi coefficient. [source: Wikipedia]

Binary and multiclass labels are supported. Only in the binary case does this relate to information about true and false positives and negatives. See references below.

Read more in the User Guide.

Parameters:

y_truearray-like of shape (n_samples,): Ground truth (correct) target values.
y_predarray-like of shape (n_samples,): Estimated targets as returned by a classifier.
sample_weightarray-like of shape (n_samples,), default=None: Sample weights.

Added in version 0.18.

Returns:

mccfloat: The Matthews correlation coefficient (+1 represents a perfect prediction, 0 an average random prediction and -1 and inverse prediction).

References

[1]

Baldi, Brunak, Chauvin, Andersen and Nielsen, (2000). Assessing the accuracy of prediction algorithms for classification: an overview.

[2]

Wikipedia entry for the Matthews Correlation Coefficient (phi coefficient).

[3]

Gorodkin, (2004). Comparing two K-category assignments by a K-category correlation coefficient.

[4]

Jurman, Riccadonna, Furlanello, (2012). A Comparison of MCC and CEN Error Measures in MultiClass Prediction.

Examples

>>> from sklearn.metrics import matthews_corrcoef
>>> y_true = [+1, +1, +1, -1]
>>> y_pred = [+1, -1, +1, +1]
>>> matthews_corrcoef(y_true, y_pred)
-0.33