sklearn.metrics.cohen_kappa_score

sklearn.metrics.cohen_kappa_score(y1, y2, labels=None, weights=None, sample_weight=None)
Cohen’s kappa: a statistic that measures inter-annotator agreement.
This function computes Cohen’s kappa [R236], a score that expresses the level of agreement between two annotators on a classification problem. It is defined as

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where $p_o$ is the empirical probability of agreement on the label assigned to any sample (the observed agreement ratio), and $p_e$ is the expected agreement when both annotators assign labels randomly. $p_e$ is estimated using a per-annotator empirical prior over the class labels [R237].
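The definition can be sketched in plain Python, assuming unweighted samples and no label filtering. The helper name `cohen_kappa` is hypothetical and is not part of scikit-learn; it is only meant to show how the observed and expected agreement combine.

```python
from collections import Counter


def cohen_kappa(y1, y2):
    """Unweighted Cohen's kappa for two equally long label sequences.

    Hypothetical helper for illustration; not the library implementation.
    """
    if len(y1) != len(y2) or len(y1) == 0:
        raise ValueError("y1 and y2 must be non-empty and of equal length")
    n = len(y1)

    # Observed agreement: fraction of samples given the same label by both.
    p_o = sum(a == b for a, b in zip(y1, y2)) / n

    # Expected agreement: each annotator draws labels independently from
    # their own empirical label distribution.
    counts1, counts2 = Counter(y1), Counter(y2)
    p_e = sum(counts1[label] * counts2[label] for label in counts1) / (n * n)

    # Undefined (division by zero) when p_e == 1, i.e. both annotators
    # always use the same single label.
    return (p_o - p_e) / (1 - p_e)


annotator_1 = [2, 0, 2, 2, 0, 1]
annotator_2 = [0, 0, 2, 2, 0, 2]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 4))
```

Here the annotators agree on 4 of 6 samples (observed agreement 4/6) and the expected agreement is 15/36, so kappa is 3/7, roughly 0.43.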
Read more in the User Guide.
Parameters:  y1 : array, shape = [n_samples]
Labels assigned by the first annotator.
 y2 : array, shape = [n_samples]
Labels assigned by the second annotator. The kappa statistic is symmetric, so swapping y1 and y2 doesn’t change the value.
 labels : array, shape = [n_classes], optional
List of labels to index the matrix. This may be used to select a subset of labels. If None, all labels that appear at least once in y1 or y2 are used.
 weights : str, optional
Weighting type used to calculate the score. None means no weighting; “linear” means linear weighting; “quadratic” means quadratic weighting.
 sample_weight : array-like, shape = [n_samples], optional
Sample weights.
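To make the three weighting options concrete, here is a minimal pure-Python sketch of weighted kappa. It assumes the common formulation in which a disagreement between classes at positions i and j of the sorted label list costs 1 (unweighted), |i - j| (linear) or (i - j)^2 (quadratic). The helper `weighted_kappa` is hypothetical, and treating this as the exact computation performed by the library is an assumption.

```python
def weighted_kappa(y1, y2, weights=None):
    """Weighted Cohen's kappa (hypothetical helper, illustration only)."""
    if len(y1) != len(y2) or len(y1) == 0:
        raise ValueError("y1 and y2 must be non-empty and of equal length")

    # Assumption: classes are ordered by sorting the labels seen in y1 or y2.
    labels = sorted(set(y1) | set(y2))
    index = {label: i for i, label in enumerate(labels)}
    k, n = len(labels), len(y1)

    # Confusion matrix: observed[i][j] counts samples labelled i by the
    # first annotator and j by the second.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(y1, y2):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(row[j] for row in observed) for j in range(k)]

    def penalty(i, j):
        # Cost of one disagreement between class positions i and j.
        if weights is None:
            return 0 if i == j else 1
        if weights == "linear":
            return abs(i - j)
        if weights == "quadratic":
            return (i - j) ** 2
        raise ValueError("weights must be None, 'linear' or 'quadratic'")

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_cost = sum(penalty(i, j) * observed[i][j] for i, j in cells)
    # Expected counts if the two annotators labelled independently.
    expected_cost = sum(
        penalty(i, j) * row_totals[i] * col_totals[j] / n for i, j in cells
    )
    # Division by zero occurs if only one class is present.
    return 1 - observed_cost / expected_cost


y1 = [2, 0, 2, 2, 0, 1]
y2 = [0, 0, 2, 2, 0, 2]
unweighted = weighted_kappa(y1, y2)
linear = weighted_kappa(y1, y2, weights="linear")
quadratic = weighted_kappa(y1, y2, weights="quadratic")
```

Under this sketch, the weighted variants only make sense when the class labels have a meaningful order, since the penalty grows with the distance between class positions.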
Returns:  kappa : float
The kappa statistic, which is a number between -1 and 1. The maximum value means complete agreement; zero or lower means chance agreement.
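A short usage sketch, assuming scikit-learn is installed; the label data is made up for illustration. Optional arguments are passed by keyword.

```python
from sklearn.metrics import cohen_kappa_score

# Made-up labels from two annotators over six samples.
rater_a = [2, 0, 2, 2, 0, 1]
rater_b = [0, 0, 2, 2, 0, 2]

kappa = cohen_kappa_score(rater_a, rater_b)

# Swapping the annotators gives the same score.
kappa_swapped = cohen_kappa_score(rater_b, rater_a)

# Quadratic weighting penalises distant disagreements more heavily.
kappa_quadratic = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# Extremes of the range: identical labels and systematic disagreement.
perfect = cohen_kappa_score(rater_a, rater_a)
opposite = cohen_kappa_score([0, 1, 0, 1], [1, 0, 1, 0])

print(kappa, kappa_quadratic, perfect, opposite)
```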
References
[R236] J. Cohen (1960). “A coefficient of agreement for nominal scales”. Educational and Psychological Measurement 20(1):37-46. doi:10.1177/001316446002000104.
[R237] R. Artstein and M. Poesio (2008). “Inter-coder agreement for computational linguistics”. Computational Linguistics 34(4):555-596.
[R238] Wikipedia entry for Cohen’s kappa.