sklearn.metrics.cohen_kappa_score

sklearn.metrics.cohen_kappa_score(y1, y2, labels=None, weights=None, sample_weight=None)
Cohen’s kappa: a statistic that measures inter-annotator agreement.
This function computes Cohen’s kappa [1], a score that expresses the level of agreement between two annotators on a classification problem. It is defined as
\[\kappa = (p_o - p_e) / (1 - p_e)\]
where \(p_o\) is the empirical probability of agreement on the label assigned to any sample (the observed agreement ratio), and \(p_e\) is the expected agreement when both annotators assign labels randomly. \(p_e\) is estimated using a per-annotator empirical prior over the class labels [2].
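The definition above can be sketched in a few lines of plain Python. This is a minimal illustration of the unweighted formula, not the library's implementation; the helper name `kappa_from_definition` and the sample labels are made up for this example.

```python
from collections import Counter


def kappa_from_definition(y1, y2):
    """Unweighted Cohen's kappa computed directly from p_o and p_e.

    Hypothetical helper for illustration only. The result is undefined
    (division by zero) when p_e equals 1, i.e. when both annotators
    always use the same single label.
    """
    if len(y1) != len(y2) or len(y1) == 0:
        raise ValueError("y1 and y2 must be non-empty and of equal length")
    n = len(y1)

    # p_o: fraction of samples on which the two annotators agree.
    p_o = sum(a == b for a, b in zip(y1, y2)) / n

    # p_e: chance agreement, using each annotator's own label frequencies.
    counts1 = Counter(y1)
    counts2 = Counter(y2)
    all_labels = set(counts1) | set(counts2)
    p_e = sum((counts1[lab] / n) * (counts2[lab] / n) for lab in all_labels)

    return (p_o - p_e) / (1 - p_e)


# Made-up annotations: the annotators agree on 8 of 10 samples (p_o = 0.8),
# and each uses "yes" and "no" equally often (p_e = 0.5).
y1 = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
y2 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "yes", "no"]
kappa = kappa_from_definition(y1, y2)  # approximately 0.6
```

With these inputs, \((0.8 - 0.5) / (1 - 0.5)\) gives a kappa of about 0.6, up to floating-point rounding.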
Read more in the User Guide.
Parameters:
y1 : array, shape = [n_samples]
Labels assigned by the first annotator.
y2 : array, shape = [n_samples]
Labels assigned by the second annotator. The kappa statistic is symmetric, so swapping y1 and y2 doesn’t change the value.
labels : array, shape = [n_classes], optional
List of labels to index the matrix. This may be used to select a subset of labels. If None, all labels that appear at least once in y1 or y2 are used.
weights : str, optional
Weighting type used to calculate the score. None means unweighted; “linear” means linearly weighted; “quadratic” means quadratically weighted.
sample_weight : array-like of shape = [n_samples], optional
Sample weights.
Returns:
kappa : float
The kappa statistic, which is a number between -1 and 1. The maximum value means complete agreement; zero or lower means chance agreement.
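A short usage sketch, assuming scikit-learn is installed, may help tie the parameters and return value together. The rating data below is invented for illustration, and the optional arguments are passed by keyword.

```python
from sklearn.metrics import cohen_kappa_score

# Made-up ordinal ratings (0, 1, 2) from two annotators on eight samples.
rater_a = [0, 1, 2, 2, 1, 0, 2, 1]
rater_b = [0, 1, 2, 1, 1, 0, 2, 2]

# Default: unweighted kappa, every disagreement counts the same.
unweighted = cohen_kappa_score(rater_a, rater_b)

# Weighted variants penalize disagreements by the distance between labels,
# which is usually only meaningful when the labels are ordered.
linear = cohen_kappa_score(rater_a, rater_b, weights="linear")
quadratic = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# The statistic is symmetric in its first two arguments.
swapped = cohen_kappa_score(rater_b, rater_a)

# Identical annotations give the maximum value.
perfect = cohen_kappa_score(rater_a, rater_a)

print(unweighted, linear, quadratic, swapped, perfect)
```

Here the annotators agree on 6 of 8 samples and the chance agreement is 22/64, so the unweighted score works out to roughly 0.62. `swapped` should match `unweighted`, and `perfect` should be 1.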
References
[1] J. Cohen (1960). “A coefficient of agreement for nominal scales”. Educational and Psychological Measurement 20(1):37-46. doi:10.1177/001316446002000104.
[2] R. Artstein and M. Poesio (2008). “Inter-coder agreement for computational linguistics”. Computational Linguistics 34(4):555-596.
[3] Wikipedia entry for Cohen’s kappa.