

sklearn.metrics.fbeta_score(y_true, y_pred, beta, labels=None, pos_label=1, average='weighted', sample_weight=None)

Compute the F-beta score

The F-beta score is the weighted harmonic mean of precision and recall, reaching its optimal value at 1 and its worst value at 0.
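The definition above corresponds to the usual closed form. The symbols P (precision) and R (recall) are introduced here for illustration only and do not appear in the function signature:

```latex
F_\beta = (1 + \beta^2) \cdot \frac{P \cdot R}{\beta^2 \cdot P + R}
```

With beta = 1 this reduces to the F1 score, 2PR / (P + R).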

The beta parameter determines the weight of recall relative to precision in the combined score. beta < 1 lends more weight to precision, while beta > 1 favors recall (as beta -> 0 the score considers only precision; as beta -> inf, only recall).
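A minimal sketch of how beta shifts the score between precision and recall. The helper `f_beta` is a hypothetical hand-written function, not part of scikit-learn, and returning 0.0 when both inputs are zero is a convention assumed here:

```python
def f_beta(precision, recall, beta):
    """Weighted harmonic mean of precision and recall (illustrative helper)."""
    if precision == 0 and recall == 0:
        return 0.0  # assumed convention for the degenerate case
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# A classifier with high precision but low recall.
precision, recall = 0.9, 0.3

for beta in (0.1, 0.5, 1.0, 2.0, 10.0):
    print(f"beta={beta:>4}: F-beta={f_beta(precision, recall, beta):.3f}")
```

As beta grows, the printed score moves from close to the precision (0.9) down toward the recall (0.3).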


y_true : array-like or label indicator matrix

Ground truth (correct) target values.

y_pred : array-like or label indicator matrix

Estimated targets as returned by a classifier.

beta : float

Weight of recall relative to precision in the harmonic mean. Values below 1 favor precision; values above 1 favor recall.

labels : array

Integer array of labels.

pos_label : str or int, 1 by default

If average is not None and the classification target is binary, only this class’s scores will be returned.

average : string, [None, ‘micro’, ‘macro’, ‘samples’, ‘weighted’ (default)]

If None, the scores for each class are returned. Otherwise, unless pos_label is given in binary classification, this determines the type of averaging performed on the data:


‘micro’:
Calculate metrics globally by counting the total true positives, false negatives and false positives.

‘macro’:
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

‘weighted’:
Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

‘samples’:
Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
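The ‘micro’, ‘macro’ and ‘weighted’ options can be sketched by hand for single-label multiclass data. The helpers `f_beta_from_counts` and `averaged_f_beta` are hypothetical names written for this illustration, not scikit-learn functions, and scoring 0.0 when a label has no true positives, false positives or false negatives is an assumed convention:

```python
from collections import Counter

def f_beta_from_counts(tp, fp, fn, beta):
    """F-beta expressed in terms of counts; 0.0 when undefined (assumed)."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

def averaged_f_beta(y_true, y_pred, beta):
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[t] += 1  # true label t was missed
    per_label = [f_beta_from_counts(tp[lab], fp[lab], fn[lab], beta) for lab in labels]
    support = [tp[lab] + fn[lab] for lab in labels]  # true instances per label
    # micro: pool the counts over all labels, then compute one score
    micro = f_beta_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()), beta)
    # macro: unweighted mean of the per-label scores
    macro = sum(per_label) / len(labels)
    # weighted: mean of the per-label scores, weighted by support
    weighted = sum(f * s for f, s in zip(per_label, support)) / sum(support)
    return {"per_label": per_label, "micro": micro, "macro": macro, "weighted": weighted}

scores = averaged_f_beta([0, 1, 2, 0, 1, 2], [0, 2, 1, 0, 0, 1], beta=0.5)
print(scores)
```

In this data every label has the same support (2), so the ‘macro’ and ‘weighted’ values coincide; they diverge once the classes are imbalanced.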

sample_weight : array-like of shape = [n_samples], optional

Sample weights.
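A rough sketch of what sample weights do in the binary case, under the assumption that each sample contributes its weight (instead of 1) to the true positive, false positive and false negative counts. `weighted_f_beta` is a hypothetical helper for illustration, not the library's implementation:

```python
def weighted_f_beta(y_true, y_pred, beta, sample_weight=None, pos_label=1):
    """F-beta of pos_label with per-sample weights (illustrative helper)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    tp = fp = fn = 0.0
    for t, p, w in zip(y_true, y_pred, sample_weight):
        if p == pos_label and t == pos_label:
            tp += w  # correctly predicted positive
        elif p == pos_label:
            fp += w  # predicted positive, actually negative
        elif t == pos_label:
            fn += w  # missed positive
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]

unweighted = weighted_f_beta(y_true, y_pred, beta=1.0)
# Triple the weight of the second sample, a missed positive.
upweighted = weighted_f_beta(y_true, y_pred, beta=1.0, sample_weight=[1, 3, 1, 1])
print(unweighted, upweighted)
```

Giving more weight to a missed positive lowers recall, and with it the score.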


fbeta_score : float (if average is not None) or array of float, shape = [n_unique_labels]

F-beta score of the positive class in binary classification or weighted average of the F-beta score of each class for the multiclass task.


[R156] R. Baeza-Yates and B. Ribeiro-Neto (2011). Modern Information Retrieval. Addison Wesley, pp. 327-328.
[R157] Wikipedia entry for the F1-score.


>>> from sklearn.metrics import fbeta_score
>>> y_true = [0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 2, 1, 0, 0, 1]
>>> fbeta_score(y_true, y_pred, average='macro', beta=0.5)
0.23...
>>> fbeta_score(y_true, y_pred, average='micro', beta=0.5)
0.33...
>>> fbeta_score(y_true, y_pred, average='weighted', beta=0.5)
0.23...
>>> fbeta_score(y_true, y_pred, average=None, beta=0.5)
array([ 0.71...,  0.        ,  0.        ])