ScoringMonitor

class sklearn.callback.ScoringMonitor(*, scoring)

Callback that monitors a score for each iterative step of an estimator.

The specified scorer is called on the training data at each iterative step of the estimator, and the score is logged by the callback. The logs can be retrieved through the get_logs method.
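The monitoring pattern can be illustrated without scikit-learn. The sketch below is a plain-Python stand-in, not the library's implementation: ToyMonitor, on_step, neg_mse and fit_mean are hypothetical names, and the "estimator" is a one-parameter gradient descent that estimates a mean. It only shows the idea of calling a scorer on the training data after each iteration and keeping the results for later retrieval.

```python
# Hypothetical stand-in: illustrates per-iteration score logging only.
# None of these names belong to scikit-learn.
class ToyMonitor:
    def __init__(self, scoring):
        self.scoring = scoring  # callable(param, data) -> float
        self._scores = []

    def on_step(self, iteration, param, data):
        # Score the current state of the model on the training data.
        self._scores.append((iteration, self.scoring(param, data)))

    def get_logs(self):
        # Return a copy so callers cannot mutate the internal log.
        return list(self._scores)


def neg_mse(param, data):
    # Negated mean squared error, so that higher means better.
    return -sum((x - param) ** 2 for x in data) / len(data)


def fit_mean(data, monitor, n_iter=5, lr=0.25):
    param = 0.0
    for i in range(n_iter):
        # Gradient of the mean squared error with respect to param.
        grad = -2 * sum(x - param for x in data) / len(data)
        param -= lr * grad
        monitor.on_step(i, param, data)  # one log entry per iteration
    return param


monitor = ToyMonitor(scoring=neg_mse)
estimate = fit_mean([1.0, 2.0, 3.0], monitor)
logs = monitor.get_logs()  # list of (iteration, score) pairs
```

In this toy setting the logged score improves at every iteration, which is the kind of convergence trace such a monitor is meant to expose.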

Parameters:

scoring : str, callable, list, tuple, dict or None

The scoring method to use to monitor the model.

If scoring represents a single score, one can use:

a single string (see String name scorers);
a callable (see Callable scorers) that returns a single value;
None, in which case the estimator’s default evaluation criterion is used.

If scoring represents multiple scores, one can use:

a list or tuple of unique strings;
a callable returning a dictionary where the keys are the metric names and the values are the metric scores;
a dictionary with metric names as keys and callables as values.
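The accepted forms can be written out as plain Python values. In the sketch below, the callables follow scikit-learn's scorer(estimator, X, y) convention; ConstantClassifier, train_accuracy, train_error_rate and accuracy_and_error are hypothetical names introduced for illustration, and nothing is passed to ScoringMonitor itself.

```python
class ConstantClassifier:
    """Hypothetical toy estimator that always predicts one label."""

    def __init__(self, label):
        self.label = label

    def predict(self, X):
        return [self.label for _ in X]


def train_accuracy(estimator, X, y):
    # Callable scorer returning a single value.
    predictions = estimator.predict(X)
    return sum(p == t for p, t in zip(predictions, y)) / len(y)


def train_error_rate(estimator, X, y):
    return 1.0 - train_accuracy(estimator, X, y)


def accuracy_and_error(estimator, X, y):
    # Callable scorer returning a dict of metric name -> score.
    acc = train_accuracy(estimator, X, y)
    return {"accuracy": acc, "error_rate": 1.0 - acc}


# Forms representing a single score.
single_string = "accuracy"
single_callable = train_accuracy
estimator_default = None

# Forms representing multiple scores.
several_strings = ["accuracy", "neg_log_loss"]  # names must be unique
dict_returning_callable = accuracy_and_error
dict_of_callables = {"accuracy": train_accuracy, "error_rate": train_error_rate}

# Calling the scorers by hand on a tiny training set.
X, y = [[0], [1], [2], [3]], [1, 1, 0, 1]
clf = ConstantClassifier(label=1)
single_value = single_callable(clf, X, y)
multi_values = dict_returning_callable(clf, X, y)
```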

get_logs(select='most_recent', include_lineage=False)

Retrieve the logged scores.

Log entries are grouped by runs, which are the outermost enclosing fit calls. If the estimator this callback is registered on is wrapped in meta-estimators, a run corresponds to one fit of the outermost meta-estimator. If it is not wrapped in a meta-estimator, a run simply corresponds to a single fit of the estimator.

For a given run, the scores are logged in a ScoringMonitorLog object containing:

run_id: a unique identifier for the run;
estimator_name: the name of the (meta-)estimator of the run;
timestamp: the timestamp of the start of the run;
data: the recorded scores for the run. Each score value is associated with the context of the task for which the score was computed;
data_as_pandas: the recorded scores as a pandas DataFrame.

See ScoringMonitorLog for more details about the structure of the recorded scores.

Parameters:

select : {"all", "most_recent"}, default="most_recent"

Which log run to return:

"all": return the logged scores for all runs;
"most_recent": return the logged scores for the most recent run.

include_lineage : bool, default=False

Whether to include lineage information of the tasks in the log.

If set to True, the log contains an extra row for each task that is an ancestor of a task for which the score was computed. These extra rows can be used to retrieve the context of all ancestor tasks of a scored task. They carry no score: there are no score entries for them in data, and their scores are NaN in data_as_pandas.

Returns:

logs : ScoringMonitorLog or list of ScoringMonitorLog

The logged scores. If select == "most_recent", a single ScoringMonitorLog object is returned. Otherwise, the list of all run logs is returned.
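Because the return type depends on select, code that processes logs may want to normalize it first. The sketch below does not call scikit-learn: FakeLog is a hypothetical dataclass that only mirrors the documented run_id and estimator_name fields (string run identifiers are an assumption), and as_log_list is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class FakeLog:
    """Hypothetical stand-in mirroring two documented log fields."""

    run_id: str  # assumed to be a string here; the real type may differ
    estimator_name: str


def as_log_list(logs):
    """Return a list whether `logs` is one log object or a list of them."""
    return list(logs) if isinstance(logs, (list, tuple)) else [logs]


# Shape returned for select="most_recent": a single log object.
most_recent = FakeLog(run_id="run-2", estimator_name="LogisticRegression")
# Shape returned for select="all": a list with one log per run.
all_runs = [FakeLog("run-1", "LogisticRegression"), most_recent]

ids_recent = [log.run_id for log in as_log_list(most_recent)]
ids_all = [log.run_id for log in as_log_list(all_runs)]
```

With this normalization, downstream code can iterate over runs in the same way for both values of select.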

Gallery examples

Analysis of the convergence of penalized logistic regression models

Release Highlights for scikit-learn 1.9