# Version 1.2.2

In development

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

# Version 1.2.1

January 2023

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

• Fix The fitted components in MiniBatchDictionaryLearning might differ. The online updates of the sufficient statistics now properly take the sizes of the batches into account. #25354 by Jérémie du Boisberranger.

• Fix The categories_ attribute of preprocessing.OneHotEncoder now always contains an array of objects when using predefined categories that are strings. Predefined categories encoded as bytes will no longer work with X encoded as strings. #25174 by Tim Head.

## Changes impacting all modules

• Fix Support pandas.Int64 dtyped y for classifiers and regressors. #25089 by Tim Head.

• Fix Remove spurious warnings for estimators internally using neighbors search methods. #25129 by Julien Jerphanion.

• Fix Fix a bug where the current configuration was ignored in estimators using n_jobs > 1. This bug was triggered for tasks dispatched by the auxiliary thread of joblib, because sklearn.get_config used to access an empty thread-local configuration instead of the configuration visible from the thread where joblib.Parallel was first called. #25363 by Guillaume Lemaitre.
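
The failure mode described in the last entry can be sketched with the standard library alone. This is not scikit-learn's or joblib's actual code: `get_config`, `set_config`, `run_in_thread`, and `with_config` below are hypothetical stand-ins. They only show why a thread-local configuration looks empty from a helper thread, and how capturing it in the calling thread avoids that.

```python
import threading

# Hypothetical stand-in for a thread-local configuration store.
_DEFAULTS = {"assume_finite": False}
_local = threading.local()


def get_config():
    # Each thread lazily receives its own copy of the defaults.
    if not hasattr(_local, "config"):
        _local.config = dict(_DEFAULTS)
    return dict(_local.config)


def set_config(**options):
    config = get_config()
    config.update(options)
    _local.config = config


def run_in_thread(task):
    # Run `task` in a fresh thread and return its result.
    results = []
    worker = threading.Thread(target=lambda: results.append(task()))
    worker.start()
    worker.join()
    return results[0]


def with_config(config, task):
    # Wrap `task` so that it installs the caller's configuration first.
    def wrapped():
        set_config(**config)
        return task()

    return wrapped


set_config(assume_finite=True)  # configured in the calling thread

# A new thread starts from the defaults, so the caller's setting is lost.
seen_without_propagation = run_in_thread(get_config)

# Capturing the configuration in the caller and re-applying it in the worker
# keeps the setting visible.
caller_config = get_config()
seen_with_propagation = run_in_thread(with_config(caller_config, get_config))

print(seen_without_propagation)  # {'assume_finite': False}
print(seen_with_propagation)  # {'assume_finite': True}
```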

# Version 1.2.0

December 2022

For a short description of the main highlights of the release, please refer to Release Highlights for scikit-learn 1.2.

## Legend for changelogs

• Major Feature : something big that you couldn’t do before.

• Feature : something that you couldn’t do before.

• Efficiency : an existing feature now may not require as much computation or memory.

• Enhancement : a miscellaneous minor improvement.

• Fix : something that previously didn’t work as documented – or according to reasonable expectations – should now work.

• API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

## Changes impacting all modules

• Major Feature The set_output API has been adopted by all transformers. Meta-estimators that contain transformers such as pipeline.Pipeline or compose.ColumnTransformer also define a set_output. For details, see SLEP018. #23734 and #24699 by Thomas Fan.

• Efficiency Low-level routines for reductions on pairwise distances for dense float32 datasets have been refactored. The following functions and estimators now benefit from improved performance in terms of hardware scalability and speed-ups:

For instance, sklearn.neighbors.NearestNeighbors.kneighbors and sklearn.neighbors.NearestNeighbors.radius_neighbors can be up to 20× and 5× faster, respectively, than before on a laptop.

Moreover, the implementations of those two algorithms are now suitable for machines with many cores, making them usable for datasets consisting of millions of samples.

• Enhancement Finiteness checks (detection of NaN and infinite values) in all estimators are now significantly more efficient for float32 data by leveraging NumPy’s SIMD-optimized primitives. #23446 by Meekail Zain.

• Enhancement Finiteness checks (detection of NaN and infinite values) in all estimators are now faster by utilizing a more efficient stop-on-first second-pass algorithm. #23197 by Meekail Zain.

• Enhancement Support for combinations of dense and sparse dataset pairs, for all distance metrics and for both float32 and float64 datasets, has been added or has seen its performance improved for the following estimators:

• Fix Systematically check the sha256 digest of dataset tarballs used in code examples in the documentation. #24617 by Olivier Grisel and Thomas Fan. Thanks to Sim4n6 for the report.
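
The "stop-on-first second-pass" idea mentioned in the finiteness entries above can be sketched in pure Python. This is an assumption about how such a check can be structured, not scikit-learn's implementation; `first_non_finite` is a hypothetical helper name.

```python
import math


def first_non_finite(values):
    """Return the index of the first NaN or infinity, or None if all are finite."""
    # First pass: a single cheap reduction. If the sum is finite, every
    # element must be finite, so no per-element scan is needed.
    if math.isfinite(sum(values)):
        return None
    # Second pass: scan element by element and stop at the first offender.
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index
    # The sum overflowed even though every element is finite.
    return None


print(first_non_finite([1.0, 2.0, 3.0]))  # None
print(first_non_finite([1.0, float("nan"), float("inf")]))  # 1
print(first_non_finite([1e308, 1e308]))  # None
```

The second pass only runs when the first one raises suspicion, and it exits as soon as a single non-finite value is found. That keeps the common all-finite case cheap.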
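
A digest check of the kind described in the last entry above might look like the following standard-library sketch. The helper names `sha256_of_file` and `verify_archive`, the file name, and the payload are hypothetical; the documentation's own download helpers may be organised differently.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    # Hash the file in chunks so large archives need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_sha256):
    # Refuse to use an archive whose digest differs from the recorded one.
    actual = sha256_of_file(path)
    if actual != expected_sha256:
        raise OSError(
            f"{path} has SHA256 digest {actual}, expected {expected_sha256}"
        )
    return path


# Demonstration on a temporary file standing in for a downloaded tarball.
payload = b"example archive bytes"
expected = hashlib.sha256(payload).hexdigest()
with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = os.path.join(tmp_dir, "dataset.tar.gz")
    with open(archive_path, "wb") as stream:
        stream.write(payload)

    digest_matches = verify_archive(archive_path, expected) == archive_path
    try:
        verify_archive(archive_path, "0" * 64)
        tampered_rejected = False
    except OSError:
        tampered_rejected = True

print(digest_matches, tampered_rejected)
```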

## Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.1, including: