# Version 1.2.0

In Development

## Legend for changelogs

• Major Feature : something big that you couldn’t do before.

• Feature : something that you couldn’t do before.

• Efficiency : an existing feature now may not require as much computation or memory.

• Enhancement : a miscellaneous minor improvement.

• Fix : something that previously didn’t work as documented – or according to reasonable expectations – should now work.

• API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

## Changes impacting all modules

• Enhancement Finiteness checks (detection of NaN and infinite values) in all estimators are now significantly more efficient for float32 data by leveraging NumPy’s SIMD-optimized primitives. #23446 by Meekail Zain.

• Enhancement Finiteness checks (detection of NaN and infinite values) in all estimators are now faster by utilizing a more efficient stop-on-first second-pass algorithm. #23197 by Meekail Zain.

• Enhancement Support for combinations of dense and sparse dataset pairs, for all distance metrics and for float32 and float64 datasets, has been added, or has seen its performance improved, for the following estimators:
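The faster finiteness checks require no code changes: they run inside the input validation that estimators already perform. A minimal sketch of the behaviour they guard, using the public sklearn.utils.check_array helper (which rejects non-finite input by default):

```python
import numpy as np
from sklearn.utils import check_array

# Dense float32 input exercises the accelerated finiteness-check path.
X = np.random.default_rng(0).standard_normal((1000, 20)).astype(np.float32)
check_array(X)  # passes: all values are finite

X[0, 0] = np.nan
try:
    check_array(X)  # the NaN is detected and the input rejected
except ValueError as exc:
    print("rejected:", exc)
```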

## Changelog

### sklearn.calibration

• API Change Rename base_estimator to estimator in CalibratedClassifierCV to improve readability and consistency. The parameter base_estimator is deprecated and will be removed in 1.4. #22054 by Kevin Roice.
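During the deprecation window, code that must run on releases from before and after the rename can probe for the new parameter name; a minimal migration sketch:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)

# New name (scikit-learn 1.2+); the old `base_estimator` keeps working,
# with a deprecation warning, until its removal in 1.4.
try:
    clf = CalibratedClassifierCV(estimator=LogisticRegression())
except TypeError:
    # pre-1.2 releases only know the old parameter name
    clf = CalibratedClassifierCV(base_estimator=LogisticRegression())

clf.fit(X, y)
print(clf.predict(X[:5]))
```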

• Efficiency Low-level routines for reductions on pairwise distances for dense float32 datasets have been refactored. The following functions and estimators now benefit from improved performance, in terms of both hardware scalability and speed-ups:

For instance, sklearn.neighbors.NearestNeighbors.kneighbors and sklearn.neighbors.NearestNeighbors.radius_neighbors can respectively be up to ×20 and ×5 faster than before on a laptop.

Moreover, the implementations of those two algorithms now scale well on machines with many cores, making them usable on datasets consisting of millions of samples.
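The speed-ups apply transparently through the existing API; a small usage sketch on dense float32 data (actual timings vary by machine):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 16)).astype(np.float32)  # dense float32 path

nn = NearestNeighbors(n_neighbors=5).fit(X)
dist, ind = nn.kneighbors(X[:3])                     # refactored backend
r_dist, r_ind = nn.radius_neighbors(X[:3], radius=2.0)
print(dist.shape, ind.shape)  # (3, 5) (3, 5)
```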

### sklearn.datasets

• Enhancement Introduce the new parameter parser in datasets.fetch_openml. parser="pandas" allows using the very CPU- and memory-efficient pandas.read_csv parser to load dense ARFF-formatted dataset files. Passing parser="liac-arff" keeps the old LIAC parser. When parser="auto", dense datasets are loaded with "pandas" and sparse datasets with "liac-arff". Currently, the default is parser="liac-arff"; it will change to parser="auto" in version 1.4. #21938 by Guillaume Lemaitre.

• Enhancement datasets.dump_svmlight_file is now accelerated with a Cython implementation, providing 2-4x speedups. #23127 by Meekail Zain.
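The acceleration involves no API change; a round-trip sketch writing to and reading back from an in-memory buffer:

```python
import io

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

X = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]])
y = np.array([0, 1])

buf = io.BytesIO()
dump_svmlight_file(X, y, buf)  # Cython-accelerated writer in 1.2
buf.seek(0)

X2, y2 = load_svmlight_file(buf, n_features=3)  # X2 comes back sparse
print(y2)  # [0. 1.]
```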

### sklearn.model_selection

• Fix For all SearchCV classes and scipy >= 1.10, rank corresponding to a nan score is correctly set to the maximum possible rank, rather than np.iinfo(np.int32).min. #24141 by Loïc Estève.

## Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.1, including:

TODO: update at the time of the release.