# Release History

Release notes for current and recent releases are detailed on this page, with previous releases linked below.

Tip: Subscribe to scikit-learn releases on libraries.io to be notified when new versions are released.

# Version 0.21.0

In development

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)

## Changelog

Support for Python 3.4 and below has been officially dropped.

### sklearn.datasets

• Fix Added support for 64-bit group IDs and pointers in SVMLight files (datasets.svmlight_format). #10727 by Bryan K Woods.

### sklearn.decomposition

• API Change The default value of the init argument in decomposition.non_negative_factorization will change from random to None in version 0.23 to make it consistent with decomposition.NMF. A FutureWarning is raised when the default value is used. #12988 by Zijie (ZJ) Poh.
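One way to avoid depending on the changing default is to pass init explicitly. The following is a minimal sketch on made-up toy data; the array shape, n_components, random_state and max_iter values are illustrative assumptions, not recommendations from the release notes.

```python
import numpy as np
from sklearn.decomposition import non_negative_factorization

# Toy non-negative data (hypothetical; any non-negative matrix would do).
rng = np.random.RandomState(0)
X = rng.rand(6, 4)

# Passing init explicitly means the call does not rely on the default
# value that is scheduled to change.
W, H, n_iter = non_negative_factorization(
    X, n_components=2, init="random", random_state=0, max_iter=500
)
print(W.shape, H.shape)
```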

### sklearn.ensemble

• Efficiency Make ensemble.IsolationForest prefer threads over processes when running with n_jobs > 1 as the underlying decision tree fit calls do release the GIL. This change reduces memory usage and communication overhead. #12543 by Isaac Storch and Olivier Grisel.
• Fix Fixed a bug in ensemble.GradientBoostingClassifier where the gradients would be incorrectly computed in multiclass classification problems. #12715 by Nicolas Hug.
• Fix Fixed a bug in ensemble where the predict method would error for multiclass multioutput forest models if any targets were strings. #12834 by Elizabeth Sander.
• Fix Fixed a bug in ensemble.gradient_boosting.LossFunction and ensemble.gradient_boosting.LeastSquaresError where the default value of learning_rate in update_terminal_regions was not consistent with the documentation and the caller functions. #6463 by movelikeriver.

### sklearn.externals

• API Change Deprecated externals.six since we have dropped support for Python 2.7. #12916 by Hanmin Qin.
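For code that imported helpers from externals.six, the Python 3 built-ins are usually a direct replacement. The sketch below shows a few common substitutions; the selection of helpers and the params dictionary are illustrative assumptions, not a complete migration guide.

```python
# Python 3 equivalents of a few common six helpers (illustrative selection).
params = {"alpha": 0.1, "solver": "lbfgs"}  # hypothetical example data

# six.iteritems(params)  ->  params.items()
pairs = sorted(params.items())

# isinstance(value, six.string_types)  ->  isinstance(value, str)
string_valued = [name for name, value in pairs if isinstance(value, str)]

# six.moves.range(3)  ->  range(3)
indices = list(range(3))

print(pairs, string_valued, indices)
```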

### sklearn.manifold

• Efficiency Make manifold.tsne.trustworthiness use an inverted index instead of an np.where lookup to find the rank of neighbors in the input space. This improves efficiency in particular when computed with lots of neighbors and/or small datasets. #9907 by William de Vazelhes.
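The idea behind the inverted index can be sketched in plain Python. This is not the library's implementation: the function names are hypothetical, and list.index stands in for the np.where lookup. Given the ordering of samples by distance, the inverted index stores each sample's rank once so that later lookups do not need a search.

```python
def ranks_by_search(order, neighbors):
    """Find each neighbor's rank by scanning the ordering (linear per lookup)."""
    return [order.index(j) for j in neighbors]


def ranks_by_inverted_index(order, neighbors):
    """Build the rank of every sample once, then read ranks in constant time."""
    rank_of = [0] * len(order)
    for rank, sample in enumerate(order):
        rank_of[sample] = rank
    return [rank_of[j] for j in neighbors]


# order[r] is the index of the r-th closest sample in the input space.
order = [3, 0, 4, 1, 2]
# Samples whose input-space rank we want to know.
neighbors = [4, 2, 3]

slow = ranks_by_search(order, neighbors)
fast = ranks_by_inverted_index(order, neighbors)
print(slow, fast)
```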

### Multiple modules

• The __repr__() method of all estimators (used when calling print(estimator)) has been entirely re-written, building on Python’s pretty printing standard library. All parameters are printed by default, but this can be altered with the print_changed_only option in sklearn.set_config. #11705 by Nicolas Hug.
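A short sketch of toggling the option follows. The choice of LogisticRegression and of C=2.0 is an arbitrary assumption for illustration; the exact text printed depends on the installed version, so no output is shown.

```python
from sklearn import set_config
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=2.0)

# Only show parameters that differ from their defaults.
set_config(print_changed_only=True)
short_repr = repr(clf)

# Show every parameter.
set_config(print_changed_only=False)
full_repr = repr(clf)

print(short_repr)
print(full_repr)
```

Note that set_config changes a global setting, so the last call above determines how estimators are printed afterwards.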

## Changes to estimator checks

These changes mostly affect library developers.

# Version 0.20.3

??, 2019

This is a bug-fix release with some minor documentation improvements and enhancements to features released in 0.20.0.

# Version 0.20.2

December 20, 2018

This is a bug-fix release with some minor documentation improvements and enhancements to features released in 0.20.0.

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

• sklearn.neighbors when metric=='jaccard' (bug fix)
• use of 'seuclidean' or 'mahalanobis' metrics in some cases (bug fix)

## Code and Documentation Contributors

With thanks to:

adanhawth, Adrin Jalali, Albert Thomas, Andreas Mueller, Dan Stine, Feda Curic, Hanmin Qin, Jan S, jeremiedbb, Joel Nothman, Joris Van den Bossche, josephsalmon, Katrin Leinweber, Loic Esteve, Muhammad Hassaan Rafique, Nicolas Hug, Olivier Grisel, Paul Paczuski, Reshama Shaikh, Sam Waterbury, Shivam Kotwalia, Thomas Fan

# Version 0.20.1

November 21, 2018

This is a bug-fix release with some minor documentation improvements and enhancements to features released in 0.20.0. Note that we also include some API changes in this release, so you might get some extra warnings after updating from 0.20.0 to 0.20.1.

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

## Changelog

### Miscellaneous

• Fix When using site joblib by setting the environment variable SKLEARN_SITE_JOBLIB, added compatibility with joblib 0.11 in addition to 0.12+. #12350 by Joel Nothman and Roman Yurchak.
• Fix Make sure to avoid raising FutureWarning when calling np.vstack with numpy 1.16 and later (use list comprehensions instead of generator expressions in many locations of the scikit-learn code base). #12467 by Olivier Grisel.
• API Change Removed all mentions of sklearn.externals.joblib, and deprecated joblib methods exposed in sklearn.utils, except for utils.parallel_backend and utils.register_parallel_backend, which allow users to configure parallel computation in scikit-learn. Other functionalities are part of the joblib package and should be used directly, by installing it. The goal of this change is to prepare for unvendoring joblib in a future version of scikit-learn. #12345 by Thomas Moreau.
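As the joblib entry above suggests, code that previously reached joblib through scikit-learn can import it directly once joblib is installed. The following is a minimal sketch; the workload (square roots of a few numbers) and the n_jobs and prefer settings are illustrative assumptions.

```python
from math import sqrt

# Import joblib directly rather than through sklearn.externals.
from joblib import Parallel, delayed

# prefer="threads" is a soft hint chosen here to keep the example lightweight.
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(sqrt)(value) for value in [0, 1, 4, 9]
)
print(results)
```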

## Code and Documentation Contributors

With thanks to:

^__^, Adrin Jalali, Andrea Navarrete, Andreas Mueller, bauks, BenjaStudio, Cheuk Ting Ho, Connossor, Corey Levinson, Dan Stine, daten-kieker, Denis Kataev, Dillon Gardner, Dmitry Vukolov, Dougal J. Sutherland, Edward J Brown, Eric Chang, Federico Caselli, Gabriel Marzinotto, Gael Varoquaux, GauravAhlawat, Gustavo De Mari Pereira, Hanmin Qin, haroldfox, JackLangerman, Jacopo Notarstefano, janvanrijn, jdethurens, jeremiedbb, Joel Nothman, Joris Van den Bossche, Koen, Kushal Chauhan, Lee Yi Jie Joel, Lily Xiong, mail-liam, Mark Hannel, melsyt, Ming Li, Nicholas Smith, Nicolas Hug, Nikolay Shebanov, Oleksandr Pavlyk, Olivier Grisel, Peter Hausamann, Pierre Glaser, Pulkit Maloo, Quentin Batista, Radostin Stoyanov, Ramil Nugmanov, Rebekah Kim, Reshama Shaikh, Rohan Singh, Roman Feldbauer, Roman Yurchak, Roopam Sharma, Sam Waterbury, Scott Lowe, Sebastian Raschka, Stephen Tierney, SylvainLan, TakingItCasual, Thomas Fan, Thomas Moreau, Tom Dupré la Tour, Tulio Casagrande, Utkarsh Upadhyay, Xing Han Lu, Yaroslav Halchenko, Zach Miller

# Version 0.20.0

September 25, 2018

This release packs in a mountain of bug fixes, features and enhancements for the Scikit-learn library, and improvements to the documentation and examples. Thanks to our contributors!

This release is dedicated to the memory of Raghav Rajagopalan.

Warning

Version 0.20 is the last version of scikit-learn to support Python 2.7 and Python 3.4. Scikit-learn 0.21 will require Python 3.5 or higher.

## Highlights

We have tried to improve our support for common data-science use-cases including missing values, categorical variables, heterogeneous data, and features/targets with unusual distributions. Missing values in features, represented by NaNs, are now accepted in column-wise preprocessing such as scalers. Each feature is fitted disregarding NaNs, and data containing NaNs can be transformed. The new impute module provides estimators for learning despite missing data.
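The sketch below illustrates both behaviours on a tiny made-up array: a scaler that ignores NaNs when fitting and passes them through when transforming, and an imputer from the impute module that fills them in. The data and the choice of the mean strategy are assumptions for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Toy data with one missing value per column (hypothetical).
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])

# Statistics are computed per feature from the non-missing entries;
# NaNs stay NaN in the transformed output.
X_scaled = StandardScaler().fit_transform(X)

# Replace each NaN with the mean of the observed values in its column.
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)

print(X_scaled)
print(X_imputed)
```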

ColumnTransformer handles the case where different features or columns of a pandas.DataFrame need different preprocessing. String or pandas Categorical columns can now be encoded with OneHotEncoder or OrdinalEncoder.
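As a rough sketch, a numeric column can be scaled while a string column is one-hot encoded in a single transformer. The DataFrame contents, the column names and the step names "num" and "cat" are hypothetical; sparse_threshold=0 is set only to get a dense array that is easy to print.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical heterogeneous data: one numeric and one string column.
df = pd.DataFrame(
    {"age": [20.0, 30.0, 40.0], "city": ["Paris", "Lima", "Paris"]}
)

ct = ColumnTransformer(
    [
        ("num", StandardScaler(), ["age"]),   # scale the numeric column
        ("cat", OneHotEncoder(), ["city"]),   # encode the string column
    ],
    sparse_threshold=0,  # always return a dense array
)
Xt = ct.fit_transform(df)
print(Xt.shape)
```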

TransformedTargetRegressor helps when the regression target needs to be transformed to be modeled. PowerTransformer and KBinsDiscretizer join QuantileTransformer as non-linear transformations.
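A minimal sketch of the target-transformation idea: the regressor is fitted on the logarithm of the target, and predictions are mapped back with the exponential. The synthetic data (a target that is exactly exponential in the feature) and the choice of LinearRegression are assumptions made so that the effect is easy to see.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic data: the target grows exponentially with the single feature.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.exp(0.5 * X.ravel())

# Fit a linear model on log(y); predictions are returned on the original scale.
model = TransformedTargetRegressor(
    regressor=LinearRegression(), func=np.log, inverse_func=np.exp
)
model.fit(X, y)
pred = model.predict(X)
print(pred[:3])
```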

Beyond this, we have added sample_weight support to several estimators (including KMeans, BayesianRidge and KernelDensity) and improved stopping criteria in others (including MLPRegressor, GradientBoostingRegressor and SGDRegressor).

This release is also the first to be accompanied by a Glossary of Common Terms and API Elements developed by Joel Nothman. The glossary is a reference resource to help users and contributors become familiar with the terminology and conventions used in Scikit-learn.

Sorry if your contribution didn’t make it into the highlights. There’s a lot here…

## Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)

## Known Major Bugs

• #11924: linear_model.LogisticRegressionCV with solver='lbfgs' and multi_class='multinomial' may be non-deterministic or otherwise broken on macOS. This appears to be the case on Travis CI servers, but has not been confirmed on personal MacBooks! This issue has been present in previous releases.
• #9354: metrics.pairwise.euclidean_distances (which is used several times throughout the library) gives results with poor precision, which particularly affects its use with 32-bit float inputs. This became more problematic in versions 0.18 and 0.19 when some algorithms were changed to avoid casting 32-bit data into 64-bit.

## Changelog

Support for Python 3.3 has been officially dropped.

### sklearn.discriminant_analysis

• Efficiency Memory usage improvement for _class_means and _class_cov in discriminant_analysis. #10898 by Nanxin Chen.

### sklearn.manifold

• Efficiency Speed improvements for both ‘exact’ and ‘barnes_hut’ methods in manifold.TSNE. #10593 and #10610 by Tom Dupre la Tour.
• Feature Support sparse input in manifold.Isomap.fit. #8554 by Leland McInnes.
• Feature manifold.t_sne.trustworthiness accepts metrics other than Euclidean. #9775 by William de Vazelhes.
• Fix Fixed a bug in manifold.spectral_embedding where the normalization of the spectrum was using a division instead of a multiplication. #8129 by Jan Margeta, Guillaume Lemaitre, and Devansh D..
• API Change Deprecate precomputed parameter in function manifold.t_sne.trustworthiness. Instead, the new parameter metric should be used with any compatible metric including ‘precomputed’, in which case the input matrix X should be a matrix of pairwise distances or squared distances. #9775 by William de Vazelhes.
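The metric parameter described in the trustworthiness entries above can be sketched as follows. The import path sklearn.manifold.trustworthiness is an assumption about the installed version (the entries refer to manifold.t_sne.trustworthiness), and the random data and the crude two-column "embedding" are made up for illustration.

```python
import numpy as np
from sklearn.manifold import trustworthiness  # assumed public import path
from sklearn.metrics import pairwise_distances

rng = np.random.RandomState(0)
X = rng.rand(30, 5)
X_embedded = X[:, :2]  # stand-in for a real low-dimensional embedding

# Use a non-Euclidean metric in the input space.
t_metric = trustworthiness(X, X_embedded, n_neighbors=5, metric="manhattan")

# Equivalent call with metric="precomputed": X is then a pairwise
# distance matrix instead of a feature matrix.
D = pairwise_distances(X, metric="manhattan")
t_precomputed = trustworthiness(
    D, X_embedded, n_neighbors=5, metric="precomputed"
)
print(t_metric, t_precomputed)
```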

### sklearn.tree

• Enhancement Although private (and hence not assured API stability), tree._criterion.ClassificationCriterion and tree._criterion.RegressionCriterion may now be cimported and extended. #10325 by Camil Staps.
• Fix Fixed a bug in tree.BaseDecisionTree with splitter="best" where split threshold could become infinite when values in X were near infinite. #10536 by Jonathan Ohayon.
• Fix Fixed a bug in tree.MAE to ensure sample weights are being used during the calculation of tree MAE impurity. Previous behaviour could cause suboptimal splits to be chosen since the impurity calculation considered all samples to be of equal weight importance. #11464 by John Stott.
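To see why ignoring sample weights in the MAE impurity matters, here is a plain-Python sketch. It is not the library's Cython implementation: the helper names and the definition used (weighted mean absolute deviation around the weighted median) are assumptions chosen to illustrate the effect of weights.

```python
def weighted_median(y, w):
    """Smallest value whose cumulative weight reaches half the total weight."""
    half = sum(w) / 2.0
    running = 0.0
    for value, weight in sorted(zip(y, w)):
        running += weight
        if running >= half:
            return value


def mae_impurity(y, w):
    """Weighted mean absolute deviation around the weighted median."""
    m = weighted_median(y, w)
    return sum(wi * abs(yi - m) for yi, wi in zip(y, w)) / sum(w)


y = [1.0, 2.0, 10.0]
w = [1.0, 1.0, 5.0]  # the last sample carries most of the weight

weighted = mae_impurity(y, w)
unweighted = mae_impurity(y, [1.0, 1.0, 1.0])
print(weighted, unweighted)
```

With equal weights the node looks less pure than it is once the heavily weighted sample is taken into account, which is how an unweighted impurity could favour a different split.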

## Changes to estimator checks

These changes mostly affect library developers.

## Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 0.19, including: