Version 0.21.3

July 30, 2019

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

  • The v0.20.0 release notes failed to mention a backwards incompatibility in metrics.make_scorer when needs_proba=True and y_true is binary. Now, the scorer function is supposed to accept a 1D y_pred (i.e., probability of the positive class, shape (n_samples,)), instead of a 2D y_pred (i.e., shape (n_samples, 2)).
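As a rough illustration of the 1D convention described above, here is a minimal sketch of a metric function written for positive-class probabilities of shape (n_samples,). The function name brier_score and the sample data are hypothetical, and plain Python lists stand in for arrays; wrapping the function with metrics.make_scorer is only indicated in a comment.

```python
def brier_score(y_true, y_pred):
    """Mean squared difference between binary labels and probabilities.

    y_pred is expected to be 1D: one positive-class probability per sample.
    """
    if len(y_pred) and isinstance(y_pred[0], (list, tuple)):
        # Older 2D layout: one [P(class 0), P(class 1)] row per sample.
        raise ValueError("expected 1D positive-class probabilities, got 2D rows")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [0, 1, 1, 0]
y_proba_1d = [0.1, 0.9, 0.8, 0.3]  # probability of the positive class only

print(brier_score(y_true, y_proba_1d))

# Assumed usage with scikit-learn 0.21 (not executed here):
#   make_scorer(brier_score, needs_proba=True, greater_is_better=False)
```

A metric written against the older 2D layout would need to drop its own column selection (for example, taking the second column) to work under this convention.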

Changelog

sklearn.cluster

sklearn.compose

  • Fix Fixed an issue in compose.ColumnTransformer where using DataFrames whose column order differs between fit and transform could lead to silently passing incorrect columns to the remainder transformer. #14237 by Andreas Schuderer.
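The failure mode described above can be sketched without scikit-learn. The example below is a simplified model, assuming the remainder columns are remembered by position at fit time; the helper names and the guard are hypothetical and are not taken from the library's implementation.

```python
def remainder_by_position(fit_columns, selected):
    """Positions of the columns not claimed by any transformer at fit time."""
    return [i for i, name in enumerate(fit_columns) if name not in selected]


def take_remainder(columns, row, positions, fit_columns=None):
    """Pick the remainder values from one row.

    When fit_columns is given, refuse to proceed if the column order changed.
    """
    if fit_columns is not None and list(columns) != list(fit_columns):
        raise ValueError("column ordering differs between fit and transform")
    return [row[i] for i in positions]


fit_columns = ["age", "height", "weight"]
positions = remainder_by_position(fit_columns, selected={"age"})

# Same features at transform time, but with the columns reordered.
new_columns = ["weight", "age", "height"]
new_row = [70.0, 31, 1.8]

# Without a guard, positions 1 and 2 now select age and height,
# not the height and weight columns seen at fit time.
silently_wrong = take_remainder(new_columns, new_row, positions)
print(silently_wrong)

# With the (hypothetical) guard, the mismatch is reported instead.
try:
    take_remainder(new_columns, new_row, positions, fit_columns=fit_columns)
except ValueError as exc:
    print("rejected:", exc)
```

Selecting the remainder by column name rather than by position would be another way to avoid the mismatch; which approach the library takes is not stated in this entry.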

sklearn.datasets

sklearn.ensemble

  • Fix Fix zero division error in HistGradientBoostingClassifier and HistGradientBoostingRegressor. #14024 by Nicolas Hug.

sklearn.impute

sklearn.inspection

sklearn.linear_model

sklearn.neighbors

sklearn.tree

  • Fix Fixed bug in tree.export_text when the tree has one feature and a single feature name is passed in. #14053 by Thomas Fan.
  • Fix Fixed an issue with plot_tree where it displayed entropy calculations even for gini criterion in DecisionTreeClassifiers. #13947 by Frank Hoang.

Version 0.21.2

24 May 2019

Changelog

sklearn.decomposition

sklearn.metrics

sklearn.preprocessing

sklearn.utils.sparsefuncs

Version 0.21.1

17 May 2019

This is a bug-fix release, primarily to resolve some packaging issues in version 0.21.0. It also includes minor documentation improvements and some bug fixes.

Changelog

sklearn.inspection

sklearn.metrics

sklearn.neighbors

Version 0.21.0

May 2019

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)

Known Major Bugs

  • The default max_iter for linear_model.LogisticRegression is too small for many solvers given the default tol. In particular, we accidentally changed the default max_iter for the liblinear solver from 1000 to 100 iterations in #3591 released in version 0.16. In a future release we hope to choose better default max_iter and tol heuristically depending on the solver (see #13317).

Changelog

Support for Python 3.4 and below has been officially dropped.

sklearn.base

sklearn.calibration

sklearn.cluster

sklearn.compose

sklearn.datasets

sklearn.decomposition

sklearn.discriminant_analysis

sklearn.dummy

sklearn.ensemble

sklearn.externals

  • API Change Deprecated externals.six since we have dropped support for Python 2.7. #12916 by Hanmin Qin.

sklearn.feature_extraction

sklearn.impute

  • Major Feature Added impute.IterativeImputer, which is a strategy for imputing missing values by modeling each feature with missing values as a function of other features in a round-robin fashion. #8478 and #12177 by Sergey Feldman and Ben Lawson.

    The API of IterativeImputer is experimental and subject to change without any deprecation cycle. To use it, you need to explicitly import enable_iterative_imputer:

    >>> from sklearn.experimental import enable_iterative_imputer  # noqa
    >>> # now you can import normally from sklearn.impute
    >>> from sklearn.impute import IterativeImputer
    
  • Feature The impute.SimpleImputer and impute.IterativeImputer have a new parameter add_indicator, which stacks an impute.MissingIndicator transform onto the output of the imputer’s transform. This allows a predictive estimator to account for missingness. #12583, #13601 by Danylo Baibak.

  • Fix In impute.MissingIndicator, avoid implicit densification by raising an exception if the input is sparse and the missing_values property is set to 0. #13240 by Bartosz Telenczuk.

  • Fix Fixed two bugs in impute.MissingIndicator. First, when X is sparse, all the non-zero, non-missing values used to become explicit False in the transformed data. Second, when features='missing-only', all features used to be kept if there were no missing values at all. #13562 by Jérémie du Boisberranger.
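To make the add_indicator idea above concrete, here is a minimal pure-Python sketch of mean imputation followed by stacking indicator columns onto the output. It is an illustration only: None stands in for a missing value, the helper names are hypothetical, and indicator columns are emitted only for features that had missing values at fit time (an assumption modeled on the features='missing-only' setting mentioned above).

```python
def fit_mean_imputer(rows):
    """Return per-column means (ignoring None) and the columns with gaps.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means, missing_cols = [], []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
        if len(observed) < len(rows):
            missing_cols.append(j)
    return means, missing_cols


def transform_with_indicator(rows, means, missing_cols):
    """Impute each row, then append one 0/1 flag per column in missing_cols."""
    out = []
    for r in rows:
        imputed = [means[j] if v is None else v for j, v in enumerate(r)]
        indicator = [1 if r[j] is None else 0 for j in missing_cols]
        out.append(imputed + indicator)
    return out


X = [[1.0, 10.0], [None, 20.0], [3.0, 30.0]]
means, missing_cols = fit_mean_imputer(X)
X_out = transform_with_indicator(X, means, missing_cols)

for row in X_out:
    print(row)
```

The extra column lets a downstream model distinguish an imputed 2.0 from an observed 2.0, which is the sense in which an estimator can account for missingness.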

sklearn.inspection

(new subpackage)

sklearn.isotonic

sklearn.linear_model

sklearn.manifold

  • Efficiency Make manifold.tsne.trustworthiness use an inverted index instead of an np.where lookup to find the rank of neighbors in the input space. This improves efficiency in particular when computed with lots of neighbors and/or small datasets. #9907 by William de Vazelhes.
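The difference between the two lookup strategies named above can be sketched as follows. This is a simplified illustration, not the library's code: plain lists stand in for NumPy arrays, a linear scan stands in for the np.where lookup, and the function and variable names are hypothetical.

```python
def ranks_by_search(ordering, queries):
    """Find each query's rank by scanning the ordering (one scan per query)."""
    return [ordering.index(q) for q in queries]


def ranks_by_inverted_index(ordering, queries):
    """Build position[sample] = rank once, then answer each query in O(1)."""
    position = [0] * len(ordering)
    for rank, sample in enumerate(ordering):
        position[sample] = rank
    return [position[q] for q in queries]


# ordering[r] is the index of the r-th closest sample in the input space.
ordering = [3, 0, 4, 1, 2]
# Neighbors found in the embedded space whose input-space rank is needed.
queries = [4, 2, 3]

slow = ranks_by_search(ordering, queries)
fast = ranks_by_inverted_index(ordering, queries)
print(slow, fast)
```

Both functions return the same ranks; the inverted index pays a single pass over the ordering up front and avoids repeating a scan for every neighbor.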

sklearn.metrics

sklearn.mixture

sklearn.model_selection

sklearn.multiclass

sklearn.multioutput

sklearn.neighbors

sklearn.neural_network

sklearn.pipeline

sklearn.preprocessing

sklearn.svm

  • Fix Fixed an issue in svm.SVC.decision_function when decision_function_shape='ovr'. The decision_function value of a given sample was different depending on whether the decision_function was evaluated on the sample alone or on a batch containing this same sample due to the scaling used in decision_function. #10440 by Jonathan Ohayon.

sklearn.tree

sklearn.utils

Multiple modules

  • Major Feature The __repr__() method of all estimators (used when calling print(estimator)) has been entirely re-written, building on Python’s pretty printing standard library. All parameters are printed by default, but this can be altered with the print_changed_only option in sklearn.set_config. #11705 by Nicolas Hug.
  • Major Feature Add estimator tags: these are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. Estimator tags also determine the tests that are run on an estimator when check_estimator is called. Read more in the User Guide. #8022 by Andreas Müller.
  • Efficiency Memory copies are avoided when casting arrays to a different dtype in multiple estimators. #11973 by Roman Yurchak.
  • Fix Fixed a bug in the implementation of the our_rand_r helper function that was not behaving consistently across platforms. #13422 by Madhura Parikh and Clément Doumouro.
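The effect of the print_changed_only option mentioned in the first item above can be approximated with a small stand-alone sketch. ToyEstimator and format_estimator are hypothetical names introduced for illustration; the real __repr__() builds on Python's pretty-printing machinery and is more elaborate than this.

```python
import inspect


class ToyEstimator:
    """Hypothetical estimator used only to illustrate the two repr styles."""

    def __init__(self, alpha=1.0, fit_intercept=True, max_iter=100):
        self.alpha = alpha
        self.fit_intercept = fit_intercept
        self.max_iter = max_iter

    def get_params(self):
        names = inspect.signature(type(self).__init__).parameters
        return {n: getattr(self, n) for n in names if n != "self"}


def format_estimator(est, changed_only=False):
    """Render est as ClassName(param=value, ...), optionally skipping defaults."""
    signature = inspect.signature(type(est).__init__)
    defaults = {
        name: param.default
        for name, param in signature.parameters.items()
        if name != "self"
    }
    params = est.get_params()
    if changed_only:
        params = {k: v for k, v in params.items() if v != defaults[k]}
    body = ", ".join(f"{k}={v!r}" for k, v in params.items())
    return f"{type(est).__name__}({body})"


est = ToyEstimator(alpha=0.5)
full = format_estimator(est)                      # every parameter
short = format_estimator(est, changed_only=True)  # non-default values only
print(full)
print(short)

# With scikit-learn itself, the switch is sklearn.set_config(print_changed_only=True).
```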

Miscellaneous

  • Enhancement Joblib is no longer vendored in scikit-learn and is now a dependency. The minimum supported version is joblib 0.11; however, using version >= 0.13 is strongly recommended. #13531 by Roman Yurchak.
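The version thresholds stated above could be checked in an environment with a small helper like the following. This is a sketch under simple assumptions: joblib_support_level is a hypothetical name, and only the leading major.minor components of the version string are compared.

```python
def joblib_support_level(version):
    """Classify a joblib version string against the thresholds in this note."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    if major_minor < (0, 11):
        return "unsupported"
    if major_minor < (0, 13):
        return "supported"
    return "recommended"


for candidate in ["0.10.3", "0.11", "0.12.5", "0.13.2"]:
    print(candidate, joblib_support_level(candidate))

# In a real environment the string would come from joblib.__version__.
```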

Changes to estimator checks

These changes mostly affect library developers.

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 0.20, including:

adanhawth, Aditya Vyas, Adrin Jalali, Agamemnon Krasoulis, Albert Thomas, Alberto Torres, Alexandre Gramfort, amourav, Andrea Navarrete, Andreas Mueller, Andrew Nystrom, assiaben, Aurélien Bellet, Bartosz Michałowski, Bartosz Telenczuk, bauks, BenjaStudio, bertrandhaut, Bharat Raghunathan, brentfagan, Bryan Woods, Cat Chenal, Cheuk Ting Ho, Chris Choe, Christos Aridas, Clément Doumouro, Cole Smith, Connossor, Corey Levinson, Dan Ellis, Dan Stine, Danylo Baibak, daten-kieker, Denis Kataev, Didi Bar-Zev, Dillon Gardner, Dmitry Mottl, Dmitry Vukolov, Dougal J. Sutherland, Dowon, drewmjohnston, Dror Atariah, Edward J Brown, Ekaterina Krivich, Elizabeth Sander, Emmanuel Arias, Eric Chang, Eric Larson, Erich Schubert, esvhd, Falak, Feda Curic, Federico Caselli, Frank Hoang, Fibinse Xavier, Finn O’Shea, Gabriel Marzinotto, Gabriel Vacaliuc, Gabriele Calvo, Gael Varoquaux, GauravAhlawat, Giuseppe Vettigli, Greg Gandenberger, Guillaume Fournier, Guillaume Lemaitre, Gustavo De Mari Pereira, Hanmin Qin, haroldfox, hhu-luqi, Hunter McGushion, Ian Sanders, JackLangerman, Jacopo Notarstefano, jakirkham, James Bourbeau, Jan Koch, Jan S, janvanrijn, Jarrod Millman, jdethurens, jeremiedbb, JF, joaak, Joan Massich, Joel Nothman, Jonathan Ohayon, Joris Van den Bossche, josephsalmon, Jérémie Méhault, Katrin Leinweber, ken, kms15, Koen, Kossori Aruku, Krishna Sangeeth, Kuai Yu, Kulbear, Kushal Chauhan, Kyle Jackson, Lakshya KD, Leandro Hermida, Lee Yi Jie Joel, Lily Xiong, Lisa Sarah Thomas, Loic Esteve, louib, luk-f-a, maikia, mail-liam, Manimaran, Manuel López-Ibáñez, Marc Torrellas, Marco Gaido, Marco Gorelli, MarcoGorelli, marineLM, Mark Hannel, Martin Gubri, Masstran, mathurinm, Matthew Roeschke, Max Copeland, melsyt, mferrari3, Mickaël Schoentgen, Ming Li, Mitar, Mohammad Aftab, Mohammed AbdelAal, Mohammed Ibraheem, Muhammad Hassaan Rafique, mwestt, Naoya Iijima, Nicholas Smith, Nicolas Goix, Nicolas Hug, Nikolay Shebanov, Oleksandr Pavlyk, Oliver Rausch, Olivier Grisel,
Orestis, Osman, Owen Flanagan, Paul Paczuski, Pavel Soriano, pavlos kallis, Pawel Sendyk, peay, Peter, Peter Cock, Peter Hausamann, Peter Marko, Pierre Glaser, pierretallotte, Pim de Haan, Piotr Szymański, Prabakaran Kumaresshan, Pradeep Reddy Raamana, Prathmesh Savale, Pulkit Maloo, Quentin Batista, Radostin Stoyanov, Raf Baluyot, Rajdeep Dua, Ramil Nugmanov, Raúl García Calvo, Rebekah Kim, Reshama Shaikh, Rohan Lekhwani, Rohan Singh, Rohan Varma, Rohit Kapoor, Roman Feldbauer, Roman Yurchak, Romuald M, Roopam Sharma, Ryan, Rüdiger Busche, Sam Waterbury, Samuel O. Ronsin, SandroCasagrande, Scott Cole, Scott Lowe, Sebastian Raschka, Shangwu Yao, Shivam Kotwalia, Shiyu Duan, smarie, Sriharsha Hatwar, Stephen Hoover, Stephen Tierney, Stéphane Couvreur, surgan12, SylvainLan, TakingItCasual, Tashay Green, thibsej, Thomas Fan, Thomas J Fan, Thomas Moreau, Tom Dupré la Tour, Tommy, Tulio Casagrande, Umar Farouk Umar, Utkarsh Upadhyay, Vinayak Mehta, Vishaal Kapoor, Vivek Kumar, Vlad Niculae, vqean3, Wenhao Zhang, William de Vazelhes, xhan, Xing Han Lu, xinyuliu12, Yaroslav Halchenko, Zach Griffith, Zach Miller, Zayd Hammoudeh, Zhuyi Xue, Zijie (ZJ) Poh, ^__^