Version 0.23.1

May 18 2020

Changelog

sklearn.cluster

Miscellaneous

  • Fix Fixed a bug in the repr of third-party estimators that use a **kwargs parameter in their constructor, when changed_only is True, which is now the default. #17205 by Nicolas Hug.

Version 0.23.0

May 12 2020

For a short description of the main highlights of the release, please refer to Release Highlights for scikit-learn 0.23.

Legend for changelogs

  • Major Feature : something big that you couldn’t do before.

  • Feature : something that you couldn’t do before.

  • Efficiency : an existing feature now may not require as much computation or memory.

  • Enhancement : a miscellaneous minor improvement.

  • Fix : something that previously didn’t work as documented – or according to reasonable expectations – should now work.

  • API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.


Enforcing keyword-only arguments

In an effort to promote clear and non-ambiguous use of the library, most constructor and function parameters are now expected to be passed as keyword arguments (i.e. using the param=value syntax) instead of positional. To ease the transition, a FutureWarning is raised if a keyword-only parameter is used as positional. In version 0.25, these parameters will be strictly keyword-only, and a TypeError will be raised. #15005 by Joel Nothman, Adrin Jalali, Thomas Fan, and Nicolas Hug. See SLEP009 for more details.
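The transition behaviour described above can be sketched with a small decorator. This is only an illustration under stated assumptions: `warn_on_positional` and `make_model` are hypothetical names, and this is not scikit-learn's own implementation. It shows a keyword-only parameter still being accepted positionally, at the cost of a FutureWarning.

```python
import functools
import inspect
import warnings


def warn_on_positional(func):
    """Warn when keyword-only parameters of ``func`` are passed positionally.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    params = list(inspect.signature(func).parameters.values())
    positional = [p.name for p in params if p.kind == p.POSITIONAL_OR_KEYWORD]
    kwonly = [p.name for p in params if p.kind == p.KEYWORD_ONLY]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = len(args) - len(positional)
        if extra > len(kwonly):
            raise TypeError("too many positional arguments")
        if extra > 0:
            # Map the surplus positional values onto the keyword-only names.
            names = kwonly[:extra]
            warnings.warn(
                f"Pass {', '.join(names)} as keyword args; passing them "
                "positionally will become an error in a future version.",
                FutureWarning,
            )
            kwargs.update(zip(names, args[len(positional):]))
            args = args[:len(positional)]
        return func(*args, **kwargs)

    return wrapper


@warn_on_positional
def make_model(n_clusters, *, tol=1e-4, max_iter=300):
    # Stand-in for an estimator constructor with keyword-only options.
    return {"n_clusters": n_clusters, "tol": tol, "max_iter": max_iter}
```

Calling `make_model(8, 1e-3)` still works in this sketch but emits a FutureWarning, whereas `make_model(8, tol=1e-3)` is silent. Removing the decorator gives the strict behaviour, where the positional call raises a TypeError.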

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)

Changelog

sklearn.cluster

sklearn.compose

sklearn.datasets

sklearn.decomposition

sklearn.ensemble

sklearn.feature_extraction

sklearn.feature_selection

sklearn.gaussian_process

sklearn.impute

sklearn.linear_model

sklearn.metrics

  • Enhancement metrics.pairwise.pairwise_distances_chunked now allows its reduce_func to not have a return value, enabling in-place operations. #16397 by Joel Nothman.

  • Fix Fixed a bug in metrics.mean_squared_error to not ignore argument squared when argument multioutput='raw_values'. #16323 by Rushabh Vasani.

  • Fix Fixed a bug in metrics.mutual_info_score where negative scores could be returned. #16362 by Thomas Fan.

  • Fix Fixed a bug in metrics.confusion_matrix that would raise an error when y_true and y_pred were length zero and labels was not None. In addition, we raise an error when an empty list is given to the labels parameter. #16442 by Kyle Parsons.

  • API Change Changed the formatting of values in metrics.ConfusionMatrixDisplay.plot and metrics.plot_confusion_matrix to pick the shorter format (either ‘2g’ or ‘d’). #16159 by Rick Mackenbach and Thomas Fan.

  • API Change From version 0.25, metrics.pairwise.pairwise_distances will no longer automatically compute the VI parameter for Mahalanobis distance and the V parameter for seuclidean distance if Y is passed. The user will be expected to compute this parameter on the training data of their choice and pass it to pairwise_distances. #16993 by Joel Nothman.
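For the last item above, the idea of computing the parameter yourself on chosen training data can be sketched in plain Python for the seuclidean case. The helper names `feature_variances` and `seuclidean` are hypothetical, and using the sample variance (ddof=1) for V is an assumption of this sketch.

```python
import math
import statistics


def feature_variances(X_train):
    """Per-feature sample variance (ddof=1) of the training rows."""
    return [statistics.variance(column) for column in zip(*X_train)]


def seuclidean(u, v, V):
    """Standardized Euclidean distance between u and v, given variances V."""
    return math.sqrt(sum((a - b) ** 2 / var for a, b, var in zip(u, v, V)))


X_train = [[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]]

# Computed once, explicitly, on the training data only.
V = feature_variances(X_train)

# The same V is then reused for any query point.
query = [1.0, 20.0]
distances = [seuclidean(query, row, V) for row in X_train]
```

With scikit-learn itself, the analogous step would be to pass the precomputed value as the V (or VI for Mahalanobis) keyword of pairwise_distances, rather than relying on it being derived from the inputs.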

sklearn.model_selection

sklearn.multioutput

sklearn.naive_bayes

sklearn.neural_network

sklearn.inspection

sklearn.preprocessing

sklearn.semi_supervised

sklearn.svm

  • Fix Efficiency Improved libsvm and liblinear random number generators used to randomly select coordinates in the coordinate descent algorithms. Platform-dependent C rand() was used, which is only able to generate numbers up to 32767 on Windows (see this blog post) and also has poor randomization power as suggested by this presentation. It was replaced with C++11 mt19937, a Mersenne Twister that correctly generates 31-bit/63-bit random numbers on all platforms. In addition, the crude “modulo” postprocessor used to get a random number in a bounded interval was replaced by the tweaked Lemire method as suggested by this blog post. Any model using the svm.libsvm or the svm.liblinear solver, including svm.LinearSVC, svm.LinearSVR, svm.NuSVC, svm.NuSVR, svm.OneClassSVM, svm.SVC, svm.SVR, linear_model.LogisticRegression, is affected. In particular, users can expect a better convergence when the number of samples (LibSVM) or the number of features (LibLinear) is large. #13511 by Sylvain Marié.

  • Fix Fixed the use of custom kernels that do not take float entries, such as string kernels, in svm.SVC and svm.SVR. Note that custom kernels are now expected to validate their input where they previously received valid numeric arrays. #11296 by Alexandre Gramfort and Georgi Peev.

  • API Change svm.SVR and svm.OneClassSVM attributes, probA_ and probB_, are now deprecated as they were not useful. #15558 by Thomas Fan.
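The random number generator change in the first item above can be illustrated with a short sketch. It is written in Python rather than C++, and it follows the general multiply-shift-with-rejection idea attributed to Lemire; it is not claimed to match the exact "tweaked" variant used in the library. The names `bounded_modulo` and `bounded_lemire` are hypothetical.

```python
import random

BITS = 32
MASK = (1 << BITS) - 1


def bounded_modulo(raw, bound):
    """Crude reduction: biased, and limited by the range of ``raw``."""
    return raw % bound


def bounded_lemire(rng, bound):
    """Unbiased integer in [0, bound) via multiply-shift with rejection."""
    if not 0 < bound <= MASK:
        raise ValueError("bound must be in [1, 2**32 - 1]")
    product = rng.getrandbits(BITS) * bound
    low = product & MASK
    if low < bound:
        # Reject the few low words that would make some outputs more likely.
        threshold = (1 << BITS) % bound
        while low < threshold:
            product = rng.getrandbits(BITS) * bound
            low = product & MASK
    return product >> BITS


# A generator capped at 32767 can never select an index above that cap,
# no matter how many samples there are to choose from.
n_samples = 100_000
reachable = max(bounded_modulo(raw, n_samples) for raw in range(32768))

# CPython's random.Random is itself a Mersenne Twister (MT19937) generator.
rng = random.Random(0)
draws = [bounded_lemire(rng, n_samples) for _ in range(10_000)]
```

Here `reachable` stays at 32767, so with 100,000 samples most coordinates would never be visited by the old scheme, while the values in `draws` cover the full range. This is consistent with the better convergence expected on large problems.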

sklearn.tree

sklearn.utils

Miscellaneous

  • Major Feature Adds an HTML representation of estimators to be shown in a Jupyter notebook or lab. This visualization is activated by setting the display option in sklearn.set_config. #14180 by Thomas Fan.

  • Enhancement scikit-learn now works with mypy without errors. #16726 by Roman Yurchak.

  • API Change Most estimators now expose a n_features_in_ attribute. This attribute is equal to the number of features passed to the fit method. See SLEP010 for details. #16112 by Nicolas Hug.

  • API Change Estimators now have a requires_y tag, which is False by default except for estimators that inherit from sklearn.base.RegressorMixin or sklearn.base.ClassifierMixin. This tag is used to ensure that a proper error message is raised when y was expected but None was passed. #16622 by Nicolas Hug.

  • API Change The default setting print_changed_only has been changed from False to True. This means that the repr of estimators is now more concise and only shows the parameters whose default value has been changed when printing an estimator. You can restore the previous behaviour by using sklearn.set_config(print_changed_only=False). Also, note that it is always possible to quickly inspect the parameters of any estimator using est.get_params(deep=False). #17061 by Nicolas Hug.
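The effect of the print_changed_only default in the last item above can be sketched with a toy class. `SketchEstimator` and its `describe` method are hypothetical and only mimic the idea of comparing current parameters against constructor defaults; this is not how scikit-learn's own repr is implemented.

```python
import inspect


class SketchEstimator:
    """Hypothetical estimator; only mimics the changed-only repr idea."""

    def __init__(self, alpha=1.0, fit_intercept=True, max_iter=100):
        self.alpha = alpha
        self.fit_intercept = fit_intercept
        self.max_iter = max_iter

    def get_params(self):
        # Read parameter names from the constructor signature.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def describe(self, changed_only=True):
        defaults = inspect.signature(type(self).__init__).parameters
        params = self.get_params()
        if changed_only:
            # Keep only the parameters that differ from their default value.
            params = {
                name: value
                for name, value in params.items()
                if value != defaults[name].default
            }
        args = ", ".join(f"{name}={value!r}" for name, value in params.items())
        return f"{type(self).__name__}({args})"

    def __repr__(self):
        return self.describe(changed_only=True)


est = SketchEstimator(alpha=0.5)
concise = repr(est)
verbose = est.describe(changed_only=False)
```

In this sketch `concise` lists only `alpha`, while `verbose` lists every constructor parameter, which corresponds to the two behaviours selected by sklearn.set_config(print_changed_only=...).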

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 0.22, including:

Abbie Popa, Adrin Jalali, Aleksandra Kocot, Alexandre Batisse, Alexandre Gramfort, Alex Henrie, Alex Itkes, Alex Liang, alexshacked, Alonso Silva Allende, Ana Casado, Andreas Mueller, Angela Ambroz, Ankit810, Arie Pratama Sutiono, Arunav Konwar, Baptiste Maingret, Benjamin Beier Liu, bernie gray, Bharathi Srinivasan, Bharat Raghunathan, Bibhash Chandra Mitra, Brian Wignall, brigi, Brigitta Sipőcz, Carlos H Brandt, CastaChick, castor, cgsavard, Chiara Marmo, Chris Gregory, Christian Kastner, Christian Lorentzen, Corrie Bartelheimer, Daniël van Gelder, Daphne, David Breuer, david-cortes, dbauer9, Divyaprabha M, Edward Qian, Ekaterina Borovikova, ELNS, Emily Taylor, Erich Schubert, Eric Leung, Evgeni Chasnovski, Fabiana, Facundo Ferrín, Fan, Franziska Boenisch, Gael Varoquaux, Gaurav Sharma, Geoffrey Bolmier, Georgi Peev, gholdman1, Gonthier Nicolas, Gregory Morse, Gregory R. Lee, Guillaume Lemaitre, Gui Miotto, Hailey Nguyen, Hanmin Qin, Hao Chun Chang, HaoYin, Hélion du Mas des Bourboux, Himanshu Garg, Hirofumi Suzuki, huangk10, Hugo van Kemenade, Hye Sung Jung, indecisiveuser, inderjeet, J-A16, Jérémie du Boisberranger, Jin-Hwan CHO, JJmistry, Joel Nothman, Johann Faouzi, Jon Haitz Legarreta Gorroño, Juan Carlos Alfaro Jiménez, judithabk6, jumon, Kathryn Poole, Katrina Ni, Kesshi Jordan, Kevin Loftis, Kevin Markham, krishnachaitanya9, Lam Gia Thuan, Leland McInnes, Lisa Schwetlick, lkubin, Loic Esteve, lopusz, lrjball, lucgiffon, lucyleeow, Lucy Liu, Lukas Kemkes, Maciej J Mikulski, Madhura Jayaratne, Magda Zielinska, maikia, Mandy Gu, Manimaran, Manish Aradwad, Maren Westermann, Maria, Mariana Meireles, Marie Douriez, Marielle, Mateusz Górski, mathurinm, Matt Hall, Maura Pintor, mc4229, meyer89, m.fab, Michael Shoemaker, Michał Słapek, Mina Naghshhnejad, mo, Mohamed Maskani, Mojca Bertoncelj, narendramukherjee, ngshya, Nicholas Won, Nicolas Hug, nicolasservel, Niklas, @nkish, Noa Tamir, Oleksandr Pavlyk, olicairns, Oliver Urs Lenz, Olivier Grisel, parsons-kyle-89, 
Paula, Pete Green, Pierre Delanoue, pspachtholz, Pulkit Mehta, Qizhi Jiang, Quang Nguyen, rachelcjordan, raduspaimoc, Reshama Shaikh, Riccardo Folloni, Rick Mackenbach, Ritchie Ng, Roman Feldbauer, Roman Yurchak, Rory Hartong-Redden, Rüdiger Busche, Rushabh Vasani, Sambhav Kothari, Samesh Lakhotia, Samuel Duan, SanthoshBala18, Santiago M. Mola, Sarat Addepalli, scibol, Sebastian Kießling, SergioDSR, Sergul Aydore, Shiki-H, shivamgargsya, SHUBH CHATTERJEE, Siddharth Gupta, simonamaggio, smarie, Snowhite, stareh, Stephen Blystone, Stephen Marsh, Sunmi Yoon, SylvainLan, talgatomarov, tamirlan1, th0rwas, theoptips, Thomas J Fan, Thomas Li, Thomas Schmitt, Tim Nonner, Tim Vink, Tiphaine Viard, Tirth Patel, Titus Christian, Tom Dupré la Tour, trimeta, Vachan D A, Vandana Iyer, Venkatachalam N, waelbenamara, wconnell, wderose, wenliwyan, Windber, wornbb, Yu-Hang “Maxin” Tang