Version 1.0.2

In Development

Changelog

sklearn.cluster

sklearn.datasets

sklearn.decomposition

sklearn.ensemble

sklearn.feature_selection

sklearn.impute

sklearn.linear_model

sklearn.manifold

  • Fix Fixed an unnecessary error when fitting manifold.Isomap with a precomputed dense distance matrix where the neighbors graph has multiple disconnected components. #21915 by Tom Dupre la Tour.

sklearn.metrics

  • Fix All sklearn.metrics.DistanceMetric subclasses now correctly support read-only buffer attributes. This fixes a regression introduced in 1.0.0 with respect to 0.24.2. #21694 by Julien Jerphanion.

  • Fix All sklearn.metrics.MinkowskiDistance now accepts a weight parameter that makes it possible to write code that behaves consistently both with scipy 1.8 and earlier versions. In turns this means that all neighbors-based estimators (except those that use algorithm="kd_tree") now accept a weight parameter with metric="minknowski" to yield results that are always consistent with scipy.spatial.distance.cdist. #21741 by Olivier Grisel.

sklearn.multiclass

sklearn.neighbors

sklearn.preprocessing

sklearn.tree

  • Fix Prevents tree.plot_tree from drawing out of the boundary of the figure. #21917 by Thomas Fan.

  • Fix Support loading pickles of decision tree models when the pickle has been generated on a platform with a different bitness. A typical example is to train and pickle the model on 64 bit machine and load the model on a 32 bit machine for prediction. #21552 by Loïc Estève.

sklearn.utils

Version 1.0.1

October 2021

Changelog

Fixed models

sklearn.calibration

sklearn.cluster

sklearn.ensemble

sklearn.gaussian_process

sklearn.feature_extraction

  • Efficiency Fixed an efficiency regression introduced in version 1.0.0 in the transform method of feature_extraction.text.CountVectorizer which no longer checks for uppercase characters in the provided vocabulary. #21251 by Jérémie du Boisberranger.

  • Fix Fixed a bug in feature_extraction.CountVectorizer and feature_extraction.TfidfVectorizer by raising an error when ‘min_idf’ or ‘max_idf’ are floating-point numbers greater than 1. #20752 by Alek Lefebvre.

sklearn.linear_model

sklearn.neighbors

sklearn.pipeline

sklearn.svm

sklearn.utils

Miscellaneous

  • Fix Fitting an estimator on a dataset that has no feature names, that was previously fitted on a dataset with feature names no longer keeps the old feature names stored in the feature_names_in_ attribute. #21389 by Jérémie du Boisberranger.

Version 1.0.0

September 2021

For a short description of the main highlights of the release, please refer to Release Highlights for scikit-learn 1.0.

Legend for changelogs

  • Major Feature : something big that you couldn’t do before.

  • Feature : something that you couldn’t do before.

  • Efficiency : an existing feature now may not require as much computation or memory.

  • Enhancement : a miscellaneous minor improvement.

  • Fix : something that previously didn’t work as documentated – or according to reasonable expectations – should now work.

  • API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.

Minimal dependencies

Version 1.0.0 of scikit-learn requires python 3.7+, numpy 1.14.6+ and scipy 1.1.0+. Optional minimal dependency is matplotlib 2.2.2+.

Enforcing keyword-only arguments

In an effort to promote clear and non-ambiguous use of the library, most constructor and function parameters must now be passed as keyword arguments (i.e. using the param=value syntax) instead of positional. If a keyword-only parameter is used as positional, a TypeError is now raised. #15005 #20002 by Joel Nothman, Adrin Jalali, Thomas Fan, Nicolas Hug, and Tom Dupre la Tour. See SLEP009 for more details.

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

Details are listed in the changelog below.

(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)

Changelog

  • API Change The option for using the squared error via loss and criterion parameters was made more consistent. The preferred way is by setting the value to "squared_error". Old option names are still valid, produce the same models, but are deprecated and will be removed in version 1.2. #19310 by Christian Lorentzen.

  • API Change The option for using the absolute error via loss and criterion parameters was made more consistent. The preferred way is by setting the value to "absolute_error". Old option names are still valid, produce the same models, but are deprecated and will be removed in version 1.2. #19733 by Christian Lorentzen.

  • API Change np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. #20165 by Thomas Fan.

  • API Change get_feature_names_out has been added to the transformer API to get the names of the output features. get_feature_names has in turn been deprecated. #18444 by Thomas Fan.

  • API Change All estimators store feature_names_in_ when fitted on pandas Dataframes. These feature names are compared to names seen in non-fit methods, e.g. transform and will raise a FutureWarning if they are not consistent. These FutureWarning s will become ValueError s in 1.2. #18010 by Thomas Fan.

sklearn.base

sklearn.calibration

sklearn.cluster

sklearn.compose

sklearn.covariance

  • Fix Adds arrays check to covariance.ledoit_wolf and covariance.ledoit_wolf_shrinkage. #20416 by Hugo Defois.

  • API Change Deprecates the following keys in cv_results_: 'mean_score', 'std_score', and 'split(k)_score' in favor of 'mean_test_score' 'std_test_score', and 'split(k)_test_score'. #20583 by Thomas Fan.

sklearn.datasets

  • Enhancement datasets.fetch_openml now supports categories with missing values when returning a pandas dataframe. #19365 by Thomas Fan and Amanda Dsouza and EL-ATEIF Sara.

  • Enhancement datasets.fetch_kddcup99 raises a better message when the cached file is invalid. #19669 Thomas Fan.

  • Enhancement Replace usages of __file__ related to resource file I/O with importlib.resources to avoid the assumption that these resource files (e.g. iris.csv) already exist on a filesystem, and by extension to enable compatibility with tools such as PyOxidizer. #20297 by Jack Liu.

  • Fix Shorten data file names in the openml tests to better support installing on Windows and its default 260 character limit on file names. #20209 by Thomas Fan.

  • Fix datasets.fetch_kddcup99 returns dataframes when return_X_y=True and as_frame=True. #19011 by Thomas Fan.

  • API Change Deprecates datasets.load_boston in 1.0 and it will be removed in 1.2. Alternative code snippets to load similar datasets are provided. Please report to the docstring of the function for details. #20729 by Guillaume Lemaitre.

sklearn.decomposition

sklearn.dummy

sklearn.ensemble

sklearn.feature_extraction

sklearn.feature_selection

sklearn.inspection

sklearn.kernel_approximation

sklearn.linear_model

sklearn.manifold

  • Enhancement Implement 'auto' heuristic for the learning_rate in manifold.TSNE. It will become default in 1.2. The default initialization will change to pca in 1.2. PCA initialization will be scaled to have standard deviation 1e-4 in 1.2. #19491 by Dmitry Kobak.

  • Fix Change numerical precision to prevent underflow issues during affinity matrix computation for manifold.TSNE. #19472 by Dmitry Kobak.

  • Fix manifold.Isomap now uses scipy.sparse.csgraph.shortest_path to compute the graph shortest path. It also connects disconnected components of the neighbors graph along some minimum distance pairs, instead of changing every infinite distances to zero. #20531 by Roman Yurchak and Tom Dupre la Tour.

  • Fix Decrease the numerical default tolerance in the lobpcg call in manifold.spectral_embedding to prevent numerical instability. #21194 by Andrew Knyazev.

sklearn.metrics

sklearn.mixture

sklearn.model_selection

sklearn.naive_bayes

sklearn.neighbors

sklearn.neural_network

sklearn.pipeline

sklearn.preprocessing

sklearn.svm

sklearn.tree

sklearn.utils

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 0.24, including:

Abdulelah S. Al Mesfer, Abhinav Gupta, Adam J. Stewart, Adam Li, Adam Midvidy, Adrian Garcia Badaracco, Adrian Sadłocha, Adrin Jalali, Agamemnon Krasoulis, Alberto Rubiales, Albert Thomas, Albert Villanova del Moral, Alek Lefebvre, Alessia Marcolini, Alexandr Fonari, Alihan Zihna, Aline Ribeiro de Almeida, Amanda, Amanda Dsouza, Amol Deshmukh, Ana Pessoa, Anavelyz, Andreas Mueller, Andrew Delong, Ashish, Ashvith Shetty, Atsushi Nukariya, Aurélien Geron, Avi Gupta, Ayush Singh, baam, BaptBillard, Benjamin Pedigo, Bertrand Thirion, Bharat Raghunathan, bmalezieux, Brian Rice, Brian Sun, Bruno Charron, Bryan Chen, bumblebee, caherrera-meli, Carsten Allefeld, CeeThinwa, Chiara Marmo, chrissobel, Christian Lorentzen, Christopher Yeh, Chuliang Xiao, Clément Fauchereau, cliffordEmmanuel, Conner Shen, Connor Tann, David Dale, David Katz, David Poznik, Dimitri Papadopoulos Orfanos, Divyanshu Deoli, dmallia17, Dmitry Kobak, DS_anas, Eduardo Jardim, EdwinWenink, EL-ATEIF Sara, Eleni Markou, EricEllwanger, Eric Fiegel, Erich Schubert, Ezri-Mudde, Fatos Morina, Felipe Rodrigues, Felix Hafner, Fenil Suchak, flyingdutchman23, Flynn, Fortune Uwha, Francois Berenger, Frankie Robertson, Frans Larsson, Frederick Robinson, frellwan, Gabriel S Vicente, Gael Varoquaux, genvalen, Geoffrey Thomas, geroldcsendes, Gleb Levitskiy, Glen, Glòria Macià Muñoz, gregorystrubel, groceryheist, Guillaume Lemaitre, guiweber, Haidar Almubarak, Hans Moritz Günther, Haoyin Xu, Harris Mirza, Harry Wei, Harutaka Kawamura, Hassan Alsawadi, Helder Geovane Gomes de Lima, Hugo DEFOIS, Igor Ilic, Ikko Ashimine, Isaack Mungui, Ishaan Bhat, Ishan Mishra, Iván Pulido, iwhalvic, J Alexander, Jack Liu, James Alan Preiss, James Budarz, James Lamb, Jannik, Jeff Zhao, Jennifer Maldonado, Jérémie du Boisberranger, Jesse Lima, Jianzhu Guo, jnboehm, Joel Nothman, JohanWork, John Paton, Jonathan Schneider, Jon Crall, Jon Haitz Legarreta Gorroño, Joris Van den Bossche, José Manuel Nápoles Duarte, Juan Carlos Alfaro Jiménez, Juan Martin Loyola, Julien Jerphanion, Julio Batista Silva, julyrashchenko, JVM, Kadatatlu Kishore, Karen Palacio, Kei Ishikawa, kmatt10, kobaski, Kot271828, Kunj, KurumeYuta, kxytim, lacrosse91, LalliAcqua, Laveen Bagai, Leonardo Rocco, Leonardo Uieda, Leopoldo Corona, Loic Esteve, LSturtew, Luca Bittarello, Luccas Quadros, Lucy Jiménez, Lucy Liu, ly648499246, Mabu Manaileng, Manimaran, makoeppel, Marco Gorelli, Maren Westermann, Mariangela, Maria Telenczuk, marielaraj, Martin Hirzel, Mateo Noreña, Mathieu Blondel, Mathis Batoul, mathurinm, Matthew Calcote, Maxime Prieur, Maxwell, Mehdi Hamoumi, Mehmet Ali Özer, Miao Cai, Michal Karbownik, michalkrawczyk, Mitzi, mlondschien, Mohamed Haseeb, Mohamed Khoualed, Muhammad Jarir Kanji, murata-yu, Nadim Kawwa, Nanshan Li, naozin555, Nate Parsons, Neal Fultz, Nic Annau, Nicolas Hug, Nicolas Miller, Nico Stefani, Nigel Bosch, Nikita Titov, Nodar Okroshiashvili, Norbert Preining, novaya, Ogbonna Chibuike Stephen, OGordon100, Oliver Pfaffel, Olivier Grisel, Oras Phongpanangam, Pablo Duque, Pablo Ibieta-Jimenez, Patric Lacouth, Paulo S. Costa, Paweł Olszewski, Peter Dye, PierreAttard, Pierre-Yves Le Borgne, PranayAnchuri, Prince Canuma, putschblos, qdeffense, RamyaNP, ranjanikrishnan, Ray Bell, Rene Jean Corneille, Reshama Shaikh, ricardojnf, RichardScottOZ, Rodion Martynov, Rohan Paul, Roman Lutz, Roman Yurchak, Samuel Brice, Sandy Khosasi, Sean Benhur J, Sebastian Flores, Sebastian Pölsterl, Shao Yang Hong, shinehide, shinnar, shivamgargsya, Shooter23, Shuhei Kayawari, Shyam Desai, simonamaggio, Sina Tootoonian, solosilence, Steven Kolawole, Steve Stagg, Surya Prakash, swpease, Sylvain Marié, Takeshi Oura, Terence Honles, TFiFiE, Thomas A Caswell, Thomas J. Fan, Tim Gates, TimotheeMathieu, Timothy Wolodzko, Tim Vink, t-jakubek, t-kusanagi, tliu68, Tobias Uhmann, tom1092, Tomás Moreyra, Tomás Ronald Hughes, Tom Dupré la Tour, Tommaso Di Noto, Tomohiro Endo, TONY GEORGE, Toshihiro NAKAE, tsuga, Uttam kumar, vadim-ushtanit, Vangelis Gkiastas, Venkatachalam N, Vilém Zouhar, Vinicius Rios Fuck, Vlasovets, waijean, Whidou, xavier dupré, xiaoyuchai, Yasmeen Alsaedy, yoch, Yosuke KOBAYASHI, Yu Feng, YusukeNagasaka, yzhenman, Zero, ZeyuSun, ZhaoweiWang, Zito, Zito Relova