Legend for changelogs
Major Feature : something big that you couldn’t do before.
Feature : something that you couldn’t do before.
Efficiency : an existing feature now may not require as much computation or memory.
Enhancement : a miscellaneous minor improvement.
Fix : something that previously didn’t work as documented – or according to reasonable expectations – should now work.
API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.
Put the changes in their relevant module.
The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.
Details are listed in the changelog below.
(While we are trying to better inform users by providing this information, we cannot assure that this list is complete.)
cluster.AgglomerativeClustering has a new parameter compute_distances. When set to True, distances between clusters are computed and stored in the distances_ attribute even when the parameter distance_threshold is not used. This new parameter is useful to produce dendrogram visualizations, but introduces a computational and memory overhead. #17984 by Michael Riedmann, Emilie Delattre, and Francesco Casalegno.
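A minimal sketch of the compute_distances parameter described above. The random data, the seed, and the choice of n_clusters=3 are illustrative assumptions, not part of the changelog entry.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Illustrative toy data (assumption): 20 points in 3 dimensions.
rng = np.random.RandomState(0)
X = rng.rand(20, 3)

# distance_threshold is left unset; compute_distances=True still
# asks the estimator to store the merge distances.
model = AgglomerativeClustering(n_clusters=3, compute_distances=True).fit(X)

# One distance per merge recorded in children_.
print(model.children_.shape)
print(model.distances_.shape)
```

The children_ and distances_ arrays together are what a linkage-matrix builder for a dendrogram plot would typically consume.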
cluster.spectral_clustering has a new keyword argument verbose. When set to True, additional messages will be displayed which can aid with debugging. #18052 by Sean O. Stalley.
datasets.fetch_covtype now supports the optional argument as_frame; when it is set to True, the returned Bunch object’s frame members are pandas DataFrames, and the target member is a pandas Series. #17491 by Alex Liang.
decomposition.SparseCoder such that it follows scikit-learn API and supports cloning. The attribute components_ is deprecated in 0.24 and will be removed in 0.26. This attribute was redundant with the dictionary attribute and constructor parameter. #17679 by Xavier Dupré.
decomposition.NMF now supports the optional parameter regularization, which can take the values both, in accordance with decomposition.NMF.non_negative_factorization. #17414 by Bharat Raghunathan.
Feature A new parameter importance_getter was added to feature_selection.SelectFromModel, allowing the user to specify an attribute name/path or a callable for extracting feature importance from the estimator. #15361 by Venkatachalam N.
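A short sketch of importance_getter given as an attribute name. The Ridge estimator, the synthetic regression data, and the choice to keep the two strongest features are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Ridge

# Illustrative data (assumption): 6 features, 2 of them informative.
X, y = make_regression(
    n_samples=100, n_features=6, n_informative=2, random_state=0
)

# importance_getter="coef_" names the fitted attribute to read
# importances from; threshold=-np.inf with max_features=2 keeps
# the two features with the largest importances.
selector = SelectFromModel(
    Ridge(),
    importance_getter="coef_",
    max_features=2,
    threshold=-np.inf,
).fit(X, y)

X_selected = selector.transform(X)
print(X_selected.shape)
```

A callable taking the fitted estimator and returning an array of importances can be passed in the same position.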
Efficiency Reduce memory footprint in neighbors.KDTree for counting nearest neighbors. #17878 by Noel Rogers.
Fix Replace the default values in np.inf, respectively instead of None. However, the behaviour of the class does not change since None was defaulting to these values already. #16493 by Darshan N.
Feature Expose fitted attributes y_thresholds_ that hold the de-duplicated interpolation thresholds of an isotonic.IsotonicRegression instance for model inspection purpose. #16289 by Masashi Kishimoto and Olivier Grisel.
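A minimal sketch of inspecting the fitted thresholds. The toy data is an assumption, and the example also reads X_thresholds_, the companion attribute in the scikit-learn API that the entry above does not name.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Illustrative data (assumption): one pair of points violates monotonicity.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 3.0, 2.0, 4.0, 5.0])

iso = IsotonicRegression().fit(x, y)

# The thresholds are the knots used for interpolation at predict time.
for x_t, y_t in zip(iso.X_thresholds_, iso.y_thresholds_):
    print(x_t, y_t)
```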
linear_model.RidgeCV now supports finding an optimal regularization value alpha for each target separately by setting alpha_per_target=True. This is only supported when using the default efficient leave-one-out cross-validation scheme cv=None. #6624 by Marijn van Vliet.
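A sketch of per-target regularization under the conditions stated above (cv left at None). The synthetic three-target dataset and the candidate alphas are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Illustrative multi-output data (assumption): 3 regression targets.
X, y = make_regression(
    n_samples=50, n_features=5, n_targets=3, noise=1.0, random_state=0
)

candidate_alphas = [0.1, 1.0, 10.0]

# cv is left at its default (None), so the efficient leave-one-out
# scheme is used, as alpha_per_target=True requires.
model = RidgeCV(alphas=candidate_alphas, alpha_per_target=True).fit(X, y)

# alpha_ holds one selected value per target instead of a single scalar.
print(model.alpha_)
```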
manifold.TSNE, which provides backward compatibility during deprecation of legacy squaring behavior. Distances will be squared by default in 0.26, and this parameter will be removed in 0.28. #17662 by Joshua Newton.
metrics.mean_absolute_percentage_error metric and the associated scorer for regression problems. #10708 fixed with the PR #15007 by Ashutosh Hathidara. The scorer and some practical test cases were taken from PR #10711 by Mohamed Ali Jamaoui.
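A small worked example of the metric, using made-up values. Note that the function returns a fraction rather than a value scaled to 100.

```python
from sklearn.metrics import mean_absolute_percentage_error

# Illustrative values (assumption).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

# Relative errors are 0.10, 0.10 and 0.00, so the mean is 0.2 / 3.
mape = mean_absolute_percentage_error(y_true, y_pred)
print(mape)
```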
metrics.plot_precision_recall_curve in order to specify the positive class to be used when computing the precision and recall statistics. #17569 by Guillaume Lemaitre.
model_selection.TimeSeriesSplit has two new keyword arguments: test_size allows the out-of-sample time series length to be fixed for all folds; gap removes a fixed number of samples between the train and test set on each fold. #13204 by Kyle Kosic.
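The effect of the two arguments can be sketched on a toy series of ten samples. The series length, n_splits=3, test_size=2 and gap=1 are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Illustrative series (assumption): 10 time-ordered samples.
X = np.arange(10).reshape(-1, 1)

# Every test fold has exactly 2 samples, and 1 sample is dropped
# between the end of each training set and the start of its test set.
splitter = TimeSeriesSplit(n_splits=3, test_size=2, gap=1)
splits = [
    (train.tolist(), test.tolist()) for train, test in splitter.split(X)
]

for train, test in splits:
    print("train:", train, "test:", test)
# train: [0, 1, 2] test: [4, 5]
# train: [0, 1, 2, 3, 4] test: [6, 7]
# train: [0, 1, 2, 3, 4, 5, 6] test: [8, 9]
```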
Efficiency Speed up neighbors.DistanceMetric by avoiding unexpected GIL acquiring in Cython when setting metrics.pairwise_distances and by validating data out of loops. #17038 by Wenbo Zhao.
neighbors.NeighborsBase benefits from an improved algorithm = 'auto' heuristic. In addition to the previous set of rules, now, when the number of features exceeds 15, brute is selected, assuming the data intrinsic dimensionality is too high for tree-based methods. #17148 by Geoffrey Bolmier.
Feature Add a new handle_unknown parameter with a use_encoded_value option, along with a new preprocessing.OrdinalEncoder to allow unknown categories during transform and set the encoded value of the unknown categories. #17406 by Felix Wick.
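A minimal sketch of encoding an unseen category with preprocessing.OrdinalEncoder. The category names and the sentinel -1 are illustrative assumptions; unknown_value is the scikit-learn constructor parameter that supplies the code used for unknown categories.

```python
from sklearn.preprocessing import OrdinalEncoder

# unknown_value is the code assigned to categories not seen during fit.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
encoder.fit([["cat"], ["dog"]])

# "bird" was not seen during fit, so it maps to -1 instead of raising.
encoded = encoder.transform([["dog"], ["bird"]])
print(encoded.tolist())  # [[1.0], [-1.0]]
```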
Code and Documentation Contributors
Thanks to everyone who has contributed to the maintenance and improvement of the project since version 0.20, including: