Version 1.9#
Legend for changelogs
Major Feature something big that you couldn’t do before.
Feature something that you couldn’t do before.
Efficiency an existing feature now may not require as much computation or memory.
Enhancement a miscellaneous minor improvement.
Fix something that previously didn’t work as documented – or according to reasonable expectations – should now work.
API Change you will need to change your code to have the same effect in the future; or a feature will be removed in the future.
Version 1.9.dev0#
February 2026
Changes impacting many modules#
Enhancement pipeline.Pipeline, pipeline.FeatureUnion and compose.ColumnTransformer now raise a clearer error message when an estimator class is passed instead of an instance. By Anne Beyer #32888
Fix Raise ValueError when sample_weight contains only zero values to prevent meaningless input data during fitting. This change applies to all estimators that support the parameter sample_weight. This change also affects metrics that validate sample weights. By Lucy Liu and John Hendricks. #32212
Fix Some parameter descriptions in the HTML representation of estimators were not properly escaped, which could lead to malformed HTML if the description contains characters like < or >. By Olivier Grisel. #32942
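The all-zero sample_weight check described above can be pictured with a minimal NumPy sketch. The helper name check_sample_weight_not_all_zero and its error message are hypothetical, chosen for illustration only; this is not scikit-learn's internal validation code.

```python
import numpy as np


def check_sample_weight_not_all_zero(sample_weight):
    """Hypothetical helper: reject weights that carry no information."""
    sample_weight = np.asarray(sample_weight, dtype=float)
    # If every weight is zero, no sample contributes to the fit.
    if sample_weight.size > 0 and not np.any(sample_weight != 0):
        raise ValueError("sample_weight contains only zero values.")
    return sample_weight


# Mixed weights pass through unchanged.
weights_ok = check_sample_weight_not_all_zero([0.0, 2.0, 1.0])

# All-zero weights are rejected with a ValueError.
try:
    check_sample_weight_not_all_zero([0.0, 0.0, 0.0])
    all_zero_rejected = False
except ValueError:
    all_zero_rejected = True
```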
Support for Array API#
Additional estimators and functions have been updated to include support for all Array API compliant inputs.
See Array API support (experimental) for more details.
Feature sklearn.metrics.d2_absolute_error_score and sklearn.metrics.d2_pinball_score now support array API compatible inputs. By Virgil Chan. #31671
Feature sklearn.metrics.ranking.average_precision_score now supports Array API compliant inputs. By Stefanie Senger. #32909
Feature sklearn.metrics.pairwise.paired_manhattan_distances now supports array API compatible inputs. By Bharat Raghunathan. #32979
Feature sklearn.metrics.pairwise.pairwise_distances_argmin now supports array API compatible inputs. By Bharat Raghunathan. #32985
Enhancement kernel_approximation.Nystroem now supports array API compatible inputs. By Emily Chen #29661
Fix Fixed a bug that would cause Cython-based estimators to fail when fit on NumPy inputs when setting sklearn.set_config(array_api_dispatch=True). By Olivier Grisel. #32846
Fix Fixes how pos_label is inferred when pos_label is set to None, in sklearn.metrics.brier_score_loss and sklearn.metrics.d2_brier_score. By Lucy Liu. #32923
Metadata routing#
Refer to the Metadata Routing User Guide for more details.
Enhancement TargetEncoder now routes groups to the CV splitter internally used for cross fitting in its fit_transform. By Samruddhi Baviskar and Stefanie Senger. #33089
sklearn.cluster#
Fix cluster.MiniBatchKMeans now correctly handles sample weights during fitting. When sample weights are not None, mini-batch indices are created by sub-sampling with replacement using the normalized sample weights as probabilities. By Shruti Nath, Olivier Grisel, and Jeremie du Boisberranger. #30751
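The weighted sub-sampling described above can be sketched with NumPy alone. This is only an illustration of drawing indices with replacement using normalized weights as probabilities, not the estimator's actual code; the variable names and sizes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, batch_size = 6, 1000
sample_weight = np.array([0.0, 1.0, 1.0, 2.0, 0.0, 4.0])

# Normalize the weights so they form a probability distribution.
probabilities = sample_weight / sample_weight.sum()

# Draw mini-batch indices with replacement, proportionally to the weights.
batch_indices = rng.choice(
    n_samples, size=batch_size, replace=True, p=probabilities
)

# Count how often each sample was selected.
counts = np.bincount(batch_indices, minlength=n_samples)
```

In this sketch the zero-weight samples (indices 0 and 4) can never be selected, and heavier samples appear more often in the mini-batch.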
sklearn.compose#
Fix The dotted line for compose.ColumnTransformer in its HTML display now includes only its elements. The behaviour when a remainder is used has also been corrected. By Dea María Léon #32713
sklearn.datasets#
Efficiency Re-enabled compressed caching for datasets.fetch_kddcup99, reducing on-disk cache size without changing the public API. By Unique Shrestha. #33118
sklearn.ensemble#
Fix ensemble.RandomForestClassifier, ensemble.RandomForestRegressor, ensemble.ExtraTreesClassifier and ensemble.ExtraTreesRegressor now use sample_weight to draw the samples instead of forwarding them multiplied by a uniformly sampled mask to the underlying estimators. Furthermore, when max_samples is a float, it is now interpreted as a fraction of sample_weight.sum() instead of X.shape[0]. As sampling is done with replacement, a float max_samples greater than 1.0 is now allowed, as well as an integer max_samples greater than X.shape[0]. The default max_samples=None draws X.shape[0] samples, irrespective of sample_weight. By Antoine Baker. #31529
Fix Both ensemble.GradientBoostingRegressor and ensemble.GradientBoostingClassifier with the default "friedman_mse" criterion were computing impurity values with an incorrect scaling, leading to unexpected trees in some cases. The implementation now uses "squared_error", which is exactly equivalent to "friedman_mse" up to floating-point error discrepancies but computes correct impurity values. By Arthur Lacote. #32708
API Change The criterion parameter is now deprecated for classes ensemble.GradientBoostingRegressor and ensemble.GradientBoostingClassifier, as both options ("friedman_mse" and "squared_error") were producing the same results, up to floating-point rounding discrepancies and a bug in "friedman_mse". By Arthur Lacote #32708
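The max_samples rules stated for the forest estimators can be summarized in a small sketch. The helper n_bootstrap_draws is hypothetical, and its rounding (truncating to an integer, with a minimum of one draw) is an assumption that the entry above does not specify; the actual implementation may differ.

```python
import numpy as np


def n_bootstrap_draws(n_rows, max_samples, sample_weight):
    """Hypothetical helper mirroring the rules stated in the changelog entry."""
    if max_samples is None:
        # Default: draw X.shape[0] samples, irrespective of sample_weight.
        return n_rows
    if isinstance(max_samples, float):
        # A float is a fraction of sample_weight.sum(); values > 1.0 are allowed.
        # Assumption: truncate to an integer and draw at least one sample.
        return max(1, int(max_samples * np.sum(sample_weight)))
    # An integer is used as-is and may exceed n_rows (draws are with replacement).
    return int(max_samples)


rng = np.random.default_rng(0)
sample_weight = np.array([0.0, 1.0, 3.0, 0.0])
n_rows = sample_weight.shape[0]

# 1.5 * sample_weight.sum() = 1.5 * 4.0, so six draws are requested.
n_draws = n_bootstrap_draws(n_rows, 1.5, sample_weight)

# Draw with replacement, using the normalized weights as probabilities.
indices = rng.choice(
    n_rows, size=n_draws, replace=True, p=sample_weight / sample_weight.sum()
)
```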
sklearn.inspection#
Fix In inspection.DecisionBoundaryDisplay, multiclass_colors is now also used for multiclass plotting when response_method="predict". By Anne Beyer. #33015
Fix In inspection.DecisionBoundaryDisplay, n_classes is now inferred more robustly from the estimator. If it fails for custom estimators, a comprehensive error message is shown. By Anne Beyer. #33202
sklearn.linear_model#
Enhancement linear_model.ElasticNet, linear_model.ElasticNetCV and linear_model.enet_path can now fit Ridge regression, i.e. with l1_ratio=0. Before this PR, the stopping criterion was a formulation of the dual gap that breaks down for l1_ratio=0. Now, an alternative dual gap formulation is used for this setting. This reduces the noise of raised warnings. By Christian Lorentzen. #32845
Enhancement Efficiency linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lasso, linear_model.LassoCV, linear_model.MultiTaskElasticNet, linear_model.MultiTaskElasticNetCV, linear_model.MultiTaskLasso, linear_model.MultiTaskLassoCV as well as linear_model.lasso_path and linear_model.enet_path are now faster when fit with strong L1 penalty and many features. During gap safe screening of features, the update of the residual is now only performed if the coefficient is not zero. By Christian Lorentzen. #33161
Fix linear_model.LassoCV and linear_model.ElasticNetCV now take the positive parameter into account to compute the maximum alpha parameter, at which all coefficients are zero. This impacts the search grid for the internally tuned alpha hyper-parameter stored in the attribute alphas_. By Junteng Li #32768
Fix Correct the formulation of alpha within linear_model.SGDOneClassSVM. The corrected value is alpha = nu instead of alpha = nu / 2. Note: This might result in changed values for the fitted attributes like coef_ and offset_ as well as the predictions made using this class. By Omar Salman. #32778
Fix linear_model.enet_path now correctly handles the precompute parameter when check_input=False. Previously, the value of precompute was not properly treated, which could lead to a ValueError. This also affects linear_model.ElasticNetCV, linear_model.LassoCV, linear_model.MultiTaskElasticNetCV and linear_model.MultiTaskLassoCV. By Albert Dorador #33014
Fix Fixed a bug in linear_model.SGDClassifier for multiclass settings where large negative values of decision_function could lead to NaN values. In this case, this fix assigns equal probability for each class. By Christian Lorentzen. #33168
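To see why the positive parameter matters for the maximum alpha in the LassoCV and ElasticNetCV entry above, here is a rough sketch for the pure Lasso case (l1_ratio=1) with the objective (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * ||w||_1. The helper lasso_alpha_max is hypothetical and assumes already-centered X and y; it is not scikit-learn's grid-building code.

```python
import numpy as np


def lasso_alpha_max(X, y, positive=False):
    """Hypothetical helper: smallest alpha at which all Lasso coefficients are zero.

    Assumes X and y are already centered (no intercept handling here).
    """
    n_samples = X.shape[0]
    correlations = X.T @ y
    if positive:
        # Only features positively correlated with y could receive a nonzero
        # (non-negative) coefficient, so negative correlations are ignored.
        # This value is zero or negative when no feature correlates positively
        # with y; a real grid builder has to handle that case separately.
        return np.max(correlations) / n_samples
    return np.max(np.abs(correlations)) / n_samples


X = np.array([[1.0, -1.0], [-1.0, 1.0], [2.0, -3.0], [-2.0, 3.0]])
y = np.array([1.0, -1.0, 4.0, -4.0])

alpha_max_default = lasso_alpha_max(X, y)
alpha_max_positive = lasso_alpha_max(X, y, positive=True)
```

In this toy data the second feature has the strongest correlation with y, but it is negative, so under a positivity constraint the grid can start at a smaller alpha.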
sklearn.metrics#
Enhancement cohen_kappa_score now has a replace_undefined_by param that can be set to define the function's return value when the metric is undefined (division by zero). By Stefanie Senger #31172
Fix metrics.d2_pinball_score and metrics.d2_absolute_error_score now always use the "averaged_inverted_cdf" quantile method, both with and without sample weights. Previously, the "linear" quantile method was used only for the unweighted case, leading to surprising discrepancies when comparing the results with unit weights. Note that all quantile interpolation methods are asymptotically equivalent in the large sample limit, but this fix can cause score value changes on small evaluation sets (without weights). By Virgil Chan. #31671
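The difference between the two quantile methods named above can be seen directly with np.quantile, which accepts both names through its method argument (NumPy 1.22 or later). This small example only illustrates the quantile definitions on a toy array; it does not call the scikit-learn metrics.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# "linear" interpolates between neighbouring order statistics.
q_linear = np.quantile(values, 0.3, method="linear")

# "averaged_inverted_cdf" returns a data point, or the average of two
# adjacent data points when the quantile falls exactly on a step of the
# empirical CDF.
q_averaged = np.quantile(values, 0.3, method="averaged_inverted_cdf")

# The two definitions disagree on this small sample.
methods_differ = not np.isclose(q_linear, q_averaged)
```

On a four-point sample the two estimates differ noticeably, which is the kind of small-sample discrepancy the entry refers to.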
sklearn.pipeline#
Fix Fixed pipeline.FeatureUnion to properly handle column renaming when using Polars output, preventing duplicate column names. By Levente Csibi. #32853
sklearn.svm#
Fix Raise more informative error when fitting NuSVR with all zero sample weights. By Lucy Liu and John Hendricks. #32212
sklearn.tree#
Fix Fixed feature-wise NaN detection in trees. Features could be seen as NaN-free for some edge-case patterns, which led to not considering splits with NaNs assigned to the left node for those features. This affects:
- tree.DecisionTreeRegressor
- tree.ExtraTreeRegressor
- ensemble.RandomForestRegressor
- ensemble.ExtraTreesRegressor
By Arthur Lacote #32193
API Change criterion="friedman_mse" is now deprecated. This criterion was intended for gradient boosting but was incorrectly implemented in scikit-learn's trees and was actually behaving identically to criterion="squared_error". Use criterion="squared_error" instead. This affects:
- tree.DecisionTreeRegressor
- tree.ExtraTreeRegressor
- ensemble.RandomForestRegressor
- ensemble.ExtraTreesRegressor
By Arthur Lacote #32708
sklearn.utils#
Enhancement sklearn.utils._tags.get_tags now provides a clearer error message when a class is passed instead of an estimator instance. By Achyuthan S and Anne Beyer. #32565
Enhancement sklearn.utils._response._get_response_values now provides a clearer error message when the estimator does not implement the given response_method. By Quentin Barthélemy. #33126
Fix The parameter table in the HTML representation of all scikit-learn estimators inheriting from base.BaseEstimator displays each parameter documentation as a tooltip. The last tooltip of a parameter in the last table of any HTML representation was partially hidden. This issue has been fixed. By Dea María Léon #32887
Fix Fixed _weighted_percentile with average=True so zero-weight samples just before the end of the array are handled correctly. This can change results when using sample_weight with preprocessing.KBinsDiscretizer (strategy="quantile", quantile_method="averaged_inverted_cdf") and in metrics.median_absolute_error, metrics.d2_pinball_score, and metrics.d2_absolute_error_score. By Arthur Lacote. #33127
Code and documentation contributors
Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.8, including:
TODO: update at the time of the release.