Version 1.8
Legend for changelogs
Major Feature: something big that you couldn't do before.
Feature: something that you couldn't do before.
Efficiency: an existing feature now may not require as much computation or memory.
Enhancement: a miscellaneous minor improvement.
Fix: something that previously didn't work as documented, or according to reasonable expectations, should now work.
API Change: you will need to change your code to have the same effect in the future, or a feature will be removed in the future.
Version 1.8.dev0
October 2025
Changes impacting many modules
Support for Array API
Additional estimators and functions have been updated to include support for all Array API compliant inputs.
See Array API support (experimental) for more details.
Feature sklearn.preprocessing.StandardScaler now supports Array API compliant inputs. By Alexander Fabisch, Edoardo Abati, Olivier Grisel and Charles Hill. #27113
Feature linear_model.RidgeCV, linear_model.RidgeClassifier and linear_model.RidgeClassifierCV now support array API compatible inputs with solver="svd". By Jérôme Dockès. #27961
Feature metrics.pairwise.pairwise_kernels for any kernel except "laplacian" and metrics.pairwise_distances for the metrics "cosine", "euclidean" and "l2" now support array API inputs. By Emily Chen and Lucy Liu. #29822
Feature sklearn.metrics.confusion_matrix now supports Array API compatible inputs. By Stefanie Senger. #30562
Feature sklearn.mixture.GaussianMixture with init_params="random" or init_params="random_from_data" and warm_start=False now supports Array API compatible inputs. By Stefanie Senger and Loïc Estève. #30777
Feature sklearn.metrics.roc_curve now supports Array API compatible inputs. By Thomas Li. #30878
Feature preprocessing.PolynomialFeatures now supports array API compatible inputs. By Omar Salman. #31580
Feature sklearn.metrics.precision_recall_curve now supports array API compatible inputs. By Lucy Liu. #32249
Feature sklearn.model_selection.cross_val_predict now supports array API compatible inputs. By Omar Salman. #32270
Feature sklearn.metrics.brier_score_loss, sklearn.metrics.log_loss, sklearn.metrics.d2_brier_score and sklearn.metrics.d2_log_loss_score now support array API compatible inputs. By Omar Salman. #32422
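As a minimal sketch of what array API support means in practice: with dispatch enabled, the estimators above accept inputs from any array API compliant library (e.g. PyTorch or CuPy) and return outputs of the same array type. Enabling dispatch requires the optional array-api-compat package, so the runnable part below sticks to the plain NumPy path, which behaves identically either way.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# With array API dispatch enabled, non-NumPy inputs pass through natively:
#
#   import sklearn
#   sklearn.set_config(array_api_dispatch=True)  # needs array-api-compat
#
# The NumPy path is unchanged:
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_scaled = StandardScaler().fit_transform(X)

print(X_scaled.mean(axis=0))  # columns centered to zero mean
print(X_scaled.std(axis=0))   # and scaled to unit variance
```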
Metadata routing
Refer to the Metadata Routing User Guide for more details.
Fix Fixed an issue where passing sample_weight to a Pipeline inside a GridSearchCV would raise an error with metadata routing enabled. By Adrin Jalali. #31898
sklearn.base
Feature Refactored __dir__ in BaseEstimator to recognize the condition check in available_if. By John Hendricks and Miguel Parece. #31928
Fix Fixed the handling of pandas missing values in the HTML display of all estimators. By Dea María Léon. #32341
sklearn.calibration
Feature Added the temperature scaling method to calibration.CalibratedClassifierCV. By Virgil Chan and Christian Lorentzen. #31068
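For context, a sketch of how CalibratedClassifierCV is used; the runnable part uses the long-standing sigmoid (Platt) method, since the exact `method` value for the new temperature scaling option is an assumption here (presumably "temperature"; check the 1.8 API reference).

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(200, 4)
y = (X[:, 0] + 0.5 * rng.randn(200) > 0).astype(int)

# method="sigmoid" is the long-standing Platt scaling; 1.8 adds
# temperature scaling as a further method option.
clf = CalibratedClassifierCV(LogisticRegression(), method="sigmoid", cv=3)
clf.fit(X, y)
proba = clf.predict_proba(X)
print(proba[:3])  # calibrated class probabilities
```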
sklearn.cluster
Efficiency cluster.kmeans_plusplus now uses np.cumsum directly, without extra numerical stability checks and without casting to np.float64. By Tiziano Zito. #31991
Fix The default value of the copy parameter in cluster.HDBSCAN will change from False to True in 1.10 to avoid data modification and maintain consistency with other estimators. By Sarthak Puri. #31973
sklearn.compose
Fix compose.ColumnTransformer now correctly fits on data provided as a polars.DataFrame when any transformer has a sparse output. By Phillipp Gnan. #32188
sklearn.covariance
Efficiency sklearn.covariance.GraphicalLasso, sklearn.covariance.GraphicalLassoCV and sklearn.covariance.graphical_lasso with mode="cd" profit from the fit time performance improvement of sklearn.linear_model.Lasso by means of gap safe screening rules. By Christian Lorentzen. #31987
Fix Fixed uncontrollable randomness in sklearn.covariance.GraphicalLasso, sklearn.covariance.GraphicalLassoCV and sklearn.covariance.graphical_lasso. For mode="cd", they now use cyclic coordinate descent; before, it was random coordinate descent with uncontrollable random number seeding. By Christian Lorentzen. #31987
sklearn.decomposition
Efficiency sklearn.decomposition.DictionaryLearning and sklearn.decomposition.MiniBatchDictionaryLearning with fit_algorithm="cd", sklearn.decomposition.SparseCoder with transform_algorithm="lasso_cd", sklearn.decomposition.MiniBatchSparsePCA, sklearn.decomposition.SparsePCA, sklearn.decomposition.dict_learning and sklearn.decomposition.dict_learning_online with method="cd", and sklearn.decomposition.sparse_encode with algorithm="lasso_cd" all profit from the fit time performance improvement of sklearn.linear_model.Lasso by means of gap safe screening rules. By Christian Lorentzen. #31987
Enhancement decomposition.SparseCoder now follows the transformer API of scikit-learn. In addition, the fit method now validates the input and parameters. By François Paugam. #32077
Fix Added input checks to the inverse_transform method of decomposition.PCA and decomposition.IncrementalPCA. By Ian Faust. #29310
sklearn.ensemble
Fix ensemble.BaggingClassifier, ensemble.BaggingRegressor and ensemble.IsolationForest now use sample_weight to draw the samples instead of forwarding them multiplied by a uniformly sampled mask to the underlying estimators. Furthermore, max_samples is now interpreted as a fraction of sample_weight.sum() instead of X.shape[0] when passed as a float. By Antoine Baker. #31414
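The call signature is unchanged; only the resampling semantics differ. A minimal sketch of passing sample_weight to a bagging ensemble (on 1.8, the weights influence which samples are drawn, and the float max_samples is relative to the weight total):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

# Upweight the second half of the samples.
w = np.ones(100)
w[50:] = 2.0

clf = BaggingClassifier(DecisionTreeClassifier(), max_samples=0.5,
                        random_state=0)
clf.fit(X, y, sample_weight=w)
print(clf.score(X, y))
```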
sklearn.feature_selection
Enhancement feature_selection.SelectFromModel no longer forces max_features to be less than or equal to the number of input features. By Thibault. #31939
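A short sketch of the parameter in question; max_features caps how many features may be kept, and as of this release values larger than X.shape[1] are also accepted (previously an error). The runnable example below uses a value within range so it works on any recent version:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=6, n_informative=3,
                           random_state=0)

# Keep at most 3 features, ranked by the fitted model's coefficients.
selector = SelectFromModel(LogisticRegression(), max_features=3)
X_sel = selector.fit_transform(X, y)
print(X_sel.shape)
```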
sklearn.gaussian_process
Efficiency Made GaussianProcessRegressor.predict faster when return_cov and return_std are both False. By Rafael Ayllón Gavilán. #31431
sklearn.linear_model
Efficiency linear_model.ElasticNet and linear_model.Lasso with precompute=False use less memory for dense X and are a bit faster. Previously, they used twice the memory of X even for Fortran-contiguous X. By Christian Lorentzen. #31665
Efficiency linear_model.ElasticNet and linear_model.Lasso avoid double input checking and are therefore a bit faster. By Christian Lorentzen. #31848
Efficiency linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lasso, linear_model.LassoCV, linear_model.MultiTaskElasticNet, linear_model.MultiTaskElasticNetCV, linear_model.MultiTaskLasso and linear_model.MultiTaskLassoCV are faster to fit by avoiding a BLAS level 1 (axpy) call in the innermost loop. The same holds for the functions linear_model.enet_path and linear_model.lasso_path. By Christian Lorentzen. #31956 and #31880
Efficiency linear_model.ElasticNetCV, linear_model.LassoCV, linear_model.MultiTaskElasticNetCV and linear_model.MultiTaskLassoCV avoid an additional copy of X with the default copy_X=True. By Christian Lorentzen. #31946
Efficiency linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lasso, linear_model.LassoCV, linear_model.MultiTaskElasticNetCV and linear_model.MultiTaskLassoCV, as well as linear_model.lasso_path and linear_model.enet_path, now implement gap safe screening rules in the coordinate descent solver for dense and sparse X. The fit time speedup is particularly pronounced (10-times is possible) when computing regularization paths, as the *CV variants of the above estimators do. There is now an additional check of the stopping criterion before entering the main loop of descent steps. As the stopping criterion requires the computation of the dual gap, the screening happens whenever the dual gap is computed. By Christian Lorentzen. #31882, #31986, #31987 and #32014
Enhancement linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lasso, linear_model.LassoCV, MultiTaskElasticNet, MultiTaskElasticNetCV, MultiTaskLasso and MultiTaskLassoCV, as well as linear_model.enet_path and linear_model.lasso_path, now use dual gap <= tol instead of dual gap < tol as the stopping criterion. The resulting coefficients might differ from previous versions of scikit-learn in rare cases. By Christian Lorentzen. #31906
Fix Fixed the convergence criteria for SGD models to avoid premature convergence when tol != None. This primarily impacts SGDOneClassSVM but also affects SGDClassifier and SGDRegressor. Before this fix, only the loss function without penalty was used as the convergence check; now the full objective with regularization is used. By Guillaume Lemaitre and kostayScr. #31856
Fix The allowed parameter range for the initial learning rate eta0 in linear_model.SGDClassifier, linear_model.SGDOneClassSVM, linear_model.SGDRegressor and linear_model.Perceptron changed from non-negative numbers to strictly positive numbers. As a consequence, the default eta0 of linear_model.SGDClassifier and linear_model.SGDOneClassSVM changed from 0 to 0.01. Note that eta0 is not used by the default learning rate "optimal" of those two estimators. By Christian Lorentzen. #31933
API Change linear_model.PassiveAggressiveClassifier and linear_model.PassiveAggressiveRegressor are deprecated and will be removed in 1.10. Equivalent estimators are available with linear_model.SGDClassifier and SGDRegressor, both of which expose the options learning_rate="pa1" and "pa2". The parameter eta0 can be used to specify the aggressiveness parameter of the Passive-Aggressive algorithms, called C in the reference paper. By Christian Lorentzen. #31932 and #29097
API Change linear_model.SGDClassifier, linear_model.SGDRegressor and linear_model.SGDOneClassSVM now deprecate negative values for the power_t parameter. Using a negative value will raise a warning in version 1.8 and an error in version 1.10. A value in the range [0.0, inf) must be used instead. By Ritvi Alagusankar. #31474
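A sketch of the migration path for the eta0 changes above. The new learning_rate="pa1"/"pa2" options only exist on 1.8, so the runnable part uses the long-standing constant schedule with a strictly positive eta0, matching the new constraint:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=100, random_state=0)

# eta0 must now be strictly positive (the default changed from 0 to 0.01).
# On 1.8, learning_rate="pa1" or "pa2" reproduces the deprecated
# PassiveAggressive estimators, with eta0 playing the role of C.
clf = SGDClassifier(learning_rate="constant", eta0=0.01, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:5]))
```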
sklearn.manifold
Major Feature manifold.ClassicalMDS was implemented to perform classical MDS (eigendecomposition of the double-centered distance matrix). By Dmitry Kobak and Meekail Zain. #31322
Feature manifold.TSNE now supports PCA initialization with sparse input matrices. By Arturo Amor. #32433
sklearn.metrics
Feature metrics.d2_brier_score has been added; it calculates the D^2 for the Brier score. By Omar Salman. #28971
Efficiency Avoid redundant input validation in metrics.d2_log_loss_score, leading to a 1.2x speedup in large scale benchmarks. By Olivier Grisel and Omar Salman. #32356
Enhancement metrics.median_absolute_error now supports Array API compatible inputs. By Lucy Liu. #31406
Enhancement Improved the error message for sparse inputs for the following metrics: metrics.accuracy_score, metrics.multilabel_confusion_matrix, metrics.jaccard_score, metrics.zero_one_loss, metrics.f1_score, metrics.fbeta_score, metrics.precision_recall_fscore_support, metrics.class_likelihood_ratios, metrics.precision_score, metrics.recall_score, metrics.classification_report and metrics.hamming_loss. By Lucy Liu. #32047
Fix metrics.median_absolute_error now uses _averaged_weighted_percentile instead of _weighted_percentile to calculate the median when sample_weight is not None. This is equivalent to using the "averaged_inverted_cdf" instead of the "inverted_cdf" quantile method, which gives results equivalent to numpy.median if equal weights are used. By Lucy Liu. #30787
Fix Additional sample_weight checking has been added to metrics.accuracy_score, metrics.balanced_accuracy_score, metrics.brier_score_loss, metrics.class_likelihood_ratios, metrics.classification_report, metrics.cohen_kappa_score, metrics.confusion_matrix, metrics.f1_score, metrics.fbeta_score, metrics.hamming_loss, metrics.jaccard_score, metrics.matthews_corrcoef, metrics.multilabel_confusion_matrix, metrics.precision_recall_fscore_support, metrics.precision_score, metrics.recall_score and metrics.zero_one_loss. sample_weight can only be 1D, consistent with y_true and y_pred in length, and all values must be finite and not complex. By Lucy Liu. #31701
Fix y_pred is deprecated in favour of y_score in metrics.DetCurveDisplay.from_predictions and metrics.PrecisionRecallDisplay.from_predictions. y_pred will be removed in v1.10. By Luis. #31764
Fix repr on a scorer created with a partial score_func now works correctly and uses the repr of the given partial object. By Adrin Jalali. #31891
Fix Registered named scorer objects for metrics.d2_brier_score and metrics.d2_log_loss_score and updated their input validation to be consistent with related metric functions. By Olivier Grisel and Omar Salman. #32356
API Change metrics.cluster.entropy is deprecated and will be removed in v1.10. By Lucy Liu. #31294
API Change The estimator_name parameter is deprecated in favour of name in metrics.PrecisionRecallDisplay and will be removed in 1.10. By Lucy Liu. #32310
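A sketch of the median_absolute_error change above: the unweighted path has always used numpy.median; on 1.8 the weighted path switches to an averaged quantile method, so equal weights now reproduce the unweighted result exactly (older versions may differ slightly):

```python
import numpy as np
from sklearn.metrics import median_absolute_error

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.0, 5.0])

# Unweighted: median of |y_true - y_pred| = median([0.5, 0, 1, 1])
err = median_absolute_error(y_true, y_pred)
print(err)  # 0.75

# On 1.8, equal weights give the same value as the unweighted call.
w = np.ones_like(y_true)
err_w = median_absolute_error(y_true, y_pred, sample_weight=w)
print(err_w)
```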
sklearn.model_selection
Enhancement model_selection.StratifiedShuffleSplit now specifies which classes have too few members when raising a ValueError because a class has fewer than 2 members. This is useful to identify which classes are causing the error. By Marc Bresson. #32265
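A minimal sketch of the error path in question; a class with a single member makes the split impossible, and on 1.8 the error message names the offending class(es) (earlier versions raise a generic "least populated class" message):

```python
from sklearn.model_selection import StratifiedShuffleSplit

X = [[0], [1], [2], [3], [4]]
y = [0, 0, 1, 1, 2]  # class 2 has a single member

splitter = StratifiedShuffleSplit(n_splits=2, test_size=0.4, random_state=0)
try:
    list(splitter.split(X, y))
except ValueError as exc:
    print(exc)  # on 1.8, the message identifies the too-small class(es)
```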
sklearn.multiclass
Fix Fixed tie-breaking behavior in multiclass.OneVsRestClassifier to match the np.argmax tie-breaking behavior. By Lakshmi Krishnan. #15504
sklearn.preprocessing
Enhancement preprocessing.SplineTransformer can now handle missing values with the parameter handle_missing. By Stefanie Senger. #28043
Enhancement preprocessing.PowerTransformer now returns a warning when NaN values are encountered in inverse_transform, typically caused by extremely skewed data. By Roberto Mourao. #29307
Enhancement preprocessing.MaxAbsScaler can now clip out-of-range values in held-out data with the parameter clip. By Hleb Levitski. #31790
sklearn.semi_supervised
Fix User-written kernel results are now normalized in semi_supervised.LabelPropagation so that all row sums equal 1, even if the kernel gives asymmetric or non-uniform row sums. By Dan Schult. #31924
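For reference, a sketch of passing a user-written (callable) kernel to LabelPropagation; on 1.8 the kernel's rows are normalized internally, so the callable no longer needs to return a matrix with uniform row sums:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.semi_supervised import LabelPropagation

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(10, 2), rng.randn(10, 2) + 3])
y = np.full(20, -1)   # -1 marks unlabeled points
y[0], y[10] = 0, 1    # one labeled point per cluster

# A custom callable kernel; its row sums need not be uniform.
def my_kernel(A, B):
    return rbf_kernel(A, B, gamma=0.5)

model = LabelPropagation(kernel=my_kernel)
model.fit(X, y)
print(model.transduction_[:5])  # inferred labels for all points
```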
sklearn.tree
Fix Made tree.export_text thread-safe. By Olivier Grisel. #30041
Fix export_graphviz now raises a ValueError if the given feature names are not all strings. By Guilherme Peixoto. #31036
Fix Fixed a regression in decision trees where almost constant features were not handled properly. By Sercan Turkmen. #32259
Fix Fixed handling of missing values in the decision_path method of trees (tree.DecisionTreeClassifier, tree.DecisionTreeRegressor, tree.ExtraTreeClassifier and tree.ExtraTreeRegressor). By Arthur Lacote. #32280
sklearn.utils
Efficiency The function sklearn.utils.extmath.safe_sparse_dot was improved by a dedicated Cython routine for the case of a @ b with sparse 2-dimensional a and b when a dense output is required, i.e., dense_output=True. This improves several algorithms in scikit-learn when dealing with sparse arrays (or matrices). By Christian Lorentzen. #31952
Enhancement The parameter table in the HTML representation of all scikit-learn estimators, and more generally of estimators inheriting from base.BaseEstimator, now displays the parameter description as a tooltip and has a link to the online documentation for each parameter. By Dea María Léon. #31564
Enhancement sklearn.utils._check_sample_weight now raises a clearer error message when the provided weights are neither a scalar nor a 1-D array-like of the same size as the input data. By Kapil Parekh. #31873
Enhancement sklearn.utils.estimator_checks.parametrize_with_checks now lets you configure strict mode for xfailing checks. Tests that unexpectedly pass will lead to a test failure. The default behaviour is unchanged. By Tim Head. #31951
Enhancement Fixed the alignment of the "?" and "i" symbols and improved the color style of the HTML representation of estimators. By Guillaume Lemaitre. #31969
Fix Changed the way colors are chosen when displaying an estimator as an HTML representation. Colors are no longer adapted to the user's theme, but chosen based on the theme's declared color scheme (light or dark) for VSCode and JupyterLab. If the theme does not declare a color scheme, the scheme is chosen according to the default text color of the page; if that fails, it falls back to a media query. By Matt J. #32330
API Change utils.extmath.stable_cumsum is deprecated and will be removed in v1.10. Use np.cumulative_sum with the desired dtype directly instead. By Tiziano Zito. #32258
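A sketch of the suggested replacement for stable_cumsum; plain NumPy covers the same need, since the cumulative sum accepts a dtype argument directly (np.cumulative_sum is the array-API-style spelling available in NumPy >= 2.0; np.cumsum works everywhere):

```python
import numpy as np

x = np.array([0.1, 0.2, 0.3], dtype=np.float32)

# Accumulate in float64 to limit rounding error, as stable_cumsum did.
out = np.cumsum(x, dtype=np.float64)
print(out)
```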
Code and documentation contributors
Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.7, including:
TODO: update at the time of the release.