.. include:: _contributors.rst

.. currentmodule:: sklearn

.. _release_notes_1_9:

===========
Version 1.9
===========

.. -- UNCOMMENT WHEN 1.9.0 IS RELEASED --
   For a short description of the main highlights of the release, please refer to
   :ref:`sphx_glr_auto_examples_release_highlights_plot_release_highlights_1_9_0.py`.

.. DELETE WHEN 1.9.0 IS RELEASED
   Since October 2024, DO NOT add your changelog entry in this file.

.. Instead, create a file named `..rst` in the relevant sub-folder in
   `doc/whats_new/upcoming_changes/`. For full details, see:
   https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/upcoming_changes/README.md

.. include:: changelog_legend.inc

.. towncrier release notes start

.. _changes_1_9_dev0:

Version 1.9.dev0
================

**April 2026**

Changed models
--------------

- |Enhancement| The :meth:`transform` method of :class:`preprocessing.PowerTransformer` with `method="yeo-johnson"` now uses the numerically more stable function `scipy.stats.yeojohnson` instead of its own implementation. The results may deviate in numerical edge cases or within the precision of floating-point arithmetic. By :user:`Christian Lorentzen `. :pr:`33272`

- |API| The default value of the `scoring` parameter in :class:`linear_model.LogisticRegressionCV` will change in version 1.11 from `None`, i.e. accuracy, to `"neg_log_loss"`. This is a much better default scoring function as it aligns with the log loss that logistic regression is minimizing (with regularization). In the meantime, you can silence the warning for this change by explicitly passing a value to `scoring`. By :user:`Christian Lorentzen `. :pr:`33333`

Changes impacting many modules
------------------------------

- |MajorFeature| Introduced a new config key: `"sparse_interface"` to control whether functions return sparse objects using SciPy sparse matrix or SciPy sparse array. Use `sklearn.set_config(sparse_interface="sparray")` to have sklearn return sparse arrays.
  See more at `the SciPy Sparse Migration Guide `_. The scikit-learn config `"sparse_interface"` initially defaults to sparse matrix (`"spmatrix"`). The plan is to have the default change to sparse array (`"sparray"`) in a few releases. By :user:`Dan Schult ` :pr:`31177`

- |Enhancement| The HTML representation of all scikit-learn estimators inheriting from :class:`base.BaseEstimator` now displays a new block showing the number and names of the output features when using a :class:`compose.ColumnTransformer` or a :class:`pipeline.FeatureUnion`. A copy-paste button is available for the output feature names. By :user:`Dea María Léon `, :user:`Guillaume Lemaitre `, :user:`Jérémie du Boisberranger `, :user:`Olivier Grisel `, :user:`Antoine Baker `. :pr:`31937`

- |Enhancement| :class:`pipeline.Pipeline`, :class:`pipeline.FeatureUnion` and :class:`compose.ColumnTransformer` now raise a clearer error message when an estimator class is passed instead of an instance. By :user:`Anne Beyer ` :pr:`32888`

- |Enhancement| The HTML representation of all scikit-learn estimators inheriting from :class:`base.BaseEstimator` now includes a table displaying their fitted :term:`attributes`. These are all the public estimator attributes that are computed during the call to :term:`fit` and whose name ends with an underscore. By :user:`Dea María Léon `, :user:`Jérémie du Boisberranger `, :user:`Olivier Grisel `, :user:`Guillaume Lemaitre `, :user:`Antoine Baker `. :pr:`33399`

- |Fix| Raise a `ValueError` when `sample_weight` contains only zero values to prevent meaningless input data during fitting. This change applies to all estimators that support the parameter `sample_weight`. It also affects metrics that validate sample weights. By :user:`Lucy Liu ` and :user:`John Hendricks `. :pr:`32212`

- |Fix| Some parameter descriptions in the HTML representation of estimators were not properly escaped, which could lead to malformed HTML if the description contains characters like `<` or `>`.
  By :user:`Olivier Grisel `. :pr:`32942`

Support for Array API
---------------------

Additional estimators and functions have been updated to include support for all `Array API `_ compliant inputs.

See :ref:`array_api` for more details.

- |Feature| :func:`sklearn.metrics.d2_absolute_error_score` and :func:`sklearn.metrics.d2_pinball_score` now support array API compatible inputs. By :user:`Virgil Chan `. :pr:`31671`

- |Feature| :class:`linear_model.LogisticRegression` now supports array API compatible inputs with `solver="lbfgs"`. By :user:`Omar Salman ` and :user:`Olivier Grisel `. :pr:`32644`

- |Feature| :func:`sklearn.metrics.average_precision_score` now supports Array API compliant inputs. By :user:`Stefanie Senger `. :pr:`32909`

- |Feature| :func:`sklearn.metrics.pairwise.paired_manhattan_distances` now supports array API compatible inputs. By :user:`Bharat Raghunathan `. :pr:`32979`

- |Feature| :func:`sklearn.metrics.pairwise.pairwise_distances_argmin` now supports array API compatible inputs. By :user:`Bharat Raghunathan `. :pr:`32985`

- |Feature| :class:`linear_model.LinearRegression`, :class:`linear_model.Ridge`, :class:`linear_model.RidgeClassifier`, :class:`linear_model.LogisticRegression`, and :class:`discriminant_analysis.LinearDiscriminantAnalysis` now raise a more informative error message when arrays passed at fit and prediction time use different array API namespaces or devices. A new ``sklearn.utils._array_api.move_estimator_to`` utility is provided to move an estimator's fitted array attributes to a different namespace and device. By :user:`Jérôme Dockès ` and :user:`Tim Head `. :pr:`33076`

- |Feature| :class:`pipeline.FeatureUnion` now supports Array API compliant inputs when all its transformers do. By :user:`Olivier Grisel `. :pr:`33263`

- |Feature| :class:`linear_model.PoissonRegressor` now supports array API compatible inputs with `solver="lbfgs"`. By :user:`Christian Lorentzen ` and :user:`Omar Salman `.
  :pr:`33348`

- |Enhancement| :class:`kernel_approximation.Nystroem` now supports array API compatible inputs. By :user:`Emily Chen ` :pr:`29661`

- |Enhancement| :class:`linear_model.RidgeCV` now accepts array API compliant arrays with `gcv_mode` set to `auto` or `eigen`. By :user:`Antoine Baker ` :pr:`33020`

- |Enhancement| Internal NumPy CPU conversions now always attempt a generic DLPack-based transfer and only fall back to library-specific methods when necessary. This should ease support for additional array API and DLPack compliant input types without extending the ad hoc conversion helpers. By :user:`Olivier Grisel `. :pr:`33623`

- |Fix| Fixed a bug that would cause Cython-based estimators to fail when fit on NumPy inputs with `sklearn.set_config(array_api_dispatch=True)`. By :user:`Olivier Grisel `. :pr:`32846`

- |Fix| Fixes how `pos_label` is inferred when `pos_label` is set to `None` in :func:`sklearn.metrics.brier_score_loss` and :func:`sklearn.metrics.d2_brier_score`. By :user:`Lucy Liu `. :pr:`32923`

- |Fix| :func:`linear_model.ridge_regression` now correctly passes a Python scalar as ``fill_value`` to ``xp.full`` when broadcasting alpha for multi-target regression, ensuring compliance with the array API specification. This fixes compatibility issues with some array API backends. By :user:`Olivier Grisel `. :pr:`33437`

Metadata routing
----------------

Refer to the :ref:`Metadata Routing User Guide ` for more details.

- |Enhancement| :class:`~preprocessing.TargetEncoder` now routes `groups` to the :term:`CV splitter` internally used for :term:`cross fitting` in its :meth:`~preprocessing.TargetEncoder.fit_transform`. By :user:`Samruddhi Baviskar ` and :user:`Stefanie Senger `. :pr:`33089`

:mod:`sklearn.cluster`
----------------------

- |Enhancement| :class:`cluster.AgglomerativeClustering` and :class:`cluster.FeatureAgglomeration` now accept `metric="l2"` together with `linkage="ward"`. `metric="l2"` is equivalent to `metric="euclidean"`.
  By :user:`Guillaume Lemaitre `. :pr:`24681`

- |Fix| :class:`cluster.MiniBatchKMeans` now correctly handles sample weights during fitting. When sample weights are not None, mini-batch indices are created by sub-sampling with replacement using the normalized sample weights as probabilities. By :user:`Shruti Nath `, :user:`Olivier Grisel `, and :user:`Jeremie du Boisberranger `. :pr:`30751`

- |Fix| Fixed a bug in :class:`cluster.BisectingKMeans` when using a custom callable `init` with `n_clusters > 2`. By :user:`Mohammad Ahmadullah Khan `. :pr:`33148`

:mod:`sklearn.compose`
----------------------

- |Fix| The dotted line for :class:`compose.ColumnTransformer` in its HTML display now includes only its elements. The behaviour when a remainder is used has also been corrected. By :user:`Dea María Léon ` :pr:`32713`

- |Fix| Fixes a regression where a `KeyError` was thrown when using :func:`compose.ColumnTransformer.fit_transform` with metadata routing and `remainder="passthrough"`. By :user:`Anne Beyer `. :pr:`33665`

:mod:`sklearn.datasets`
-----------------------

- |Efficiency| Re-enabled compressed caching for :func:`datasets.fetch_kddcup99`, reducing on-disk cache size without changing the public API. By :user:`Unique Shrestha `. :pr:`33118`

:mod:`sklearn.decomposition`
----------------------------

- |Efficiency| :class:`~sklearn.decomposition.FastICA` with `algorithm='deflation'` and `fun='logcosh'` is now an order of magnitude faster. By :user:`Mohammad Ahmadullah Khan `. :pr:`33269`

- |Fix| Fixed a typo (from `"OR"` to `"QR"`) in the list of allowed values for `power_iteration_normalizer` in :class:`decomposition.TruncatedSVD`. By :user:`Olivier Grisel `. :pr:`33492`

:mod:`sklearn.ensemble`
-----------------------

- |Fix| Fixed the way :class:`ensemble.HistGradientBoostingClassifier` and :class:`ensemble.HistGradientBoostingRegressor` compute their bin edges to properly and consistently handle :term:`sample_weight`.
  When `sample_weight=None` is passed to `fit` and the number of distinct feature values is less than the specified `max_bins`, the edges are still set to midpoints between consecutive feature values. Otherwise, the bin edges are set to weight-aware quantiles computed using the averaged inverted CDF method. If `n_samples` is larger than the `subsample` parameter, the weights are instead used to subsample the data (with replacement) and the bin edges are set using unweighted quantiles of the subsampled data. By :user:`Shruti Nath ` and :user:`Olivier Grisel ` :pr:`29641`

- |Fix| :class:`ensemble.RandomForestClassifier`, :class:`ensemble.RandomForestRegressor`, :class:`ensemble.ExtraTreesClassifier` and :class:`ensemble.ExtraTreesRegressor` now use `sample_weight` to draw the samples instead of forwarding them multiplied by a uniformly sampled mask to the underlying estimators. Furthermore, when `max_samples` is a float, it is now interpreted as a fraction of `sample_weight.sum()` instead of `X.shape[0]`. As sampling is done with replacement, a float `max_samples` greater than `1.0` is now allowed, as well as an integer `max_samples` greater than `X.shape[0]`. The default `max_samples=None` draws `X.shape[0]` samples, irrespective of `sample_weight`. By :user:`Antoine Baker `. :pr:`31529`

- |Fix| Both :class:`ensemble.GradientBoostingRegressor` and :class:`ensemble.GradientBoostingClassifier` with the default `"friedman_mse"` criterion were computing impurity values with an incorrect scaling, leading to unexpected trees in some cases. The implementation now uses `"squared_error"`, which is exactly equivalent to `"friedman_mse"` up to floating-point error discrepancies but computes correct impurity values. By :user:`Arthur Lacote `.
  :pr:`32708`

- |API| The `criterion` parameter is now deprecated for :class:`ensemble.GradientBoostingRegressor` and :class:`ensemble.GradientBoostingClassifier`, as both options (`"friedman_mse"` and `"squared_error"`) were producing the same results, up to floating-point rounding discrepancies and a bug in `"friedman_mse"`. By :user:`Arthur Lacote ` :pr:`32708`

:mod:`sklearn.feature_extraction`
---------------------------------

- |Fix| :func:`feature_extraction.image.reconstruct_from_patches_2d` now produces correct results when a patch dimension equals the corresponding image dimension. By :user:`Eden Rochman `. :pr:`33643`

:mod:`sklearn.feature_selection`
--------------------------------

- |Enhancement| :class:`feature_selection.SelectFromModel` and :class:`feature_selection.RFE` now support estimators whose feature importance is a sparse matrix or array, notably by passing a user-defined callable to the parameter `importance_getter`. By :user:`andymucyo-ops ` and :user:`isaacambrogetti `. :pr:`33786`

- |Fix| :class:`feature_selection.RFE` now uses stable sorting when ranking feature importances. This ensures that the feature selection is deterministic and consistent across runs when feature importances are tied. By :user:`blitchj `. :pr:`29532`

:mod:`sklearn.gaussian_process`
-------------------------------

- |Efficiency| The constructor signature of Gaussian process kernels is now cached, improving performance on small and medium datasets. By :user:`Stanislav Terliakov ` :pr:`33067`

- |Fix| The hyperparameters of the default kernel of :class:`~sklearn.gaussian_process.GaussianProcessRegressor`, namely `ConstantKernel() * RBF()`, are now optimized when `optimizer` is not `None`. Thus, `gpr = GaussianProcessRegressor().fit(X, y)` uses optimized kernel hyperparameters. By :user:`Matthias De Lozzo `.
  :pr:`32964`

:mod:`sklearn.inspection`
-------------------------

- |Enhancement| In :class:`inspection.DecisionBoundaryDisplay`, `multiclass_colors` now defaults to the more accessible `Petroff color sequence <https://arxiv.org/abs/2107.02270>`_ for multiclass problems with up to 10 classes. By :user:`Anne Beyer `. :pr:`33709`

- |Fix| In :class:`inspection.DecisionBoundaryDisplay`, `multiclass_colors` is now also used for multiclass plotting when `response_method="predict"`. By :user:`Anne Beyer `. :pr:`33015`

- |Fix| In :class:`inspection.DecisionBoundaryDisplay`, `n_classes` is now inferred more robustly from the estimator. If the inference fails for custom estimators, a comprehensive error message is shown. By :user:`Anne Beyer `. :pr:`33202`

- |Fix| :class:`inspection.DecisionBoundaryDisplay` now displays all class boundaries when using ``plot_method="contour"`` with all response methods, and displays all classes in distinct colors when using ``plot_method="contourf"`` with ``response_method="predict"``. By :user:`Anne Beyer ` and :user:`Levente Csibi `. :pr:`33300`

- |Fix| In :class:`inspection.DecisionBoundaryDisplay`, a `ValueError` is now raised if the colormap passed to `multiclass_colors` contains fewer colors than there are classes in multiclass problems. By :user:`Anne Beyer `. :pr:`33419`

- |Fix| For multiclass data, :class:`inspection.DecisionBoundaryDisplay` with ``plot_method="contour"`` now also displays class-specific contours for ``response_method="predict_proba"`` and ``response_method="decision_function"``. Multiclass class boundary contour lines are now displayed in black by default for all response methods to avoid confusion. By :user:`Anne Beyer `. :pr:`33471`

- |Fix| In :class:`inspection.DecisionBoundaryDisplay`, `multiclass_colors_` now always stores the colors for multiclass problems as a numpy array. By :user:`Anne Beyer `.
  :pr:`33651`

:mod:`sklearn.linear_model`
---------------------------

- |Feature| :class:`linear_model.MultiTaskElasticNet`, :class:`linear_model.MultiTaskElasticNetCV`, :class:`linear_model.MultiTaskLasso`, and :class:`linear_model.MultiTaskLassoCV` now support fitting on sparse `X` as well as fitting with `sample_weight`. By :user:`Christian Lorentzen `. :pr:`33440`

- |Efficiency| :class:`linear_model.LogisticRegression` with `solver="lbfgs"` now estimates the gradient of the loss at `float32` precision when fitted with `float32` data (`X`) to improve training speed and memory efficiency. Previously, the input data would be implicitly cast to `float64`. If you relied on the previous behavior for numerical reasons, you can explicitly cast your data to `float64` before fitting to reproduce it. By :user:`Omar Salman ` and :user:`Olivier Grisel `. :pr:`32644`

- |Efficiency| The :class:`linear_model.LinearRegression`, :class:`linear_model.Ridge`, :class:`linear_model.Lasso`, :class:`linear_model.LassoCV`, :class:`linear_model.ElasticNet`, :class:`linear_model.ElasticNetCV` and :class:`linear_model.BayesianRidge` classes no longer make an unnecessary copy of dense `X, y` input during preprocessing when `copy_X=False` and `sample_weight` is provided. By :user:`Junteng Li `. :pr:`33041`

- |Enhancement| :class:`linear_model.ElasticNet`, :class:`linear_model.ElasticNetCV` and :func:`linear_model.enet_path` are now able to fit ridge regression, i.e. setting `l1_ratio=0`. Previously, the stopping criterion was a formulation of the dual gap that breaks down for `l1_ratio=0`. Now, an alternative dual gap formulation is used for this setting. This reduces the noise of raised warnings. By :user:`Christian Lorentzen `.
  :pr:`32845`

- |Enhancement| |Efficiency| :class:`linear_model.ElasticNet`, :class:`linear_model.ElasticNetCV`, :class:`linear_model.Lasso`, :class:`linear_model.LassoCV`, :class:`linear_model.MultiTaskElasticNet`, :class:`linear_model.MultiTaskElasticNetCV`, :class:`linear_model.MultiTaskLasso` and :class:`linear_model.MultiTaskLassoCV`, as well as :func:`linear_model.lasso_path` and :func:`linear_model.enet_path`, are now faster when fit with a strong L1 penalty and many features. During gap safe screening of features, the update of the residual is now only performed if the coefficient is not zero. By :user:`Christian Lorentzen `. :pr:`33161`

- |Fix| :class:`linear_model.LassoCV` and :class:`linear_model.ElasticNetCV` now take the `positive` parameter into account to compute the maximum `alpha` parameter, where all coefficients are zero. This impacts the search grid for the internally tuned `alpha` hyper-parameter stored in the attribute `alphas_`. By :user:`Junteng Li ` :pr:`32768`

- |Fix| Correct the formulation of `alpha` within :class:`linear_model.SGDOneClassSVM`. The corrected value is `alpha = nu` instead of `alpha = nu / 2`. Note: this might result in changed values for the fitted attributes like `coef_` and `offset_` as well as the predictions made using this class. By :user:`Omar Salman `. :pr:`32778`

- |Fix| :class:`linear_model.LogisticRegressionCV` now correctly handles the case when the `scoring` parameter is set (to something other than `None`) and the CV splits result in folds where some class labels are missing. By :user:`Christian Lorentzen `. :pr:`32828`

- |Fix| :func:`linear_model.enet_path` now correctly handles the ``precompute`` parameter when ``check_input=False``. Previously, the value of ``precompute`` was not properly treated, which could lead to a `ValueError`. This also affects :class:`linear_model.ElasticNetCV`, :class:`linear_model.LassoCV`, :class:`linear_model.MultiTaskElasticNetCV` and :class:`linear_model.MultiTaskLassoCV`.
  By :user:`Albert Dorador ` :pr:`33014`

- |Fix| The leave-one-out errors and model parameters estimated in :class:`linear_model.RidgeCV` and :class:`linear_model.RidgeClassifierCV` when `cv=None` are now numerically stable in the small `alpha` regime. The default `auto` option is now equivalent to `eigen` and picks the cheaper option: eigendecomposition of the covariance matrix when `n_features <= n_samples`, and of the Gram matrix when `n_samples > n_features`. When `store_cv_results=True` and `X` is an integer array, the `cv_results_` attribute was wrongly coerced to the integer dtype of `X`; it now always has a float dtype. By :user:`Antoine Baker ` :pr:`33020`

- |Fix| Fixed a bug in :class:`linear_model.SGDClassifier` for multiclass settings where large negative values of :meth:`decision_function` could lead to NaN values. In this case, the fix assigns equal probability to each class. By :user:`Christian Lorentzen `. :pr:`33168`

- |Fix| The `tol` parameter in :class:`linear_model.LinearRegression` is now set as the `cond` parameter of the :func:`scipy.linalg.lstsq` solver when fitting on dense data. Some tests involving `LinearRegression` were brittle with the default `cond` values from `scipy` or `numpy`. Now at least the user has control over the `cond` value and can change it if necessary. By :user:`Antoine Baker ` :pr:`33565`

- |API| The default value of the `scoring` parameter in :class:`linear_model.LogisticRegressionCV` will change in version 1.11 from `None`, i.e. accuracy, to `"neg_log_loss"`. This is a much better default scoring function as it aligns with the log loss that logistic regression is minimizing (with regularization). In the meantime, you can silence the warning for this change by explicitly passing a value to `scoring`. By :user:`Christian Lorentzen `. :pr:`33333`

:mod:`sklearn.manifold`
-----------------------

- |Fix| :meth:`manifold.MDS.transform` returns the correct number of components when using `init="classical_mds"`.
  By :user:`Ben Pedigo `. :pr:`33318`

:mod:`sklearn.metrics`
----------------------

- |MajorFeature| :func:`metrics.metric_at_thresholds` has been added to compute a metric's values across all possible thresholds. By :user:`Carlo Lemos ` and :user:`Lucy Liu `. :pr:`32732`

- |Feature| Add class method `from_cv_results` to :class:`metrics.PrecisionRecallDisplay`, which allows easy plotting of multiple precision-recall curves from :func:`model_selection.cross_validate` results. By :user:`Lucy Liu `. :pr:`30508`

- |Enhancement| :func:`~metrics.cohen_kappa_score` now has a `replace_undefined_by` parameter that can be set to define the function's return value when the metric is undefined (division by zero). By :user:`Stefanie Senger ` :pr:`31172`

- |Fix| :func:`metrics.d2_pinball_score` and :func:`metrics.d2_absolute_error_score` now always use the `"averaged_inverted_cdf"` quantile method, both with and without sample weights. Previously, the `"linear"` quantile method was used only for the unweighted case, leading to surprising discrepancies when comparing the results with unit weights. Note that all quantile interpolation methods are asymptotically equivalent in the large sample limit, but this fix can cause score value changes on small evaluation sets (without weights). By :user:`Virgil Chan `. :pr:`31671`

- |Fix| :meth:`metrics.PrecisionRecallDisplay.from_estimator` and :meth:`metrics.PrecisionRecallDisplay.from_predictions` now correctly plot the chance level line when `y_true` is a pytorch tensor. By :user:`Lucas Oliveira `. :pr:`33405`

- |Fix| `y_pred` is deprecated in favor of `y_proba` in :func:`metrics.log_loss` and :func:`metrics.d2_log_loss_score`, as predicted probabilities are expected, not predicted labels. By :user:`Lucy Liu `. :pr:`33740`

- |API| Passing the `pos_label` and `sample_weight` parameters of :func:`metrics.confusion_matrix_at_thresholds` as positional arguments is deprecated and will be removed in v1.11. By :user:`Jérémie du Boisberranger `.
  :pr:`33357`

:mod:`sklearn.model_selection`
------------------------------

- |Enhancement| :class:`~sklearn.model_selection.GroupKFold` now uses stable sorting when doing the group distribution. This ensures that the splits are consistent across runs. By :user:`marikabergengren ` and `Adrin Jalali`_ :pr:`28464`

- |Fix| :class:`model_selection.StratifiedGroupKFold` now raises a `ValueError` when `n_splits` is greater than the number of unique groups, preventing degenerate folds. By :user:`Chani Fainendler `. :pr:`33176`

- |Fix| Fixed an incorrect `ValueError` when using ``scoring="average_precision"`` or similar in model selection utilities such as :class:`model_selection.GridSearchCV` or :func:`model_selection.cross_validate` with multiclass classifiers. The ``pos_label`` parameter is only relevant for binary classification and was incorrectly being validated for scorers used on multiclass problems. By :user:`Olivier Grisel `. :pr:`33473`

:mod:`sklearn.neural_network`
-----------------------------

- |Fix| :class:`neural_network.MLPClassifier` with ``early_stopping=True`` no longer raises a `TypeError` when ``y`` contains non-numeric class labels (e.g. strings): validation scoring now checks finiteness only for floating predictions. By :user:`Guillaume Lemaitre `. :pr:`33774`

:mod:`sklearn.pipeline`
-----------------------

- |Fix| Fixed :class:`pipeline.FeatureUnion` to properly handle column renaming when using Polars output, preventing duplicate column names. By :user:`Levente Csibi `. :pr:`32853`

- |Fix| :class:`pipeline.Pipeline` now raises an `AttributeError` when accessing attributes that are not available on an empty pipeline. It is therefore possible to call `dir` on an empty pipeline. By :user:`Jérémie du Boisberranger `.
  :pr:`33362`

:mod:`sklearn.preprocessing`
----------------------------

- |Fix| :class:`~sklearn.preprocessing.PowerTransformer` and :class:`~sklearn.preprocessing.QuantileTransformer` no longer raise a warning in :meth:`inverse_transform` related to feature names if :meth:`fit` is called using data with feature names. By :user:`Thibault ` and :user:`Mohammad Ahmadullah Khan `. :pr:`33268`

- |API| The `shuffle` and `random_state` parameters are deprecated in :class:`~preprocessing.TargetEncoder` and will be removed in version 1.11. Pass a cross-validation generator as the `cv` argument to specify the shuffling behaviour instead. By :user:`Stefanie Senger `. :pr:`33453`

:mod:`sklearn.svm`
------------------

- |Fix| Raise a more informative error when fitting :class:`svm.NuSVR` with all zero sample weights. By :user:`Lucy Liu ` and :user:`John Hendricks `. :pr:`32212`

- |API| The `probability` parameter of :class:`sklearn.svm.SVC` and :class:`sklearn.svm.NuSVC` is deprecated due to not being thread-safe and will be removed in 1.11. Use :class:`sklearn.calibration.CalibratedClassifierCV` with the respective estimator and `ensemble=False` instead. By :user:`Shruti Nath `. :pr:`32050`

- |API| The `probA_` and `probB_` attributes of :class:`sklearn.svm.SVC` and :class:`sklearn.svm.NuSVC` are deprecated due to the deprecation of the `probability` parameter and will be removed in 1.11. By :user:`Shruti Nath `. :pr:`33388`

:mod:`sklearn.tree`
-------------------

- |Feature| In :class:`tree.DecisionTreeRegressor` and :class:`ensemble.RandomForestRegressor`, `criterion="absolute_error"` (and, consequently, all criterion options) now supports missing values for dense training data `X`. By :user:`Arthur Lacote ` :pr:`32119`

- |Fix| Fix calculation of node impurity in :class:`tree.DecisionTreeRegressor`, :class:`ensemble.RandomForestRegressor`, :class:`tree.ExtraTreeRegressor` and :class:`ensemble.ExtraTreesRegressor` when missing values are present for the Poisson criterion.
  The Poisson criterion was returning invalid impurities (including negative values) when missing values were present. By :user:`Arthur Lacote ` :pr:`32119`

- |Fix| Fixed feature-wise NaN detection in trees. Features could be seen as NaN-free for some edge-case patterns, which led to not considering splits with NaNs assigned to the left node for those features. This affects:

  - :class:`tree.DecisionTreeRegressor`
  - :class:`tree.ExtraTreeRegressor`
  - :class:`ensemble.RandomForestRegressor`
  - :class:`ensemble.ExtraTreesRegressor`

  By :user:`Arthur Lacote ` :pr:`32193`

- |API| `criterion="friedman_mse"` is now deprecated. This criterion was intended for gradient boosting but was incorrectly implemented in scikit-learn's trees and was actually behaving identically to `criterion="squared_error"`. Use `criterion="squared_error"` instead. This affects:

  - :class:`tree.DecisionTreeRegressor`
  - :class:`tree.ExtraTreeRegressor`
  - :class:`ensemble.RandomForestRegressor`
  - :class:`ensemble.ExtraTreesRegressor`

  By :user:`Arthur Lacote ` :pr:`32708`

:mod:`sklearn.utils`
--------------------

- |Enhancement| ``sklearn.utils._tags.get_tags`` now provides a clearer error message when a class is passed instead of an estimator instance. By :user:`Achyuthan S ` and :user:`Anne Beyer `. :pr:`32565`

- |Enhancement| ``sklearn.utils._response._get_response_values`` now provides a clearer error message when the estimator does not implement the given ``response_method``. By :user:`Quentin Barthélemy `. :pr:`33126`

- |Fix| The parameter table in the HTML representation of all scikit-learn estimators inheriting from :class:`base.BaseEstimator` displays each parameter's documentation as a tooltip. The last tooltip of a parameter in the last table of any HTML representation was partially hidden. This issue has been fixed. By :user:`Dea María Léon ` :pr:`32887`

- |Fix| Fixed ``_weighted_percentile`` with ``average=True`` so zero-weight samples just before the end of the array are handled correctly.
  This can change results when using ``sample_weight`` with :class:`preprocessing.KBinsDiscretizer` (``strategy="quantile"``, ``quantile_method="averaged_inverted_cdf"``) and in :func:`metrics.median_absolute_error`, :func:`metrics.d2_pinball_score`, and :func:`metrics.d2_absolute_error_score`. By :user:`Arthur Lacote `. :pr:`33127`

.. rubric:: Code and documentation contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.8, including:

TODO: update at the time of the release.