Version 1.2.0
In Development
Legend for changelogs
Major Feature : something big that you couldn’t do before.
Feature : something that you couldn’t do before.
Efficiency : an existing feature now may not require as much computation or memory.
Enhancement : a miscellaneous minor improvement.
Fix : something that previously didn’t work as documented – or according to reasonable expectations – should now work.
API Change : you will need to change your code to have the same effect in the future; or a feature will be removed in the future.
Changed models
The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.
Changelog
sklearn.cluster
Enhancement The predict and fit_predict methods of cluster.OPTICS now accept sparse input data. #14736 by Hunt Zhan, #20802 by Brandon Pokorny, and #22965 by Meekail Zain.
Enhancement cluster.Birch now preserves dtype for numpy.float32 inputs. #22968 by Meekail Zain.
sklearn.datasets
Enhancement Introduce the new parameter parser in datasets.fetch_openml. parser="pandas" uses the CPU- and memory-efficient pandas.read_csv parser to load dense ARFF-formatted dataset files. It is possible to pass parser="liac-arff" to use the old LIAC parser. When parser="auto", dense datasets are loaded with “pandas” and sparse datasets are loaded with “liac-arff”. Currently, parser="liac-arff" is the default; it will change to parser="auto" in version 1.4. #21938 by Guillaume Lemaitre.
sklearn.ensemble
Efficiency Improve runtime performance of ensemble.IsolationForest by avoiding data copies. #23252 by Zhehao Liu.
sklearn.metrics
Feature metrics.class_likelihood_ratios is added to compute the positive and negative likelihood ratios derived from the confusion matrix of a binary classification problem. #22518 by Arturo Amor.
sklearn.neighbors
Enhancement The bandwidth parameter of neighbors.KernelDensity can now be set using Scott’s and Silverman’s estimation methods. #10468 by Ruben and #22993 by Jovan Stojanovic.
Feature Adds new function neighbors.sort_graph_by_row_values to sort a CSR sparse graph such that each row is stored with increasing values. This is useful to improve efficiency when using precomputed sparse distance matrices in a variety of estimators and avoid an EfficiencyWarning. #23139 by Tom Dupre la Tour.
sklearn.tree
Fix Fixed invalid memory access bug during fit in tree.DecisionTreeRegressor and tree.DecisionTreeClassifier. #23273 by Thomas Fan.
sklearn.utils
Enhancement utils.extmath.randomized_svd now accepts an argument, svd_lapack_driver, to specify the LAPACK driver used in the internal deterministic SVD used by the randomized SVD algorithm. #20617 by Srinath Kailasa.
Code and Documentation Contributors
Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.1, including:
TODO: update at the time of the release.