Version 0.17

February 18, 2016


Bug fixes

  • Upgraded the vendored joblib to version 0.9.4, which fixes an important bug in joblib.Parallel that can silently yield wrong results when working on datasets larger than 1MB: joblib/joblib

  • Fixed reading of Bunch pickles generated with scikit-learn version <= 0.16. This can affect users who have already downloaded a dataset with scikit-learn 0.16 and are loading it with scikit-learn 0.17. See #6196 for how this affected datasets.fetch_20newsgroups. By Loic Esteve.

  • Fixed a bug that prevented using the ROC AUC score to perform grid search on several CPUs / cores on large arrays. See #6147. By Olivier Grisel.

  • Fixed a bug that prevented the presort parameter from being set properly in ensemble.GradientBoostingRegressor. See #5857. By Andrew McCulloh.

  • Fixed a joblib error when evaluating the perplexity of a decomposition.LatentDirichletAllocation model. See #6258. By Chyi-Kwei Yau.

November 5, 2015


New features


Bug fixes

API changes summary

  • The attributes data_min, data_max and data_range in preprocessing.MinMaxScaler are deprecated and won’t be available from 0.19. Instead, the class now exposes data_min_, data_max_ and data_range_. By Giorgio Patrini.

  • All Scaler classes now have a scale_ attribute, the feature-wise rescaling applied by their transform methods. The old attribute std_ in preprocessing.StandardScaler is deprecated and superseded by scale_; it won’t be available in 0.19. By Giorgio Patrini.

  • svm.SVC and svm.NuSVC now have a decision_function_shape parameter. Setting decision_function_shape='ovr' makes their decision function return an array of shape (n_samples, n_classes). This will be the default behavior starting in 0.19. By Andreas Müller.

  • Passing 1D data arrays as input to estimators is now deprecated because it caused confusion about whether the array elements should be interpreted as features or as samples. All data arrays are now expected to be explicitly shaped (n_samples, n_features). By Vighnesh Birodkar.

  • lda.LDA and qda.QDA have been moved to discriminant_analysis.LinearDiscriminantAnalysis and discriminant_analysis.QuadraticDiscriminantAnalysis.

  • The store_covariance and tol parameters have been moved from the fit method to the constructor in discriminant_analysis.LinearDiscriminantAnalysis. Likewise, the store_covariances and tol parameters have been moved from the fit method to the constructor in discriminant_analysis.QuadraticDiscriminantAnalysis.

  • Models inheriting from _LearntSelectorMixin (i.e. RandomForests, GradientBoosting, LogisticRegression, DecisionTrees, SVMs and SGD-related models) will no longer support the transform method. Instead, wrap these models in the meta-transformer feature_selection.SelectFromModel to remove features whose coef_ or feature_importances_ values fall below a certain threshold.

  • cluster.KMeans re-runs cluster-assignments in case of non-convergence, to ensure consistency of predict(X) and labels_. By Vighnesh Birodkar.

  • Classifier and Regressor models are now tagged as such using the _estimator_type attribute.

  • Cross-validation iterators always provide indices into training and test set, not boolean masks.

  • The decision_function on all regressors was deprecated and will be removed in 0.19. Use predict instead.

  • datasets.load_lfw_pairs is deprecated and will be removed in 0.19. Use datasets.fetch_lfw_pairs instead.

  • The deprecated hmm module was removed.

  • The deprecated Bootstrap cross-validation iterator was removed.

  • The deprecated Ward and WardAgglomerative classes have been removed. Use cluster.AgglomerativeClustering instead.

  • cross_validation.check_cv is now a public function.

  • The property residues_ of linear_model.LinearRegression is deprecated and will be removed in 0.19.

  • The deprecated n_jobs parameter of linear_model.LinearRegression has been moved to the constructor.

  • Removed the deprecated class_weight parameter from linear_model.SGDClassifier’s fit method. Use the constructor parameter instead.

  • The deprecated support for the sequence of sequences (or list of lists) multilabel format was removed. To convert to and from the supported binary indicator matrix format, use MultiLabelBinarizer.

  • The behavior of calling the inverse_transform method of pipeline.Pipeline will change in 0.19. It will no longer reshape one-dimensional input to two-dimensional input.

  • The deprecated attributes indicator_matrix_, multilabel_ and classes_ of preprocessing.LabelBinarizer were removed.

  • Using gamma=0 in svm.SVC and svm.SVR to automatically set gamma to 1. / n_features is deprecated and will be removed in 0.19. Use gamma="auto" instead.
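
Several of the API changes above amount to renaming an attribute or moving a parameter, so a short migration sketch may help. The snippet below is illustrative only: the toy arrays and variable names are made up for this example, and it uses only the replacement names listed above (trailing-underscore scaler attributes, explicit 2D input, decision_function_shape, gamma="auto", constructor parameters on LinearDiscriminantAnalysis, SelectFromModel and MultiLabelBinarizer). Defaults and behavior in releases after 0.17 may differ.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.preprocessing import MinMaxScaler, MultiLabelBinarizer, StandardScaler
from sklearn.svm import SVC

# Toy data, invented for illustration: 6 samples, 3 features, 3 classes.
X = np.array([
    [0.0, 1.0, 10.0],
    [1.0, 3.0, 10.5],
    [2.0, 2.0, 9.5],
    [3.0, 5.0, 10.2],
    [4.0, 4.0, 9.8],
    [5.0, 6.0, 10.1],
])
y = np.array([0, 0, 1, 1, 2, 2])

# Fitted attributes carry a trailing underscore:
# data_min_, data_max_, data_range_ replace data_min, data_max, data_range.
minmax = MinMaxScaler().fit(X)
feature_range = minmax.data_max_ - minmax.data_min_  # same values as data_range_

# scale_ supersedes std_ on StandardScaler.
standard = StandardScaler().fit(X)
per_feature_scale = standard.scale_

# 1D input is deprecated: state explicitly whether the values are
# samples of one feature (a column) or one sample of many features (a row).
x_1d = np.array([1.0, 2.0, 3.0])
x_col = x_1d.reshape(-1, 1)  # 3 samples, 1 feature
x_row = x_1d.reshape(1, -1)  # 1 sample, 3 features

# gamma="auto" replaces gamma=0; 'ovr' yields one score column per class.
svc = SVC(gamma="auto", decision_function_shape="ovr").fit(X, y)
ovr_scores = svc.decision_function(X)  # shape (n_samples, n_classes)

# store_covariance and tol are constructor parameters, not fit() arguments.
lda = LinearDiscriminantAnalysis(store_covariance=True, tol=1e-4).fit(X, y)

# Feature selection goes through SelectFromModel rather than model.transform().
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=10, random_state=0),
    threshold="mean",
)
X_selected = selector.fit(X, y).transform(X)

# Lists of label sets are converted to a binary indicator matrix.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform([["a", "b"], ["b"], ["c"]])
label_sets = mlb.inverse_transform(Y)
```

The random forest inside SelectFromModel is just one convenient estimator exposing feature_importances_; any estimator exposing coef_ or feature_importances_ could be substituted.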

