# Version 0.13

## Version 0.13.1

**February 23, 2013**

The 0.13.1 release only fixes some bugs and does not add any new functionality.

### Changelog

- Fixed a testing error caused by the function `cross_validation.train_test_split` being interpreted as a test, by Yaroslav Halchenko.
- Fixed a bug in the reassignment of small clusters in `cluster.MiniBatchKMeans`, by Gael Varoquaux.
- Fixed the default value of `gamma` in `decomposition.KernelPCA`, by Lars Buitinck.
- Updated joblib to `0.7.0d`, by Gael Varoquaux.
- Fixed scaling of the deviance in `ensemble.GradientBoostingClassifier`, by Peter Prettenhofer.
- Better tie-breaking in `multiclass.OneVsOneClassifier`, by Andreas Müller.
- Other small improvements to tests and documentation.

### People

List of contributors for release 0.13.1 by number of commits.

5 Robert Marchman

2 Hrishikesh Huilgolkar

1 Bastiaan van den Berg

1 Diego Molla

1 Rafael Cunha de Almeida

1 Rolando Espinoza La fuente

## Version 0.13

**January 21, 2013**

### New Estimator Classes

- `dummy.DummyClassifier` and `dummy.DummyRegressor`, two data-independent predictors by Mathieu Blondel. Useful to sanity-check your estimators. See Dummy estimators in the user guide. Multioutput support added by Arnaud Joly.
- `decomposition.FactorAnalysis`, a transformer implementing the classical factor analysis, by Christian Osendorfer and Alexandre Gramfort. See Factor Analysis in the user guide.
- `feature_extraction.FeatureHasher`, a transformer implementing the “hashing trick” for fast, low-memory feature extraction from string fields by Lars Buitinck and `feature_extraction.text.HashingVectorizer` for text documents by Olivier Grisel. See Feature hashing and Vectorizing a large text corpus with the hashing trick for the documentation and sample usage.
- `pipeline.FeatureUnion`, a transformer that concatenates results of several other transformers by Andreas Müller. See FeatureUnion: composite feature spaces in the user guide.
- `random_projection.GaussianRandomProjection`, `random_projection.SparseRandomProjection` and the function `random_projection.johnson_lindenstrauss_min_dim`. The first two are transformers implementing Gaussian and sparse random projection matrices by Olivier Grisel and Arnaud Joly. See Random Projection in the user guide.
- `kernel_approximation.Nystroem`, a transformer for approximating arbitrary kernels by Andreas Müller. See Nystroem Method for Kernel Approximation in the user guide.
- `preprocessing.OneHotEncoder`, a transformer that computes binary encodings of categorical features by Andreas Müller. See Encoding categorical features in the user guide.
- `linear_model.PassiveAggressiveClassifier` and `linear_model.PassiveAggressiveRegressor`, predictors implementing an efficient stochastic optimization for linear models by Rob Zinkov and Mathieu Blondel. See Passive Aggressive Algorithms in the user guide.
- `ensemble.RandomTreesEmbedding`, a transformer for creating high-dimensional sparse representations using ensembles of totally random trees by Andreas Müller. See Totally Random Trees Embedding in the user guide.
- `manifold.SpectralEmbedding` and function `manifold.spectral_embedding`, implementing the “laplacian eigenmaps” transformation for non-linear dimensionality reduction by Wei Li. See Spectral Embedding in the user guide.
- `isotonic.IsotonicRegression` by Fabian Pedregosa, Alexandre Gramfort and Nelle Varoquaux.
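The “hashing trick” mentioned for `feature_extraction.FeatureHasher` can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper name `hash_features`, the use of MD5 as the hash function, and the sign rule are all assumptions chosen to keep the example self-contained and deterministic. It only shows the core idea, namely that each token is mapped to a column of a fixed-width vector without storing a vocabulary.

```python
import hashlib


def hash_features(tokens, n_features=16):
    """Map string tokens to a fixed-length vector (illustrative sketch only).

    Hypothetical helper: the hash function (MD5) and the sign rule are
    assumptions for this example, not what scikit-learn uses internally.
    """
    vec = [0] * n_features
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        # The first four bytes choose the column; no vocabulary is kept.
        index = int.from_bytes(digest[:4], "little") % n_features
        # One further bit chooses a sign, so colliding tokens can partly cancel.
        sign = 1 if digest[4] & 1 else -1
        vec[index] += sign
    return vec


doc = "the quick brown fox jumps over the lazy dog".split()
vec = hash_features(doc, n_features=8)
single = hash_features(["fox"], n_features=8)
```

The output width depends only on `n_features`, never on how many distinct tokens appear, which is what makes the approach low-memory. The trade-off is that distinct tokens can collide in the same column.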

### Changelog

- `metrics.zero_one_loss` (formerly `metrics.zero_one`) now has an option for normalized output that reports the fraction of misclassifications, rather than the raw number of misclassifications. By Kyle Beauchamp.
- `tree.DecisionTreeClassifier` and all derived ensemble models now support sample weighting, by Noel Dawe and Gilles Louppe.
- Speedup improvement when using bootstrap samples in forests of randomized trees, by Peter Prettenhofer and Gilles Louppe.

- Partial dependence plots for gradient-boosted trees in `ensemble.partial_dependence.partial_dependence` by Peter Prettenhofer. See Partial Dependence and Individual Conditional Expectation Plots for an example.
- The table of contents on the website has now been made expandable by Jaques Grobler.

- `feature_selection.SelectPercentile` now breaks ties deterministically instead of returning all equally ranked features.
- `feature_selection.SelectKBest` and `feature_selection.SelectPercentile` are more numerically stable since they use scores, rather than p-values, to rank results. This means that they might sometimes select different features than they did previously.
- Ridge regression and ridge classification fitting with the `sparse_cg` solver no longer has quadratic memory complexity, by Lars Buitinck and Fabian Pedregosa.
- Ridge regression and ridge classification now support a new fast solver called `lsqr`, by Mathieu Blondel.
- Speed up of `metrics.precision_recall_curve` by Conrad Lee.
- Added support for reading/writing svmlight files with pairwise preference attribute (qid in svmlight file format) in `datasets.dump_svmlight_file` and `datasets.load_svmlight_file` by Fabian Pedregosa.
- Faster and more robust `metrics.confusion_matrix` and Clustering performance evaluation by Wei Li.
- `cross_validation.cross_val_score` now works with precomputed kernels and affinity matrices, by Andreas Müller.
- LARS algorithm made more numerically stable with heuristics to drop regressors that are too correlated, as well as to stop the path when numerical noise becomes predominant, by Gael Varoquaux.

- Faster implementation of `metrics.precision_recall_curve` by Conrad Lee.
- New kernel `metrics.chi2_kernel` by Andreas Müller, often used in computer vision applications.
- Fixed a longstanding bug in `naive_bayes.BernoulliNB`, by Shaun Jackman.
- Implemented `predict_proba` in `multiclass.OneVsRestClassifier`, by Andrew Winterman.
- Improved consistency in gradient boosting: estimators `ensemble.GradientBoostingRegressor` and `ensemble.GradientBoostingClassifier` use the estimator `tree.DecisionTreeRegressor` instead of the `tree._tree.Tree` data structure, by Arnaud Joly.
- Fixed a floating point exception in the decision trees module, by Seberg.

- Fixed `metrics.roc_curve` failing when `y_true` has only one class, by Wei Li.
- Added the `metrics.mean_absolute_error` function, which computes the mean absolute error. The `metrics.mean_squared_error`, `metrics.mean_absolute_error` and `metrics.r2_score` metrics support multioutput, by Arnaud Joly.
- Fixed `class_weight` support in `svm.LinearSVC` and `linear_model.LogisticRegression` by Andreas Müller. In earlier releases the meaning of `class_weight` was reversed: a higher weight erroneously meant fewer positives of a given class.
- Improved narrative documentation and consistency in `sklearn.metrics` for regression and classification metrics, by Arnaud Joly.
- Fixed a bug in `sklearn.svm.SVC` when using csr-matrices with unsorted indices, by Xinfan Meng and Andreas Müller.
- `cluster.MiniBatchKMeans`: Added random reassignment of cluster centers with few observations attached to them, by Gael Varoquaux.
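To make the normalized output of `metrics.zero_one_loss` concrete, here is a minimal pure-Python sketch of the two reporting modes. The function name `zero_one_loss_sketch` is hypothetical and the code is an illustration of the described behavior, not the library's implementation. It assumes plain Python sequences of labels.

```python
def zero_one_loss_sketch(y_true, y_pred, normalize=True):
    """Illustrative stand-in for the behavior described above.

    normalize=True  -> fraction of misclassified samples
    normalize=False -> raw number of misclassified samples
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # Count the positions where the prediction differs from the true label.
    n_errors = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    if normalize:
        return n_errors / len(y_true)
    return n_errors


y_true = [1, 0, 1, 1]
y_pred = [1, 1, 1, 0]
fraction = zero_one_loss_sketch(y_true, y_pred)               # 2 of 4 wrong
count = zero_one_loss_sketch(y_true, y_pred, normalize=False)
```

Here two of the four predictions are wrong, so the normalized value is 0.5 and the raw count is 2.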

### API changes summary

- Renamed all occurrences of `n_atoms` to `n_components` for consistency. This applies to `decomposition.DictionaryLearning`, `decomposition.MiniBatchDictionaryLearning`, `decomposition.dict_learning`, `decomposition.dict_learning_online`.
- Renamed all occurrences of `max_iters` to `max_iter` for consistency. This applies to `semi_supervised.LabelPropagation` and `semi_supervised.label_propagation.LabelSpreading`.
- Renamed all occurrences of `learn_rate` to `learning_rate` for consistency in `ensemble.BaseGradientBoosting` and `ensemble.GradientBoostingRegressor`.
- The module `sklearn.linear_model.sparse` is gone. Sparse matrix support was already integrated into the “regular” linear models.
- `sklearn.metrics.mean_square_error`, which incorrectly returned the accumulated error, was removed. Use `metrics.mean_squared_error` instead.
- Passing `class_weight` parameters to `fit` methods is no longer supported. Pass them to estimator constructors instead.
- GMMs no longer have `decode` and `rvs` methods. Use the `score`, `predict` or `sample` methods instead.
- The `solver` fit option in Ridge regression and classification is now deprecated and will be removed in v0.14. Use the constructor option instead.
- `feature_extraction.text.DictVectorizer` now returns sparse matrices in the CSR format, instead of COO.
- Renamed `k` in `cross_validation.KFold` and `cross_validation.StratifiedKFold` to `n_folds`, renamed `n_bootstraps` to `n_iter` in `cross_validation.Bootstrap`.
- Renamed all occurrences of `n_iterations` to `n_iter` for consistency. This applies to `cross_validation.ShuffleSplit`, `cross_validation.StratifiedShuffleSplit`, `utils.extmath.randomized_range_finder` and `utils.extmath.randomized_svd`.
- Replaced `rho` in `linear_model.ElasticNet` and `linear_model.SGDClassifier` by `l1_ratio`. The `rho` parameter had different meanings; `l1_ratio` was introduced to avoid confusion. It has the same meaning as previously `rho` in `linear_model.ElasticNet` and `(1-rho)` in `linear_model.SGDClassifier`.
- `linear_model.LassoLars` and `linear_model.Lars` now store a list of paths in the case of multiple targets, rather than an array of paths.
- The attribute `gmm` of `hmm.GMMHMM` was renamed to `gmm_` to adhere more strictly with the API.
- `cluster.spectral_embedding` was moved to `manifold.spectral_embedding`.
- Renamed `eig_tol` in `manifold.spectral_embedding`, `cluster.SpectralClustering` to `eigen_tol`, renamed `mode` to `eigen_solver`.
- Renamed `mode` in `manifold.spectral_embedding` and `cluster.SpectralClustering` to `eigen_solver`.
- `classes_` and `n_classes_` attributes of `tree.DecisionTreeClassifier` and all derived ensemble models are now flat in case of single output problems and nested in case of multi-output problems.
- The `estimators_` attribute of `ensemble.GradientBoostingRegressor` and `ensemble.GradientBoostingClassifier` is now an array of `tree.DecisionTreeRegressor`.
- Renamed `chunk_size` to `batch_size` in `decomposition.MiniBatchDictionaryLearning` and `decomposition.MiniBatchSparsePCA` for consistency.
- `svm.SVC` and `svm.NuSVC` now provide a `classes_` attribute and support arbitrary dtypes for labels `y`. Also, the dtype returned by `predict` now reflects the dtype of `y` during `fit` (used to be `np.float`).
- Changed default `test_size` in `cross_validation.train_test_split` to None, added possibility to infer `test_size` from `train_size` in `cross_validation.ShuffleSplit` and `cross_validation.StratifiedShuffleSplit`.
- Renamed function `sklearn.metrics.zero_one` to `sklearn.metrics.zero_one_loss`. Be aware that the default behavior in `sklearn.metrics.zero_one_loss` is different from `sklearn.metrics.zero_one`: `normalize=False` is changed to `normalize=True`.
- Renamed function `metrics.zero_one_score` to `metrics.accuracy_score`.
- `datasets.make_circles` now has the same number of inner and outer points.
- In the Naive Bayes classifiers, the `class_prior` parameter was moved from `fit` to `__init__`.
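When migrating code across the `rho` to `l1_ratio` change, the conversion differs by estimator, as noted above. The following sketch encodes that rule in a small helper. The function name `l1_ratio_from_rho` and the string keys are hypothetical and not part of scikit-learn; the arithmetic simply follows the statement that `l1_ratio` equals the old `rho` for `linear_model.ElasticNet` and `(1-rho)` for `linear_model.SGDClassifier`.

```python
def l1_ratio_from_rho(rho, estimator):
    """Translate a pre-0.13 `rho` value into the equivalent `l1_ratio`.

    Hypothetical migration helper; `estimator` is the class name as a string.
    """
    if estimator == "ElasticNet":
        # Same meaning as the old `rho`.
        return rho
    if estimator == "SGDClassifier":
        # The old `rho` had the opposite meaning here.
        return 1.0 - rho
    raise ValueError("no rho conversion known for %r" % (estimator,))


enet_ratio = l1_ratio_from_rho(0.5, "ElasticNet")
sgd_ratio = l1_ratio_from_rho(0.85, "SGDClassifier")
```

The result would then be passed as the `l1_ratio` argument in place of the old `rho` argument. Raising on unknown names is a design choice for the sketch, so that a silent wrong conversion is avoided.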

### People

List of contributors for release 0.13 by number of commits.

364 Andreas Müller

143 Arnaud Joly

131 Gael Varoquaux

117 Mathieu Blondel

108 Lars Buitinck

106 Wei Li

101 Olivier Grisel

65 Vlad Niculae

30 Rob Zinkov

19 Aymeric Masurelle

18 Andrew Winterman

17 Nelle Varoquaux

14 Daniel Nouri

13 syhw

10 Corey Lynch

10 Kyle Beauchamp

9 Brian Cheung

9 Immanuel Bayer

9 mr.Shu

8 Conrad Lee

7 Tadej Janež

6 Brian Cajes

6 Michael

6 Noel Dawe

6 Tiago Nunes

6 cow

5 Anze

5 Shiqiao Du

4 Christian Jauvin

4 Jacques Kvam

4 Richard T. Guy

3 Alexandre Abraham

3 Doug Coleman

3 Scott Dickerson

2 ApproximateIdentity

2 John Benediktsson

2 Mark Veronda

2 Matti Lyra

2 Mikhail Korobov

2 Xinfan Meng

1 Alejandro Weinstein

1 Christoph Deil

1 Eugene Nizhibitsky

1 Kenneth C. Arnold

1 Luis Pedro Coelho

1 Miroslav Batchkarov

1 Pavel

1 Sebastian Berg

1 Shaun Jackman

1 Subhodeep Moitra

1 bob

1 dengemann

1 emanuele

1 x006