sklearn.utils.sparsefuncs.mean_variance_axis

sklearn.utils.sparsefuncs.mean_variance_axis(X, axis, weights=None, return_sum_weights=False)

Compute mean and variance along an axis on a CSR or CSC matrix.

Parameters:
X : sparse matrix of shape (n_samples, n_features)

Input data. It can be of CSR or CSC format.

axis : {0, 1}

Axis along which the mean and variance should be computed.

weights : ndarray of shape (n_samples,) or (n_features,), default=None

If axis is set to 0, the shape is (n_samples,); if axis is set to 1, the shape is (n_features,). If None, samples are equally weighted.

New in version 0.24.

return_sum_weights : bool, default=False

If True, returns the sum of weights seen for each feature if axis=0 or each sample if axis=1.

New in version 0.24.

Returns:
means : ndarray of shape (n_features,), dtype=floating

Feature-wise means.

variances : ndarray of shape (n_features,), dtype=floating

Feature-wise variances.

sum_weights : ndarray of shape (n_features,), dtype=floating

Returned if return_sum_weights is True.
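
As a minimal sketch of how weights and return_sum_weights fit together for axis=0, the snippet below compares the returned values against a dense NumPy computation. The matrix, the weights, and the variable names are illustrative assumptions, not taken from the library documentation. The dense reference assumes the weighted variance is normalized by the sum of the weights and that the input contains no NaN entries.

```python
import numpy as np
from scipy import sparse
from sklearn.utils import sparsefuncs

# Illustrative data (an assumption for this sketch): 4 samples, 3 features.
dense = np.array([[8.0, 1.0, 2.0],
                  [0.0, 0.0, 5.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
csr = sparse.csr_matrix(dense)

# One weight per sample, because axis=0 reduces over the samples.
weights = np.array([1.0, 2.0, 1.0, 4.0])

means, variances, sum_weights = sparsefuncs.mean_variance_axis(
    csr, axis=0, weights=weights, return_sum_weights=True
)

# Dense reference: weighted mean and weighted population variance.
ref_means = np.average(dense, axis=0, weights=weights)
ref_vars = np.average((dense - ref_means) ** 2, axis=0, weights=weights)

print(np.allclose(means, ref_means))
print(np.allclose(variances, ref_vars))
print(sum_weights)
```

With no NaN values in the input, each entry of sum_weights should equal the total of the weights, since every sample contributes to every feature.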

Examples

>>> from sklearn.utils import sparsefuncs
>>> from scipy import sparse
>>> import numpy as np
>>> indptr = np.array([0, 3, 4, 4, 4])
>>> indices = np.array([0, 1, 2, 2])
>>> data = np.array([8, 1, 2, 5])
>>> csr = sparse.csr_matrix((data, indices, indptr))
>>> csr.todense()
matrix([[8, 1, 2],
        [0, 0, 5],
        [0, 0, 0],
        [0, 0, 0]])
>>> sparsefuncs.mean_variance_axis(csr, axis=0)
(array([2.  , 0.25, 1.75]), array([12.    ,  0.1875,  4.1875]))
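
The variances in the example above are population variances (divided by the number of samples, not by one less). A hedged way to check this, and to see what axis=1 returns, is to compare against the dense NumPy equivalents. This is a sketch under the assumption that the input has no NaN entries; the variable names are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.utils import sparsefuncs

# Same values as the example matrix, stored as floats.
dense = np.array([[8.0, 1.0, 2.0],
                  [0.0, 0.0, 5.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
csr = sparse.csr_matrix(dense)

# axis=0 reduces over samples and yields one value per feature.
col_means, col_vars = sparsefuncs.mean_variance_axis(csr, axis=0)

# axis=1 reduces over features and yields one value per sample.
row_means, row_vars = sparsefuncs.mean_variance_axis(csr, axis=1)

# Dense references; ndarray.var defaults to ddof=0 (population variance).
print(np.allclose(col_means, dense.mean(axis=0)))
print(np.allclose(col_vars, dense.var(axis=0)))
print(np.allclose(row_means, dense.mean(axis=1)))
print(np.allclose(row_vars, dense.var(axis=1)))
```

The sparse routine avoids materializing the dense array, which is the reason to prefer it on large, mostly empty matrices; the dense computation here serves only as a reference.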