# sklearn.mixture.DPGMM

Warning

DEPRECATED

class sklearn.mixture.DPGMM(*args, **kwargs)[source]

Dirichlet Process Gaussian Mixture Models

Deprecated since version 0.18: This class will be removed in 0.20. Use sklearn.mixture.BayesianGaussianMixture with parameter weight_concentration_prior_type='dirichlet_process' instead.
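As a rough migration sketch, the replacement named above can be used as follows. The synthetic data, the choice of `n_components=5`, and the variable names are assumptions made for illustration only; they do not come from this page.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Synthetic data (an assumption for illustration): two well-separated 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-5.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(100, 2)),
])

# n_components acts as an upper bound; with the Dirichlet process prior,
# components the data do not need can end up with very small weights.
model = BayesianGaussianMixture(
    n_components=5,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
)

labels = model.fit(X).predict(X)   # hard component assignment per sample
resp = model.predict_proba(X)      # posterior probability per component
weights = model.weights_           # fitted mixture weights
```

Inspecting `weights` after fitting is one way to see how many components are effectively in use.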

Methods

| Method | Description |
| --- | --- |
| aic(X) | Akaike information criterion for the current model fit and the proposed data. |
| bic(X) | Bayesian information criterion for the current model fit and the proposed data. |
| fit(X[, y]) | Estimate model parameters with the EM algorithm. |
| fit_predict(X[, y]) | Fit and then predict labels for data. |
| get_params([deep]) | Get parameters for this estimator. |
| lower_bound(X, z) | Return a lower bound on model evidence based on X and membership. |
| predict(X) | Predict label for data. |
| predict_proba(X) | Predict posterior probability of data under each Gaussian in the model. |
| sample([n_samples, random_state]) | Generate random samples from the model. |
| score(X[, y]) | Compute the log probability under the model. |
| score_samples(X) | Return the likelihood of the data under the model. |
| set_params(**params) | Set the parameters of this estimator. |

__init__(*args, **kwargs)[source]

DEPRECATED: The DPGMM class does not work correctly. Use the sklearn.mixture.BayesianGaussianMixture class with parameter weight_concentration_prior_type='dirichlet_process' instead. DPGMM is deprecated in 0.18 and will be removed in 0.20.

aic(X)[source]

Akaike information criterion for the current model fit and the proposed data.

Parameters: X : array of shape (n_samples, n_dimensions)

Returns: aic : float (the lower the better)
bic(X)[source]

Bayesian information criterion for the current model fit and the proposed data.

Parameters: X : array of shape (n_samples, n_dimensions)

Returns: bic : float (the lower the better)
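For orientation, the two criteria can be sketched with their textbook definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where L is the likelihood of the data, k the number of free parameters, and n the number of samples. This page does not say how DPGMM counts its parameters, so the helper names and the numbers below are hypothetical.

```python
import math


def aic(total_log_likelihood, n_parameters):
    """Textbook AIC: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_parameters - 2.0 * total_log_likelihood


def bic(total_log_likelihood, n_parameters, n_samples):
    """Textbook BIC: k ln n - 2 ln L (lower is better)."""
    return n_parameters * math.log(n_samples) - 2.0 * total_log_likelihood


# Hypothetical values, chosen only to show the arithmetic.
log_l = -120.0   # summed log-likelihood of the data
k = 5            # number of free parameters
n = 100          # number of samples

aic_value = aic(log_l, k)       # 2*5 + 240
bic_value = bic(log_l, k, n)    # 5*ln(100) + 240
```

BIC penalizes each parameter by ln n rather than 2, so it favors smaller models than AIC once n exceeds about 7 samples.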
fit(X, y=None)[source]

Estimate model parameters with the EM algorithm.

An initialization step is performed before entering the expectation-maximization (EM) algorithm. To skip this step, set the keyword argument init_params to the empty string '' when creating the GMM object. Likewise, to perform only the initialization, set n_iter=0.

Parameters: X : array_like, shape (n, n_features) List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns: self
fit_predict(X, y=None)[source]

Fit and then predict labels for data.

Warning: Because the EM algorithm ends with a maximization step, the predicted labels may not be fully accurate when the number of iterations is small.

New in version 0.17: fit_predict method in Gaussian Mixture Model.

Parameters: X : array-like, shape = [n_samples, n_features]

Returns: C : array, shape = (n_samples,) Component memberships.
get_params(deep=True)[source]

Get parameters for this estimator.

Parameters: deep : boolean, optional If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns: params : mapping of string to any Parameter names mapped to their values.
lower_bound(X, z)[source]

Return a lower bound on the model evidence based on X and the membership z.

predict(X)[source]

Predict label for data.

Parameters: X : array-like, shape = [n_samples, n_features]

Returns: C : array, shape = (n_samples,) Component memberships.
predict_proba(X)[source]

Predict posterior probability of data under each Gaussian in the model.

Parameters: X : array-like, shape = [n_samples, n_features]

Returns: responsibilities : array-like, shape = (n_samples, n_components) The probability of the sample for each Gaussian (state) in the model.
sample(n_samples=1, random_state=None)[source]

Generate random samples from the model.

Parameters: n_samples : int, optional Number of samples to generate. Defaults to 1.

Returns: X : array_like, shape (n_samples, n_features) List of samples.
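Drawing from a Gaussian mixture is usually done in two stages: pick a component according to the mixture weights, then draw from that component's Gaussian. The following is a minimal one-dimensional sketch of that idea in plain Python; `sample_mixture` and its arguments are hypothetical names and not part of the DPGMM API.

```python
import random


def sample_mixture(weights, means, stds, n_samples=1, seed=None):
    """Draw n_samples values from a 1-D Gaussian mixture."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Stage 1: choose a component with probability given by its weight.
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        # Stage 2: draw from the chosen component's Gaussian.
        samples.append(rng.gauss(means[k], stds[k]))
    return samples


# Illustrative mixture: 30% of the mass near -5, 70% near +5.
draws = sample_mixture([0.3, 0.7], [-5.0, 5.0], [0.5, 0.5], n_samples=4, seed=0)
```

When migrating, note that `BayesianGaussianMixture.sample` returns a tuple of the samples and their component labels and takes no `random_state` argument.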
score(X, y=None)[source]

Compute the log probability under the model.

Parameters: X : array_like, shape (n_samples, n_features) List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns: logprob : array_like, shape (n_samples,) Log probabilities of each data point in X.
score_samples(X)[source]

Return the likelihood of the data under the model.

Compute the bound on log probability of X under the model and return the posterior distribution (responsibilities) of each mixture component for each element of X.

This is done by computing the parameters for the mean-field of z for each observation.

Parameters: X : array_like, shape (n_samples, n_features) List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns: logprob : array_like, shape (n_samples,) Log probabilities of each data point in X.

responsibilities : array_like, shape (n_samples, n_components) Posterior probabilities of each mixture component for each observation.
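To make the two return values concrete, here is a small sketch for an ordinary one-dimensional Gaussian mixture. It computes the exact log probability and posterior rather than the variational bound and mean-field parameters that DPGMM uses, so treat it as an analogy for the output shapes and their meaning; `score_samples_1d` and `log_gaussian` are hypothetical helpers.

```python
import math


def log_gaussian(x, mean, std):
    """Log density of a 1-D Gaussian at x."""
    return (-0.5 * math.log(2.0 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2.0 * std ** 2))


def score_samples_1d(xs, weights, means, stds):
    """Return (logprob, responsibilities) for each point in xs."""
    logprob, responsibilities = [], []
    for x in xs:
        # Log of weight * density for every component.
        joint = [math.log(w) + log_gaussian(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
        # Log-sum-exp, shifted by the maximum for numerical stability.
        top = max(joint)
        log_total = top + math.log(sum(math.exp(j - top) for j in joint))
        logprob.append(log_total)
        # Posterior probability of each component given x.
        responsibilities.append([math.exp(j - log_total) for j in joint])
    return logprob, responsibilities


logprob, resp = score_samples_1d(
    [-5.0, 0.0, 5.0], weights=[0.5, 0.5], means=[-5.0, 5.0], stds=[1.0, 1.0]
)
# A hard label, as returned by predict, is the most responsible component.
labels = [row.index(max(row)) for row in resp]
```

Each row of `resp` sums to 1, and a point halfway between two equally weighted, equally wide components is split evenly between them.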
set_params(**params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns: self
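The `<component>__<parameter>` form can be sketched with a small pipeline. Because DPGMM itself is deprecated, this example uses BayesianGaussianMixture as the final step; the step names `scale` and `mix` are arbitrary labels chosen for illustration.

```python
from sklearn.mixture import BayesianGaussianMixture
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: each step is addressed by the name given here.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("mix", BayesianGaussianMixture(n_components=5)),
])

# Update parameters of individual steps; set_params returns the estimator.
returned = pipe.set_params(mix__n_components=3, scale__with_mean=False)

# With deep=True, parameters of contained estimators are included
# under the same <component>__<parameter> keys.
params = pipe.get_params(deep=True)
```

Since `set_params` returns the estimator itself, calls can be chained, for example `pipe.set_params(...).fit(X)`.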