sklearn.linear_model.TweedieRegressor

class sklearn.linear_model.TweedieRegressor(*, power=0.0, alpha=1.0, fit_intercept=True, link='auto', max_iter=100, tol=0.0001, warm_start=False, verbose=0)[source]

Generalized Linear Model with a Tweedie distribution.

This estimator can be used to model different GLMs depending on the power parameter, which determines the underlying distribution.

Read more in the User Guide.

New in version 0.23.

Parameters
power : float, default=0

The power determines the underlying target distribution according to the following table:

Power    Distribution
-----    ------------
0        Normal
1        Poisson
(1,2)    Compound Poisson Gamma
2        Gamma
3        Inverse Gaussian

For 0 < power < 1, no distribution exists.

alpha : float, default=1

Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).

fit_intercept : bool, default=True

Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

link : {'auto', 'identity', 'log'}, default='auto'

The link function of the GLM, i.e. the mapping from the linear predictor X @ coef + intercept to the prediction y_pred. Option 'auto' sets the link depending on the chosen family as follows:

  • 'identity' for Normal distribution

  • 'log' for Poisson, Gamma and Inverse Gaussian distributions

max_iter : int, default=100

The maximal number of iterations for the solver.

tol : float, default=1e-4

Stopping criterion. For the lbfgs solver, the iteration will stop when max{|g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

warm_start : bool, default=False

If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.

verbose : int, default=0

For the lbfgs solver, set verbose to any positive number to enable verbose output.
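The effect of alpha and warm_start described above can be illustrated with a minimal sketch. The synthetic Gamma-distributed data, the alpha values and the variable names are assumptions made for illustration only; they are not part of the library documentation.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3))
# Hypothetical strictly positive targets with a log-linear mean and Gamma noise.
mu = np.exp(0.5 + X @ np.array([1.0, -0.5, 0.25]))
y = rng.gamma(shape=2.0, scale=mu / 2.0)

# power=2 selects the Gamma distribution; a larger alpha penalizes coef_ more strongly.
weak = TweedieRegressor(power=2, alpha=0.01, max_iter=1000).fit(X, y)
strong = TweedieRegressor(power=2, alpha=10.0, max_iter=1000).fit(X, y)
weak_norm = np.linalg.norm(weak.coef_)
strong_norm = np.linalg.norm(strong.coef_)
print(weak_norm, strong_norm)

# With warm_start=True each fit starts from the previous coef_ and intercept_,
# which is convenient when stepping through a sequence of alpha values.
path = TweedieRegressor(power=2, warm_start=True, max_iter=1000)
for alpha in (10.0, 1.0, 0.1):
    path.set_params(alpha=alpha)
    path.fit(X, y)
    print(alpha, path.n_iter_, path.coef_)
```

The stronger penalty should shrink the coefficient vector towards zero, while the intercept is fitted separately from coef_.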

Attributes
coef_ : array of shape (n_features,)

Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.

intercept_ : float

Intercept (a.k.a. bias) added to linear predictor.

n_iter_ : int

Actual number of iterations used in the solver.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

See also

PoissonRegressor

Generalized Linear Model with a Poisson distribution.

GammaRegressor

Generalized Linear Model with a Gamma distribution.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.TweedieRegressor()
>>> X = [[1, 2], [2, 3], [3, 4], [4, 3]]
>>> y = [2, 3.5, 5, 5.5]
>>> clf.fit(X, y)
TweedieRegressor()
>>> clf.score(X, y)
0.839...
>>> clf.coef_
array([0.599..., 0.299...])
>>> clf.intercept_
1.600...
>>> clf.predict([[1, 1], [3, 4]])
array([2.500..., 4.599...])
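As a further sketch of how power and link interact, the following fits a Poisson model (power=1) and a Normal model (power=0) and checks the predictions against the linear predictor. The simulated data and variable names are illustrative assumptions, not part of the original example.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.RandomState(42)
X = rng.uniform(size=(300, 2))
# Hypothetical count targets with a log-linear mean.
y = rng.poisson(np.exp(0.3 + X @ np.array([0.8, -0.4]))).astype(float)

# power=1 is the Poisson distribution, so link='auto' resolves to the log link.
poisson = TweedieRegressor(power=1, alpha=0.1, link="auto", max_iter=1000).fit(X, y)
# The inverse of the log link is exp, applied to the linear predictor.
manual_log = np.exp(X @ poisson.coef_ + poisson.intercept_)
log_matches = np.allclose(poisson.predict(X), manual_log)

# power=0 is the Normal distribution, so link='auto' resolves to the identity link.
normal = TweedieRegressor(power=0, alpha=0.1, link="auto").fit(X, y)
manual_identity = X @ normal.coef_ + normal.intercept_
identity_matches = np.allclose(normal.predict(X), manual_identity)

print(log_matches, identity_matches)
```

With the log link the predictions are always strictly positive, which is why it suits count and positive continuous targets.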

Methods

fit(X, y[, sample_weight])      Fit a Generalized Linear Model.
get_params([deep])              Get parameters for this estimator.
predict(X)                      Predict using GLM with feature matrix X.
score(X, y[, sample_weight])    Compute D^2, the percentage of deviance explained.
set_params(**params)            Set the parameters of this estimator.

property family

Return the family of the regressor.

fit(X, y, sample_weight=None)[source]

Fit a Generalized Linear Model.

Parameters
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns
self : object

Fitted model.
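A minimal sketch of how sample_weight can be used in fit, assuming that the objective averages the per-sample deviance by weight, so that giving one sample a weight of 3 behaves like repeating that sample three times. The data, the tight tol and the variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.RandomState(1)
X = rng.uniform(size=(50, 2))
# Hypothetical count targets.
y = rng.poisson(np.exp(0.5 + X @ np.array([0.7, -0.3]))).astype(float)

# Give the first sample three times the weight of the others.
w = np.ones(len(y))
w[0] = 3.0
weighted = TweedieRegressor(power=1, alpha=0.1, tol=1e-8, max_iter=1000)
weighted.fit(X, y, sample_weight=w)

# Unweighted fit on data in which the first sample appears three times.
X_rep = np.vstack([X, X[[0, 0]]])
y_rep = np.concatenate([y, y[[0, 0]]])
repeated = TweedieRegressor(power=1, alpha=0.1, tol=1e-8, max_iter=1000)
repeated.fit(X_rep, y_rep)

coef_gap = np.max(np.abs(weighted.coef_ - repeated.coef_))
intercept_gap = abs(weighted.intercept_ - repeated.intercept_)
print(coef_gap, intercept_gap)
```

The two fits should agree up to solver tolerance, so both gaps are expected to be small rather than exactly zero.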

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.

predict(X)[source]

Predict using GLM with feature matrix X.

Parameters
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Samples.

Returns
y_pred : array of shape (n_samples,)

Returns predicted values.

score(X, y, sample_weight=None)[source]

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2: R^2 is based on squared error, whereas D^2 is based on deviance. The two are equal for the Normal distribution (power=0).

D^2 is defined as \(D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}\), where \(D_{null}\) is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to \(y_{pred} = \bar{y}\). The mean \(\bar{y}\) is weighted by sample_weight. The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than the intercept-only model.

Parameters
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,)

True values of target.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns
score : float

D^2 of self.predict(X) w.r.t. y.
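The definition of D^2 given for score can be checked by hand. The sketch below recomputes it with sklearn.metrics.mean_tweedie_deviance, using the same power as the estimator; the simulated data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor
from sklearn.metrics import mean_tweedie_deviance

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 2))
# Hypothetical count targets.
y = rng.poisson(np.exp(0.2 + X @ np.array([1.0, 0.5]))).astype(float)

power = 1
model = TweedieRegressor(power=power, alpha=0.1, max_iter=1000).fit(X, y)
y_pred = model.predict(X)

# Deviance of the fitted model and of the intercept-only model (y_pred = mean of y).
dev = mean_tweedie_deviance(y, y_pred, power=power)
dev_null = mean_tweedie_deviance(y, np.full_like(y, y.mean()), power=power)

d2_manual = 1.0 - dev / dev_null
d2_score = model.score(X, y)
print(d2_manual, d2_score)
```

Both quantities use mean deviances here; the factor 1 / n_samples cancels in the ratio, so the result matches the definition based on total deviance.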

set_params(**params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.
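A short sketch of the <component>__<parameter> form mentioned above, assuming a pipeline built with make_pipeline, which names each step after its lowercased class name. The scaler and the parameter values are illustrative choices only.

```python
from sklearn.linear_model import TweedieRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Direct use on a simple estimator; set_params returns the estimator itself.
direct = TweedieRegressor().set_params(power=2, link="log")

# Nested use: the step created by make_pipeline is named 'tweedieregressor'.
pipe = make_pipeline(StandardScaler(), TweedieRegressor())
pipe.set_params(tweedieregressor__power=1.5, tweedieregressor__alpha=0.5)

reg = pipe.named_steps["tweedieregressor"]
print(reg.power, reg.alpha, direct.power, direct.link)
```

The same nested names can be used as keys in a parameter grid for model selection tools such as GridSearchCV.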
