`sklearn.linear_model`.TweedieRegressor

class sklearn.linear_model.TweedieRegressor(*, power=0.0, alpha=1.0, fit_intercept=True, link='auto', max_iter=100, tol=0.0001, warm_start=False, verbose=0)

Generalized Linear Model with a Tweedie distribution.

This estimator can be used to model different GLMs depending on the power parameter, which determines the underlying distribution.

Read more in the User Guide.

New in version 0.23.

Parameters

power : float, default=0

The power determines the underlying target distribution according to the following table:

Power	Distribution
0	Normal
1	Poisson
(1,2)	Compound Poisson Gamma
2	Gamma
3	Inverse Gaussian

For 0 < power < 1, no distribution exists.
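As a minimal sketch of how power selects the distribution, the following fits three models to the same synthetic count data. The data-generating coefficients, the random seed, and max_iter=1000 are illustrative assumptions, not recommendations. Powers 2 and 3 are left out because the Gamma and Inverse Gaussian distributions need strictly positive targets, and count data can contain zeros.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

# Synthetic, non-negative count targets (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = rng.poisson(lam=np.exp(0.5 + X[:, 0] - 0.5 * X[:, 1]))

# power=0: Normal, power=1: Poisson, power=1.5: Compound Poisson Gamma.
models = {
    power: TweedieRegressor(power=power, alpha=0.0, max_iter=1000).fit(X, y)
    for power in (0, 1, 1.5)
}

for power, model in models.items():
    print(power, model.coef_, model.intercept_)
```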

alpha : float, default=1

Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).
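The effect of alpha can be sketched by comparing an unpenalized fit with a strongly penalized one on the toy data from the Examples section. The value alpha=10.0 and max_iter=1000 are arbitrary choices for illustration; the comparison assumes the penalty shrinks the coefficient vector toward zero.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = [[1, 2], [2, 3], [3, 4], [4, 3]]
y = [2, 3.5, 5, 5.5]

# alpha=0 disables the penalty; X has full column rank here.
unpenalized = TweedieRegressor(alpha=0.0, max_iter=1000).fit(X, y)
# A large alpha (illustrative value) regularizes strongly.
penalized = TweedieRegressor(alpha=10.0, max_iter=1000).fit(X, y)

norm_unpenalized = np.linalg.norm(unpenalized.coef_)
norm_penalized = np.linalg.norm(penalized.coef_)
print(norm_unpenalized, norm_penalized)
```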

fit_intercept : bool, default=True

Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

link : {‘auto’, ‘identity’, ‘log’}, default=‘auto’

The link function of the GLM, i.e. the mapping from the linear predictor X @ coef + intercept to the prediction y_pred. Option ‘auto’ sets the link depending on the chosen family as follows:

‘identity’ for Normal distribution
‘log’ for Poisson, Gamma and Inverse Gaussian distributions
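A small sketch of the ‘auto’ rule for the Normal case: with power=0, an explicit ‘identity’ link should give the same fit as ‘auto’, while an explicit ‘log’ link forces strictly positive predictions. The toy data is reused from the Examples section and the variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = [[1, 2], [2, 3], [3, 4], [4, 3]]
y = [2, 3.5, 5, 5.5]

# power=0 is the Normal distribution, so 'auto' resolves to 'identity'.
auto_model = TweedieRegressor(power=0, link="auto").fit(X, y)
identity_model = TweedieRegressor(power=0, link="identity").fit(X, y)
# Overriding the default: Normal distribution with a log link.
log_model = TweedieRegressor(power=0, link="log").fit(X, y)

same_fit = np.allclose(auto_model.coef_, identity_model.coef_)
log_predictions = log_model.predict(X)
print(same_fit, log_predictions)
```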

max_iter : int, default=100

The maximum number of iterations for the solver.

tol : float, default=1e-4

Stopping criterion. For the lbfgs solver, the iteration will stop when max{|g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

warm_start : bool, default=False

If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.
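One possible use of warm_start, sketched under assumptions: refitting the same estimator along a decreasing sequence of alpha values so that each fit starts from the previous solution. The synthetic data, the alpha grid, and max_iter=1000 are illustrative.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

# Synthetic count data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 3))
y = rng.poisson(lam=np.exp(X @ np.array([0.3, -0.2, 0.5])))

reg = TweedieRegressor(power=1, warm_start=True, max_iter=1000)
coef_path = []
for alpha in (1.0, 0.1, 0.01):
    reg.set_params(alpha=alpha)
    # After the first pass, fit starts from the previous coef_ and intercept_.
    reg.fit(X, y)
    coef_path.append(reg.coef_.copy())

print(np.array(coef_path))
```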

verbose : int, default=0

For the lbfgs solver, set verbose to any positive number to enable verbose output.

Attributes

coef_ : array of shape (n_features,)

Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.

intercept_ : float

Intercept (a.k.a. bias) added to linear predictor.

n_iter_ : int

Actual number of iterations used in the solver.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

See also

PoissonRegressor: Generalized Linear Model with a Poisson distribution.
GammaRegressor: Generalized Linear Model with a Gamma distribution.

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.TweedieRegressor()
>>> X = [[1, 2], [2, 3], [3, 4], [4, 3]]
>>> y = [2, 3.5, 5, 5.5]
>>> clf.fit(X, y)
TweedieRegressor()
>>> clf.score(X, y)
0.839...
>>> clf.coef_
array([0.599..., 0.299...])
>>> clf.intercept_
1.600...
>>> clf.predict([[1, 1], [3, 4]])
array([2.500..., 4.599...])

Methods

`fit`(X, y[, sample_weight])	Fit a Generalized Linear Model.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X)	Predict using GLM with feature matrix X.
`score`(X, y[, sample_weight])	Compute D^2, the percentage of deviance explained.
`set_params`(**params)	Set the parameters of this estimator.

property family: Return the family of the regressor.

fit(X, y, sample_weight=None)

Fit a Generalized Linear Model.

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns

self : object

Fitted model.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : dict

Parameter names mapped to their values.

predict(X)

Predict using GLM with feature matrix X.

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Samples.

Returns

y_pred : array of shape (n_samples,)

Returns predicted values.
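To illustrate what predict computes, the sketch below rebuilds the prediction by hand for a log link: the inverse link (here the exponential) is applied to the linear predictor X @ coef_ + intercept_. The toy data and the choice power=1 are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]])
y = np.array([2, 3.5, 5, 5.5])

reg = TweedieRegressor(power=1, link="log").fit(X, y)

# Linear predictor, then the inverse of the log link.
linear_predictor = X @ reg.coef_ + reg.intercept_
manual_prediction = np.exp(linear_predictor)

matches = np.allclose(reg.predict(X), manual_prediction)
print(matches)
```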

score(X, y, sample_weight=None)

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error, whereas D^2 uses deviance. Note that those two are equal for family='normal'.

D^2 is defined as \(D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}\), where \(D_{null}\) is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to \(y_{pred} = \bar{y}\). The mean \(\bar{y}\) is weighted by sample_weight. The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse).
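The definition can be checked by hand. This sketch computes the deviance of the fitted model and of the intercept-only model with sklearn.metrics.mean_tweedie_deviance and compares the resulting ratio with score. The value power=1.5 and the toy data are illustrative assumptions; no sample weights are used.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor
from sklearn.metrics import mean_tweedie_deviance

X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]])
y = np.array([2, 3.5, 5, 5.5])
power = 1.5  # illustrative choice

reg = TweedieRegressor(power=power).fit(X, y)
y_pred = reg.predict(X)

# D(y_true, y_pred): deviance of the fitted model.
deviance = mean_tweedie_deviance(y, y_pred, power=power)
# D_null: deviance of the intercept-only model, y_pred = mean(y).
y_null = np.full_like(y, y.mean())
deviance_null = mean_tweedie_deviance(y, y_null, power=power)

d2_manual = 1 - deviance / deviance_null
print(d2_manual, reg.score(X, y))
```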

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,)

True values of target.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns

score : float

D^2 of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params : dict

Estimator parameters.

Returns

self : estimator instance

Estimator instance.
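The <component>__<parameter> form can be sketched with a two-step Pipeline. The step names "scale" and "glm" are hypothetical labels chosen for this example, as are the parameter values.

```python
from sklearn.linear_model import TweedieRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step names "scale" and "glm" are arbitrary labels for this sketch.
pipe = Pipeline([("scale", StandardScaler()), ("glm", TweedieRegressor())])

# Update the nested TweedieRegressor through the pipeline.
pipe.set_params(glm__power=1.5, glm__alpha=0.1)

params = pipe.get_params()
print(params["glm__power"], params["glm__alpha"])
```

The same nested names are what get_params(deep=True) returns, which is why tools that search over parameters can address a single step inside a pipeline.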
