Families

Families define the distribution and link functions for GAM models. Each family corresponds to a statistical distribution and supports specific link functions.

Note

These families are thin wrappers around R families, for example from mgcv or the stats package. The documentation here is more concise than that in the corresponding R packages. If useful information is missing, please consider contributing by making a pull request!

Continuous response

Gaussian

Gaussian(link: Literal['identity', 'log', 'inverse'] = 'identity')

Gaussian family with specified link function.

Parameters:

link (Literal['identity', 'log', 'inverse'], default: 'identity' ) –

The link function.

cdf

cdf(x: ndarray, *, mu: ndarray, wt: ndarray, scale: ndarray) -> ndarray

Gaussian CDF.

Gamma

Gamma(link: Literal['inverse', 'identity', 'log'] = 'inverse')

Gamma family with specified link function.

Parameters:

link (Literal['inverse', 'identity', 'log'], default: 'inverse' ) –

The link function for the Gamma family.

cdf

cdf(x: ndarray, *, mu: ndarray, wt: ndarray, scale: ndarray)

Gamma CDF, wt is ignored.

InverseGaussian

InverseGaussian(link: Literal['1/mu^2', 'inverse', 'identity', 'log'] = '1/mu^2')

Inverse Gaussian family with specified link function.

Parameters:

link (Literal['1/mu^2', 'inverse', 'identity', 'log'], default: '1/mu^2' ) –

The link function for the inverse Gaussian family.

Tweedie

Tweedie(
    p: float | int,
    link: Union[Literal["log", "identity", "inverse", "sqrt"], int, float] = 0,
)

Tweedie family with fixed power.

Parameters:

p (float | int) –

The variance of an observation is proportional to its mean to the power p. p must be greater than 1 and less than or equal to 2. 1 would be Poisson, 2 is gamma.
link (Union[Literal['log', 'identity', 'inverse', 'sqrt'], int, float], default: 0 ) –

If a float/int, treated as $\lambda$ in a link function based on $\eta = \mu^\lambda$, meaning 0 gives the log link and 1 gives the identity link (i.e. R stats package power). Can also be one of "log", "identity", "inverse", "sqrt".
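
The numeric link and the variance power described above can be sketched in plain Python. This is only an illustration of the formulas, not the library's implementation; `power_link` and `tweedie_variance` are hypothetical helper names, and the `scale` argument is an assumed proportionality constant.

```python
import math


def power_link(mu: float, lam: float) -> float:
    """Illustrative power link: eta = mu**lam, with lam == 0 read as log(mu)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return math.log(mu) if lam == 0 else mu**lam


def tweedie_variance(mu: float, p: float, scale: float = 1.0) -> float:
    """Variance proportional to mu**p, restricted to 1 < p <= 2."""
    if not 1 < p <= 2:
        raise ValueError("p must satisfy 1 < p <= 2")
    return scale * mu**p
```

With this convention, `power_link(mu, 0)` is the log link and `power_link(mu, 1)` is the identity link.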

Tw

Tw(
    link: Literal["log", "identity", "inverse", "sqrt"] = "log",
    a: float = 1.01,
    b: float = 1.99,
    theta: float | int | None = None,
    *,
    theta_fixed: bool = False,
)

Tweedie family with estimated power.

Restricted to variance function powers between 1 and 2.

Parameters:

link (Literal['log', 'identity', 'inverse', 'sqrt'], default: 'log' ) –

The link function to use.
a (float, default: 1.01 ) –

The lower bound of the power parameter for optimization.
b (float, default: 1.99 ) –

The upper bound of the power parameter for optimization.
theta (float | int | None, default: None ) –

Related to the Tweedie power parameter by $p=(a+b \exp(\theta))/(1+\exp(\theta))$. If this is supplied as a positive value then it is taken as the fixed value for p. If it is a negative value then its absolute value is taken as the initial value for p.
theta_fixed (bool, default: False ) –

If theta is provided, controls whether to treat theta as fixed or estimated. If estimated, then theta is the starting value.
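
The mapping between the unconstrained $\theta$ and the power $p$ can be written out directly. The sketch below only evaluates the formula given for theta and its algebraic inverse; `power_from_theta` and `theta_from_power` are hypothetical names, not part of the documented API.

```python
import math


def power_from_theta(theta: float, a: float = 1.01, b: float = 1.99) -> float:
    """p = (a + b * exp(theta)) / (1 + exp(theta)), which lies strictly between a and b."""
    e = math.exp(theta)
    return (a + b * e) / (1.0 + e)


def theta_from_power(p: float, a: float = 1.01, b: float = 1.99) -> float:
    """Algebraic inverse of the mapping above: theta = log((p - a) / (b - p))."""
    if not a < p < b:
        raise ValueError("p must lie strictly between a and b")
    return math.log((p - a) / (b - p))
```

Because the mapping is a rescaled logistic curve, any real $\theta$ gives a power inside the interval $(a, b)$, which is what lets the optimizer work on an unconstrained scale.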

Scat

Scat(
    link: Literal["identity", "log", "inverse"] = "identity",
    min_df: float | int = 3,
    theta: ndarray | None = None,
    *,
    theta_fixed: bool = False,
)

Scaled t family for heavy tailed data.

The idea is that $(y-\mu)/\sigma \sim t_\nu$ where $\mu$ is determined by a linear predictor, while $\sigma$ and $\nu$ are parameters to be estimated alongside the smoothing parameters.

Parameters:

link (Literal['identity', 'log', 'inverse'], default: 'identity' ) –

The link function to use.
min_df (float | int, default: 3 ) –

The minimum degrees of freedom. Must be >2 to avoid infinite response variance.
theta (ndarray | None, default: None ) –

The parameters to be estimated $\nu = b + \exp(\theta_1)$ (where $b$ is min_df) and $\sigma = \exp(\theta_2)$. If supplied and both positive, then taken to be fixed values of $\nu$ and $\sigma$. If any negative, then absolute values taken as starting values.
theta_fixed (bool, default: False ) –

If theta is provided, controls whether to treat theta as fixed or estimated. If estimated, then theta is the starting value.
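
As a rough illustration of the parameterization above, the following sketch maps $\theta$ to $\nu$ and $\sigma$ and evaluates the implied response variance, using the standard result that a $t_\nu$ variable has variance $\nu/(\nu-2)$ for $\nu > 2$. The function names are hypothetical.

```python
import math


def scat_parameters(theta: tuple[float, float], min_df: float = 3.0) -> tuple[float, float]:
    """Map (theta_1, theta_2) to (nu, sigma) via nu = min_df + exp(theta_1), sigma = exp(theta_2)."""
    theta1, theta2 = theta
    return min_df + math.exp(theta1), math.exp(theta2)


def scat_response_variance(nu: float, sigma: float) -> float:
    """Variance of mu + sigma * t_nu, which is finite only when nu > 2."""
    if nu <= 2:
        raise ValueError("variance is infinite unless nu > 2")
    return sigma**2 * nu / (nu - 2.0)
```

Since $\exp(\theta_1) > 0$, choosing `min_df` above 2 keeps $\nu > 2$ for every $\theta_1$, so the variance stays finite.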

MVN

MVN(d: int)

Multivariate normal family.

For this family, we expect $d$ linear predictors for the means, each with a key corresponding to a variable name in data. The covariance is estimated during fitting. For this family, deviance residuals are standardized to be approximately independent standard normal.

Parameters:

d (int) –

The dimension of the distribution.

Count and proportions

Poisson

Poisson(link: Literal['log', 'identity', 'sqrt'] = 'log')

Poisson family with specified link function.

Parameters:

link (Literal['log', 'identity', 'sqrt'], default: 'log' ) –

The link function for the Poisson family.

cdf

cdf(
    x: ndarray, *, mu: ndarray, wt: ndarray | None = None, scale: ndarray | None = None
)

Cumulative distribution function.

NegativeBinomial

NegativeBinomial(
    theta: float | int | None = None,
    link: Literal["log", "identity", "sqrt"] = "log",
    *,
    theta_fixed: bool = False,
)

Negative binomial family.

Parameters:

theta (float | int | None, default: None ) –

The positive parameter such that $\text{var}(y) = \mu + \mu^2/\theta$, where $\mu = \mathbb{E}[y]$.
link (Literal['log', 'identity', 'sqrt'], default: 'log' ) –

The link function to use.
theta_fixed (bool, default: False ) –

Whether to treat theta as fixed or estimated. If estimated, then theta is the starting value.
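
To make the role of theta concrete, the variance relationship above can be evaluated directly. This is a minimal sketch of the formula only; `negbin_variance` is a hypothetical helper name.

```python
def negbin_variance(mu: float, theta: float) -> float:
    """var(y) = mu + mu**2 / theta, for a positive theta."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return mu + mu**2 / theta
```

Small values of theta give strong overdispersion, while very large values bring the variance back towards the Poisson value `mu`.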

ZIP

ZIP(b: int | float = 0, theta: tuple[int | float, int | float] | None = None)

Zero-inflated Poisson family.

The probability of a zero count is given by $1-p$, whereas the probability of count $y>0$ is given by the truncated Poisson probability function $p\mu^y/((\exp(\mu)-1)y!)$. The linear predictor gives $\log \mu$, while $\eta = \log(-\log(1-p))$ and $\eta = \theta_1 + \{b+\exp(\theta_2)\} \log \mu$. The theta parameters are estimated alongside the smoothing parameters. Increasing the b parameter from zero can greatly reduce identifiability problems, particularly when there are very few non-zero data.

The fitted values for this model are the log of the Poisson parameter. Use the predict function with type="response" to get the predicted expected response. Note that the theta parameters reported in model summaries are $\theta_1$ and $b + \exp(\theta_2)$.

Warning

These models should be subject to very careful checking, especially if fitting has not converged. It is quite easy to set up models with identifiability problems, particularly if the data are not really zero inflated, but simply have many zeroes because the mean is very low in some parts of the covariate space.

Parameters:

b (int | float, default: 0 ) –

A non-negative constant, specifying the minimum dependence of the zero inflation rate on the linear predictor.
theta (tuple[int | float, int | float] | None, default: None ) –

The 2 parameters controlling the slope and intercept of the linear transform of the mean controlling the zero inflation rate. If supplied then treated as fixed parameters ($\theta_1$ and $\theta_2$), otherwise estimated.
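
The probability function described for this family can be sketched as follows. It assumes the relations stated above, namely $\eta = \theta_1 + \{b + \exp(\theta_2)\}\log\mu$ and $\eta = \log(-\log(1-p))$; `zip_pmf` is a hypothetical name and this is not the library's implementation.

```python
import math


def zip_pmf(y: int, mu: float, theta: tuple[float, float], b: float = 0.0) -> float:
    """Zero-inflated Poisson probability of count y for Poisson parameter mu > 0."""
    theta1, theta2 = theta
    eta = theta1 + (b + math.exp(theta2)) * math.log(mu)
    # Invert eta = log(-log(1 - p)) to recover p.
    p = 1.0 - math.exp(-math.exp(eta))
    if y == 0:
        return 1.0 - p
    # Zero-truncated Poisson probability, scaled by p.
    return p * mu**y / (math.expm1(mu) * math.factorial(y))
```

Summing `zip_pmf` over all counts gives one, because the truncated Poisson terms sum to $p$ and the zero count takes the remaining $1-p$.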

Betar

Betar(
    phi: float | int,
    link: Literal["logit", "probit", "cauchit", "cloglog"] = "logit",
    eps: float = 1e-10,
)

Beta regression family for use with GAM/BAM.

The linear predictor controls the mean $\mu$, and the variance is given by $\mu(1-\mu)/(1+\phi)$. Note, any observations too close to zero or one will be clipped to eps and 1-eps respectively, to ensure the log likelihood is bounded for all parameter values.

Parameters:

phi (float | int) –

The parameter $\phi$, influencing the variance.
link (Literal['logit', 'probit', 'cauchit', 'cloglog'], default: 'logit' ) –

The link function to use.
eps (float, default: 1e-10 ) –

Amount to clip values too close to zero or one.
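
The clipping rule and the variance formula can be illustrated with two small helpers. The names `clip_response` and `betar_variance` are hypothetical, and the sketch only mirrors the description above rather than the underlying R code.

```python
def clip_response(y: float, eps: float = 1e-10) -> float:
    """Clip a response to the closed interval [eps, 1 - eps]."""
    return min(max(y, eps), 1.0 - eps)


def betar_variance(mu: float, phi: float) -> float:
    """Variance mu * (1 - mu) / (1 + phi) for a mean mu in (0, 1)."""
    return mu * (1.0 - mu) / (1.0 + phi)
```

Larger values of `phi` give a smaller variance for the same mean, so `phi` acts as a precision-like parameter.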

Categorical and ordinal

Binomial

Binomial(link: Literal['logit', 'probit', 'cauchit', 'log', 'cloglog'] = 'logit')

Binomial family with specified link function.

The response can be integer zeros and ones (for binary data), proportions between zero and one (in which case the count can be incorporated as a weight), or a two-column matrix with the success and failure counts.

Parameters:

link (Literal['logit', 'probit', 'cauchit', 'log', 'cloglog'], default: 'logit' ) –

The link function. "logit", "probit" and "cauchit", correspond to logistic, normal and Cauchy CDFs respectively. "cloglog" is the complementary log-log.
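
The correspondence between link names and CDFs can be made concrete by writing out the inverse links, which map a linear predictor $\eta$ to a mean. This is an illustrative sketch using only the standard library; `binomial_inverse_link` is a hypothetical name and is not the family's own method.

```python
import math


def binomial_inverse_link(eta: float, link: str = "logit") -> float:
    """Map a linear predictor eta to a mean mu under the named link."""
    if link == "logit":  # logistic CDF
        return 1.0 / (1.0 + math.exp(-eta))
    if link == "probit":  # standard normal CDF
        return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if link == "cauchit":  # standard Cauchy CDF
        return 0.5 + math.atan(eta) / math.pi
    if link == "cloglog":  # complementary log-log: eta = log(-log(1 - mu))
        return -math.expm1(-math.exp(eta))
    if link == "log":  # only a valid probability when eta <= 0
        return math.exp(eta)
    raise ValueError(f"unknown link: {link}")
```

The three CDF-based links all give a mean of 0.5 at $\eta = 0$, whereas the complementary log-log link is asymmetric around that point.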

cdf

cdf(x: ndarray, *, mu: ndarray, wt: ndarray, scale: ndarray)

Binomial CDF, scale is ignored.

OCat

OCat(num_categories: int)

Ordered categorical family.

The response should be integer class labels, indexed from 1 (not a pandas ordered Categorical)!

Parameters:

num_categories (int) –

The number of categories.

Multinom

Multinom(k: int = 1)

Multinomial family.

Categories must be coded as integers from 0 to k. This family can only be used with GAM. k predictors should be specified, with the first key matching the target variable's name in the data. For the 0-th index, i.e. y=0, the likelihood is $1 / [1+\sum_j \exp(\eta_j)]$, where $\eta_j$ is the j-th linear predictor. For y>0, it is given by $\exp(\eta_{y})/(1+\sum_j \exp(\eta_j))$.

Parameters:

k (int, default: 1 ) –

There are k+1 categories, and k linear predictors.
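
The category probabilities implied by the likelihood above can be computed from the k linear predictors as in the sketch below. `multinom_probabilities` is a hypothetical helper, shown only to make the formula concrete.

```python
import math


def multinom_probabilities(etas: list[float]) -> list[float]:
    """Return the k + 1 category probabilities given k linear predictors."""
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    # Category 0 is the reference category; category j > 0 uses exp(eta_j).
    return [1.0 / denom] + [math.exp(eta) / denom for eta in etas]
```

With all linear predictors equal to zero, every one of the k+1 categories is equally likely.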

Location-scale

GauLSS

GauLSS(
    link: Literal["identity", "inverse", "log", "sqrt"] = "identity",
    min_std: float = 0.01,
)

Gaussian location-scale model family for GAMs.

Models both the mean $\mu$ and standard deviation $\sigma$ of a Gaussian response. The standard deviation uses a "logb" link, i.e. $\eta = \log(\sigma - b)$ to avoid singularities near zero.

Only compatible with GAM, to which two predictors must be specified, for the response variable and the scale respectively.

Predictions with type="response" return columns [mu, 1/sigma].
Predictions with type="link" return columns [eta_mu, log(sigma - b)].
Plots use the log(sigma - b) scale.

Parameters:

link (Literal['identity', 'inverse', 'log', 'sqrt'], default: 'identity' ) –

The link function to use for $\mu$.
min_std (float, default: 0.01 ) –

Minimum standard deviation $b$, for the "logb" link.
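
A minimal sketch of the "logb" link and its inverse, assuming the definition $\eta = \log(\sigma - b)$ given above. The names `logb_link` and `logb_inverse` are hypothetical.

```python
import math


def logb_link(sigma: float, b: float = 0.01) -> float:
    """eta = log(sigma - b), defined only for sigma > b."""
    if sigma <= b:
        raise ValueError("sigma must exceed the minimum standard deviation b")
    return math.log(sigma - b)


def logb_inverse(eta: float, b: float = 0.01) -> float:
    """sigma = b + exp(eta), which never falls below b."""
    return b + math.exp(eta)
```

Under this sketch, the second column of a response-scale prediction would correspond to `1 / logb_inverse(eta, b)`.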

GammaLS

GammaLS(min_log_scale: float | int = -7)

Gamma location-scale model family.

The log of the mean, $\mu$, and the log of the scale parameter, $\phi$ can depend on additive smooth predictors (i.e. using two formulae).

Parameters:

min_log_scale (float | int, default: -7 ) –

The minimum value for the log scale parameter.

GevLSS

GevLSS(
    location_link: Literal["identity", "log"] = "identity",
    shape_link: Literal["identity", "logit"] = "logit",
)

Generalized extreme value location, scale and shape family.

Requires three predictors, one each for the location, the log scale and the shape.

Uses the p.d.f. $t(y)^{\xi+1} e^{-t(y)} / \sigma$, where $t(y) = [1 + \xi(y-\mu)/\sigma]^{-1/\xi}$ if $\xi \neq 0$ and $t(y) = \exp[-(y-\mu)/\sigma]$ otherwise.

Parameters:

location_link (Literal['identity', 'log'], default: 'identity' ) –

The link function to use for $\mu$.
shape_link (Literal['identity', 'logit'], default: 'logit' ) –

The link function to use for $\xi$.
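
The density above can be evaluated directly, as in this sketch. `gev_pdf` is a hypothetical name, and returning zero outside the support (where $1 + \xi(y-\mu)/\sigma \le 0$) is an assumption about how to handle that case rather than something stated above.

```python
import math


def gev_pdf(y: float, mu: float, sigma: float, xi: float) -> float:
    """Generalized extreme value density t**(xi + 1) * exp(-t) / sigma."""
    if xi == 0:
        t = math.exp(-(y - mu) / sigma)
    else:
        base = 1.0 + xi * (y - mu) / sigma
        if base <= 0:
            return 0.0  # assumed: y lies outside the support
        t = base ** (-1.0 / xi)
    return t ** (xi + 1.0) * math.exp(-t) / sigma
```

At $y = \mu$ both branches give $t = 1$, so the density there is $e^{-1}/\sigma$ whatever the shape.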

GumbLS

GumbLS(scale_link: Literal['identity', 'log'] = 'log', min_log_scale: float = -7)

Gumbel location scale additive model.

GumbLS fits Gumbel location-scale models with a location parameter $\mu$ and a log scale parameter $\beta$.

For $z = (y - \mu) e^{-\beta}$, the log Gumbel density is $\ell = -\beta - z - e^{-z}$. The mean is $\mu + \gamma e^{\beta}$, where $\gamma$ is Euler's constant, and the variance is $\pi^2 e^{2\beta}/6$.

Note that predictions on the response scale return the log scale $\beta$.

Warning

Read the documentation for the scale_link parameter, which is potentially confusing (inherited from mgcv).

Parameters:

scale_link (Literal['identity', 'log'], default: 'log' ) –
The link for the log scale parameter $\beta$, defined as follows:
- scale_link="identity": the linear predictor directly gives $\beta$.
- scale_link="log": ensures $\beta > b$ using $\beta = b + \log(1 + \exp(\eta))$.
min_log_scale (float, default: -7 ) –

The minimum value for the log scale parameter (b above) if using the log link.
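
Since the scale_link options are easy to misread, the sketch below spells out how a linear predictor $\eta$ maps to the log scale $\beta$ under each option, together with the log density given earlier. The function names are hypothetical and the code only transcribes the formulas in this section.

```python
import math


def beta_from_eta(eta: float, scale_link: str = "log", min_log_scale: float = -7.0) -> float:
    """Map the linear predictor eta to the log scale parameter beta."""
    if scale_link == "identity":
        return eta
    if scale_link == "log":
        # beta = b + log(1 + exp(eta)), which is bounded below by b.
        return min_log_scale + math.log1p(math.exp(eta))
    raise ValueError(f"unknown scale_link: {scale_link}")


def gumbel_log_density(y: float, mu: float, beta: float) -> float:
    """Log density -beta - z - exp(-z) with z = (y - mu) * exp(-beta)."""
    z = (y - mu) * math.exp(-beta)
    return -beta - z - math.exp(-z)
```

Note that under scale_link="log" the linear predictor is neither the scale nor the log scale itself, but a softplus-transformed offset above `min_log_scale`.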

Shash

Shash(b: float = 0.01, phi_pen: float = 0.001)

Sinh-arcsinh location scale and shape model family.

Implements the four-parameter sinh-arcsinh (shash) distribution of Jones and Pewsey (2009). The location, scale, skewness and kurtosis of the density can depend on additive smooth predictors. Requires four predictors, with the first (the location), corresponding to a variable name in the data, and the rest denoting the scale, skewness and kurtosis (for which any names can be chosen).

The density function is: $$ p(y|\mu,\sigma,\epsilon,\delta)=C(z) \exp\{-S(z)^2/2\} \{2\pi(1+z^2)\}^{-1/2}/\sigma $$

where $C(z) = \{1+S(z)^2\}^{1/2}$, $S(z) = \sinh\{\delta \sinh^{-1}(z) - \epsilon\}$ and $z = (y - \mu)/(\sigma \delta)$. $\mu$ controls the location, $\sigma$ controls the scale, $\epsilon$ controls the skewness, and $\delta$ the tail weight. For fitting purposes, we use $\tau = \log(\sigma)$ and $\phi = \log(\delta)$.

The link functions are fixed at identity for all parameters except the scale $\tau$, which uses logeb, defined as $\eta = \log [\exp(\tau) - b]$, such that the inverse is $\tau = \log(\sigma) = \log\{\exp(\eta)+b\}$.

Parameters:

b (float, default: 0.01 ) –

Positive parameter for the minimum scale of the logeb link function for the scale parameter.
phi_pen (float, default: 0.001 ) –

Positive multiplier of a ridge penalty on the kurtosis parameter, shrinking towards zero.
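
The density and the logeb link can be transcribed into a short sketch. The function names are hypothetical; the code follows the definitions of $C(z)$, $S(z)$ and $z$ given above and is not the mgcv implementation.

```python
import math


def shash_density(y: float, mu: float, sigma: float, eps: float, delta: float) -> float:
    """Sinh-arcsinh density with location mu, scale sigma, skewness eps, tail weight delta."""
    z = (y - mu) / (sigma * delta)
    s = math.sinh(delta * math.asinh(z) - eps)
    c = math.sqrt(1.0 + s * s)
    return c * math.exp(-s * s / 2.0) / math.sqrt(2.0 * math.pi * (1.0 + z * z)) / sigma


def logeb_link(tau: float, b: float = 0.01) -> float:
    """eta = log(exp(tau) - b), defined only when exp(tau) > b."""
    return math.log(math.exp(tau) - b)


def logeb_inverse(eta: float, b: float = 0.01) -> float:
    """tau = log(exp(eta) + b), so the scale sigma = exp(tau) stays above b."""
    return math.log(math.exp(eta) + b)
```

With `eps=0` and `delta=1` we get $S(z) = z$, and the expression reduces to the normal density with mean `mu` and standard deviation `sigma`, which is a convenient sanity check.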

Quasi-likelihood

Quasi

Quasi(
    link: Literal[
        "logit", "probit", "cloglog", "identity", "inverse", "log", "1/mu^2", "sqrt"
    ] = "identity",
    variance: Literal["constant", "mu(1-mu)", "mu", "mu^2", "mu^3"] = "constant",
)

Quasi family with specified link and variance functions.

Parameters:

link (Literal['logit', 'probit', 'cloglog', 'identity', 'inverse', 'log', '1/mu^2', 'sqrt'], default: 'identity' ) –

The link function for the quasi family.
variance (Literal['constant', 'mu(1-mu)', 'mu', 'mu^2', 'mu^3'], default: 'constant' ) –

The variance function for the quasi family.

QuasiBinomial

QuasiBinomial(link: Literal['logit', 'probit', 'cauchit', 'log', 'cloglog'] = 'logit')

Quasi-binomial family with specified link function.

Parameters:

link (Literal['logit', 'probit', 'cauchit', 'log', 'cloglog'], default: 'logit' ) –

The link function for the quasi-binomial family.

QuasiPoisson

QuasiPoisson(link: Literal['log', 'identity', 'sqrt'] = 'log')

Quasi-Poisson family with specified link function.

Parameters:

link (Literal['log', 'identity', 'sqrt'], default: 'log' ) –

The link function for the quasi-Poisson family.

Not yet implemented

TwLSS

TwLSS()

Not yet implemented.

ZipLSS

ZipLSS()

Not yet implemented.

CNorm

CNorm()

Not yet implemented.

CLog

CLog()

Not yet implemented.

CPois

CPois()

Not yet implemented.

CoxPH

CoxPH()

Not yet implemented.

Additive Cox Proportional Hazard Model.

Cox Proportional Hazards model with Peto's correction for ties, optional stratification, and estimation by penalized partial likelihood maximization, for use with GAM. In the model formula, event time is the response.

Under stratification the response has two columns: time and a numeric index for stratum. The weights vector provides the censoring information (0 for censoring, 1 for event). CoxPH deals with the case in which each subject has one event/censoring time and one row of covariate values.

Base classes

AbstractFamily

Provides default implementations for distribution methods.

This applies mgcv fix.family.qf for the quantile function, and fix.family.rd for the sampling function.

n_predictors `property`

n_predictors

Return the total number of predictors.

link

link(x: ndarray) -> ndarray

Compute the link function.

inverse_link

inverse_link(x: ndarray) -> ndarray

Compute the inverse link function.

dmu_deta

dmu_deta(x: ndarray) -> ndarray

Compute the derivative dmu/deta, i.e. the derivative of the mean (the inverse link) with respect to the linear predictor.
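
To illustrate what dmu/deta means, the sketch below takes the log link as an example, where $\mu = \exp(\eta)$ and so $d\mu/d\eta = \exp(\eta)$, and checks the analytic derivative against a central finite difference. All names here are hypothetical stand-ins and do not call the family's own methods.

```python
import math
from typing import Callable


def log_inverse_link(eta: float) -> float:
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)


def log_dmu_deta(eta: float) -> float:
    """Analytic derivative of mu with respect to eta under the log link."""
    return math.exp(eta)


def central_difference(f: Callable[[float], float], x: float, h: float = 1e-6) -> float:
    """Numerical derivative of f at x using a symmetric difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

The same kind of check could be applied to any link by swapping in its inverse and derivative.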

sample

sample(
    mu: int | float | ndarray,
    wt: int | float | ndarray | None = None,
    scale: int | float | ndarray | None = None,
)

Sample the family distributions (R family rd method).

SupportsCDF

Mixin for families supporting cumulative distribution functions.

cdf

cdf(x: ndarray, *, mu: ndarray, wt: ndarray, scale: ndarray) -> ndarray

Cumulative distribution function.
