Create a zero-inflated negative binomial distribution
Source: R/ZINegativeBinomial.R
Zero-inflated negative binomial distributions are frequently used to model counts with overdispersion and many zero observations.
Details
We recommend reading this documentation on https://alexpghayes.github.io/distributions3/, where the math will render with additional detail.
In the following, let \(X\) be a zero-inflated negative binomial random variable with parameters mu = \(\mu\), theta = \(\theta\), and pi = \(\pi\), where \(\pi\) is the zero-inflation probability.
Support: \(\{0, 1, 2, 3, ...\}\)
Mean: \((1 - \pi) \cdot \mu\)
Variance: \((1 - \pi) \cdot \mu \cdot (1 + (\pi + 1/\theta) \cdot \mu)\)
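As a worked example, plugging the parameters used in the Examples section (\(\mu = 2.5\), \(\theta = 1\), \(\pi = 0.25\)) into the mean and variance formulas gives:
$$ E(X) = (1 - 0.25) \cdot 2.5 = 1.875 $$
$$ Var(X) = (1 - 0.25) \cdot 2.5 \cdot (1 + (0.25 + 1/1) \cdot 2.5) = 1.875 \cdot 4.125 = 7.734375 $$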
Probability mass function (p.m.f.):
$$ P(X = k) = \pi \cdot I_{0}(k) + (1 - \pi) \cdot f(k; \mu, \theta) $$
where \(I_{0}(k)\) is the indicator function for zero and
\(f(k; \mu, \theta)\) is the p.m.f. of the NegativeBinomial
distribution.
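The p.m.f. above can be sketched directly in base R. This is a minimal illustration, not the package's implementation: the helper name `dzinb_sketch` is hypothetical, and it assumes the negative binomial component uses the mean/size parametrization of `stats::dnbinom()` with `size = theta`.

```r
# Hand-rolled ZINB p.m.f. (illustrative helper, not part of distributions3).
# Assumption: f(k; mu, theta) corresponds to dnbinom(k, size = theta, mu = mu).
dzinb_sketch <- function(k, mu, theta, pi) {
  # Point mass at zero plus the scaled negative binomial p.m.f.
  pi * (k == 0) + (1 - pi) * dnbinom(k, size = theta, mu = mu)
}

p <- dzinb_sketch(0:8, mu = 2.5, theta = 1, pi = 0.25)
round(p, 8)
```

Under that assumption, the values should agree with the `pdf(X, 0:8)` output shown in the Examples section.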
Cumulative distribution function (c.d.f.):
$$ P(X \le k) = \pi + (1 - \pi) \cdot F(k; \mu, \theta) $$
where \(F(k; \mu, \theta)\) is the c.d.f. of the NegativeBinomial
distribution.
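A corresponding base R sketch of the c.d.f. follows. The helper name `pzinb_sketch` is hypothetical, and it again assumes the `size = theta`, `mu = mu` parametrization of `stats::pnbinom()`. Because the support starts at zero, the sketch returns 0 for negative `k`, where the formula above does not apply.

```r
# Hand-rolled ZINB c.d.f. (illustrative helper, not part of distributions3).
# Assumption: F(k; mu, theta) corresponds to pnbinom(k, size = theta, mu = mu).
pzinb_sketch <- function(k, mu, theta, pi) {
  # The formula pi + (1 - pi) * F(k) holds for k >= 0; below the support the c.d.f. is 0.
  ifelse(k < 0, 0, pi + (1 - pi) * pnbinom(k, size = theta, mu = mu))
}

cdf_vals <- pzinb_sketch(0:8, mu = 2.5, theta = 1, pi = 0.25)
round(cdf_vals, 7)

# The c.d.f. should equal the running sum of the p.m.f. over 0, 1, ..., k.
pmf_vals <- 0.25 * (0:8 == 0) + 0.75 * dnbinom(0:8, size = 1, mu = 2.5)
all.equal(cdf_vals, cumsum(pmf_vals))
```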
Moment generating function (m.g.f.):
Omitted for now.
See also
Other discrete distributions: Bernoulli(), Binomial(), Categorical(), Geometric(), HurdleNegativeBinomial(), HurdlePoisson(), HyperGeometric(), Multinomial(), NegativeBinomial(), Poisson(), PoissonBinomial(), ZIPoisson(), ZTNegativeBinomial(), ZTPoisson()
Examples
## set up a zero-inflated negative binomial distribution
X <- ZINegativeBinomial(mu = 2.5, theta = 1, pi = 0.25)
X
#> [1] "ZINegativeBinomial(mu = 2.5, theta = 1, pi = 0.25)"
## standard functions
pdf(X, 0:8)
#> [1] 0.46428571 0.15306122 0.10932945 0.07809246 0.05578033 0.03984309 0.02845935
#> [8] 0.02032811 0.01452008
cdf(X, 0:8)
#> [1] 0.4642857 0.6173469 0.7266764 0.8047688 0.8605492 0.9003923 0.9288516
#> [8] 0.9491797 0.9636998
quantile(X, seq(0, 1, by = 0.25))
#> [1] 0 0 1 3 Inf
## cdf() and quantile() are inverses of each other on the support
quantile(X, cdf(X, 3))
#> [1] 3
## density visualization
plot(0:8, pdf(X, 0:8), type = "h", lwd = 2)
## corresponding sample with histogram of empirical frequencies
set.seed(0)
x <- random(X, 500)
hist(x, breaks = -1:max(x) + 0.5)
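The mean and variance formulas from the Details section can also be checked by simulation. The sketch below uses only base R rather than `random()`, and it assumes the zero-inflation mechanism is a Bernoulli draw with probability `pi` that forces a zero, with a negative binomial count otherwise. The variable names are illustrative.

```r
# Simulation check of the ZINB mean and variance (base R only, illustrative).
# Assumption: with probability zero_prob the outcome is a structural zero;
# otherwise it is drawn from rnbinom(size = theta, mu = mu).
set.seed(1)
n <- 1e5
mu <- 2.5
theta <- 1
zero_prob <- 0.25  # plays the role of pi; renamed to avoid masking base R's pi

is_structural_zero <- rbinom(n, size = 1, prob = zero_prob) == 1
counts <- rnbinom(n, size = theta, mu = mu)
x_sim <- ifelse(is_structural_zero, 0, counts)

theory_mean <- (1 - zero_prob) * mu
theory_var <- (1 - zero_prob) * mu * (1 + (zero_prob + 1 / theta) * mu)

# Sample moments should be close to the theoretical values 1.875 and 7.734375.
c(sample_mean = mean(x_sim), theory_mean = theory_mean)
c(sample_var = var(x_sim), theory_var = theory_var)
```

With a sample this large, the sample moments should land close to the theoretical values, though the exact figures depend on the seed and the R version.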