scipy.stats.poisson: Official Documentation
Summary
`scipy.stats.poisson` is SciPy's Poisson distribution object (a discrete random variable exposed by the `scipy.stats` module) and a common building block for Poisson-based sports prediction models. It provides pmf(k, mu) for the probability mass function and cdf(k, mu) for the cumulative distribution function, along with sampling, moments, and interval calculations.
The official documentation is the authoritative reference for this API. Key notes: mu is the shape parameter and equals the rate λ, which is both the mean and the variance of the distribution, and pmf(k, mu) = exp(-μ) × μ^k / k! for k ≥ 0. For shifted distributions, use the loc parameter.
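As a quick sanity check of the formula above, the following sketch compares the closed form with `poisson.pmf` and confirms that the mean and variance both equal μ. The rate 1.5 and the variable names are illustrative assumptions, not values from the documentation.

```python
import math
from scipy.stats import poisson

mu = 1.5       # illustrative rate (assumed value)
ks = range(6)  # goal counts 0..5

# PMF from the closed form exp(-mu) * mu**k / k!
manual = [math.exp(-mu) * mu**k / math.factorial(k) for k in ks]
# PMF as computed by SciPy
library = [float(poisson.pmf(k, mu)) for k in ks]

# Largest absolute disagreement between the two computations
max_abs_diff = max(abs(m - s) for m, s in zip(manual, library))

# Mean and variance of Poisson(mu); both should equal mu
mean = float(poisson.mean(mu))
var = float(poisson.var(mu))
```

The two computations should agree to within floating-point rounding.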
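Key Functions
- `poisson.pmf(k, mu)`: P(X = k) for Poisson(μ). Core function for computing goal probabilities.
- `poisson.cdf(k, mu)`: P(X ≤ k), useful for cumulative probability calculations.
- `poisson.sf(k, mu)`: P(X > k) = 1 - cdf(k, mu), the survival function.
- `poisson.rvs(mu, size)`: random variate generation for simulation.
- `poisson.interval(confidence, mu)`: endpoints of the equal-tailed range that contains at least the given fraction of the distribution's probability mass. This is a range for the outcome X, not a confidence interval for an estimated mean. Older SciPy releases name the first argument alpha.
- `poisson.mean(mu)`, `poisson.var(mu)`, `poisson.std(mu)`: moments (mean = var = μ by definition).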
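The relationships between these functions can be sketched as follows. The rate, seed, and sample size are assumed for illustration; the first argument of `interval` is passed positionally so that the call does not depend on the keyword name.

```python
import numpy as np
from scipy.stats import poisson

mu = 1.5  # illustrative rate (assumed value)
k = 2

p_le = poisson.cdf(k, mu)  # P(X <= 2)
p_gt = poisson.sf(k, mu)   # P(X > 2)
total = p_le + p_gt        # the two should sum to 1

# Reproducible simulation of 10,000 goal counts
rng = np.random.default_rng(0)
samples = poisson.rvs(mu, size=10_000, random_state=rng)
sample_mean = samples.mean()

# Equal-tailed range holding at least 95% of the probability mass of X
low, high = poisson.interval(0.95, mu)
# Mass actually covered by the integers low..high
coverage = poisson.cdf(high, mu) - poisson.cdf(low - 1, mu)
```

Because the distribution is discrete, `coverage` is generally somewhat larger than the requested 0.95 rather than exactly equal to it.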
Key Concepts
- Shape parameter `mu`: both the mean and the variance of the distribution. For football: the expected goals (xG) for a team in a match.
- `loc` parameter: shifts the distribution. `poisson.pmf(k, mu, loc)` is equivalent to `poisson.pmf(k - loc, mu)`. Useful when the minimum possible count is not zero.
- Numerical stability: SciPy handles most cases well, including large μ. When individual probabilities are small enough to underflow, work with `poisson.logpmf` instead of `pmf`; for large μ (e.g., μ > 100), a normal approximation with mean μ and variance μ is also an option.
- Broadcasting: `pmf` accepts array inputs for k and returns an array of probabilities, ideal for vectorized scoreline matrix computation.
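A minimal sketch of the `loc` shift, broadcasting, and log-space points above. The rates and array shapes are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

mu = 1.5              # illustrative rate (assumed value)
k = np.arange(0, 8)   # goal counts 0..7

# loc shift: pmf(k + 1, mu, loc=1) evaluates the unshifted PMF at k
shifted = poisson.pmf(k + 1, mu, loc=1)
plain = poisson.pmf(k, mu)

# Broadcasting: a column of counts against a row of rates
rates = np.array([0.8, 1.5, 2.3])
table = poisson.pmf(k[:, None], rates[None, :])  # one column per rate

# Large mu: P(X = 0) = exp(-1000) underflows in linear space,
# while the log-probability stays finite
p_zero = poisson.pmf(0, 1000.0)
log_p_zero = poisson.logpmf(0, 1000.0)
```

Each column of `table` is the PMF for one rate, so the same pattern extends to evaluating many teams or matches in a single call.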
Code Examples
```python
from scipy.stats import poisson
import numpy as np

# Single goal probability
p_one_goal = poisson.pmf(1, mu=1.5)  # P(X=1) for λ=1.5

# Full distribution
mu = 1.5
goals = np.arange(0, 8)
probs = poisson.pmf(goals, mu)
# [e^-1.5, e^-1.5*1.5/1, e^-1.5*1.5^2/2!, ...]

# Scoreline matrix (vectorized)
def scoreline_matrix(lambda_home, lambda_away, max_goals=6):
    h = np.arange(max_goals + 1)[:, None]  # column: home goals
    a = np.arange(max_goals + 1)[None, :]  # row: away goals
    return poisson.pmf(h, lambda_home) * poisson.pmf(a, lambda_away)

# Probability of at least one goal, P(X >= 1)
p_at_least_one = poisson.sf(0, mu=1.5)  # 1 - P(X=0)

# Central 95% range of the goal count X (not a confidence interval for λ)
low, high = poisson.interval(0.95, mu=1.5)
```
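One way the scoreline matrix might be used is to sum its cells into home win, draw, and away win probabilities. This is a sketch under the assumption that home and away goal counts are independent Poisson variables; `outcome_probabilities` and the rates 1.5 and 1.1 are hypothetical and do not come from the SciPy documentation.

```python
import numpy as np
from scipy.stats import poisson

def scoreline_matrix(lambda_home, lambda_away, max_goals=6):
    h = np.arange(max_goals + 1)[:, None]  # column: home goals
    a = np.arange(max_goals + 1)[None, :]  # row: away goals
    return poisson.pmf(h, lambda_home) * poisson.pmf(a, lambda_away)

def outcome_probabilities(lambda_home, lambda_away, max_goals=10):
    """Hypothetical helper: sum scoreline cells into match outcomes."""
    m = scoreline_matrix(lambda_home, lambda_away, max_goals)
    home_win = np.tril(m, k=-1).sum()  # cells with home goals > away goals
    draw = np.trace(m)                 # cells with equal goals
    away_win = np.triu(m, k=1).sum()   # cells with away goals > home goals
    covered = m.sum()                  # mass inside the truncated grid
    return home_win, draw, away_win, covered

# Illustrative rates (assumed values)
home_win, draw, away_win, covered = outcome_probabilities(1.5, 1.1)
```

The matrix is truncated at `max_goals`, so the three outcomes sum to `covered` rather than exactly 1; raising `max_goals` shrinks the gap.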
Notes
- The existing `poisson-distribution.md` note already includes SciPy code; this source adds the official documentation reference and confirms the API contract.
- Key detail: `poisson.pmf` accepts array `k` values; the vectorized scoreline matrix in the existing note relies on this broadcasting behavior.
- For production sports models: always use `scipy.stats.poisson` rather than manual implementations to avoid numerical errors in edge cases.
- The `loc` parameter is rarely needed for football xG models but useful for specialty applications.