ELO Rating System

Overview

The ELO rating system is a method for calculating the relative skill levels of players in zero-sum two-player games. It was invented by Hungarian-American chess master Arpad Elo in 1959–1960, replacing the Harkness rating system. The system is named after Elo and is closely related to the Bradley-Terry model. Elo's central assumption is that each player's performance in a game is a normally distributed random variable centered on their true skill. Most modern implementations, including the expected-score formula below, use a logistic curve instead of the normal distribution.

After each game, the winner takes points from the loser; the rating difference between the two players determines how many points are transferred. Upset wins (a lower-rated player beating a higher-rated one) result in larger point transfers. The system is self-correcting: players whose ratings are too low will outperform their expected scores and gain points until their ratings reflect their true strength.
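A minimal sketch of the point-transfer behaviour described above, assuming the standard logistic expected-score formula (given under Key Formula) and an illustrative K of 32. The function name `points_transferred` and the ratings are hypothetical, chosen only for illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic ELO curve."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def points_transferred(winner_rating, loser_rating, k=32):
    """Points the winner gains (and the loser gives up) after a decisive game."""
    return k * (1 - expected_score(winner_rating, loser_rating))


# Hypothetical ratings: a 1700 player against a 1500 player.
favourite_wins = points_transferred(1700, 1500)  # expected result, small transfer
upset = points_transferred(1500, 1700)           # upset, large transfer

print(f"favourite wins: {favourite_wins:.1f} points")
print(f"upset:          {upset:.1f} points")
```

The two transfers always sum to K, so the more surprising the result, the larger the share of K that changes hands.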

ELO has been adapted for many sports including football, American football, baseball, basketball, tennis, and esports. FiveThirtyEight famously uses ELO variants for sports predictions.

Why It Matters

ELO provides a simple, interpretable team strength metric that updates after every match. For sports betting models, ELO ratings serve as:
1. Team strength input to other models (e.g., Poisson λ estimation)
2. Standalone prediction engine via the win probability formula
3. Historical baseline for comparing model performance

The key advantage over Massey ratings is that ELO updates incrementally — a team's rating adapts to recent form — while Massey solves all games simultaneously and is better suited for ranking than prediction.
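The incremental updating described above can be sketched as a single pass over matches in chronological order. This is an illustration under stated assumptions, not a reference implementation: the `run_season` helper, the starting rating of 1500, K = 20, the 65-point home advantage, and the three-team fixture list are all hypothetical.

```python
def expected_score(rating_a, rating_b, home_advantage=0):
    """Expected score of A against B, with an optional bonus for A."""
    return 1 / (1 + 10 ** ((rating_b - rating_a - home_advantage) / 400))


def run_season(matches, k=20, home_advantage=65, start=1500):
    """Process matches in order, updating both ratings after each one.

    `matches` is an iterable of (home, away, home_score) tuples, where
    home_score is 1 (home win), 0.5 (draw) or 0 (away win).
    """
    ratings = {}
    for home, away, home_score in matches:
        r_home = ratings.setdefault(home, start)
        r_away = ratings.setdefault(away, start)
        e_home = expected_score(r_home, r_away, home_advantage)
        delta = k * (home_score - e_home)
        ratings[home] = r_home + delta  # home side gains what...
        ratings[away] = r_away - delta  # ...the away side loses
    return ratings


# Hypothetical mini-league: team A wins all three of its matches.
matches = [("A", "B", 1), ("B", "C", 0.5), ("C", "A", 0), ("A", "B", 1)]
ratings = run_season(matches)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:.1f}")
```

Each match touches only the two ratings involved, which is why the rating tracks recent form without re-solving the whole history.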

Key Formula

Expected score for player A vs player B:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}$$

Rating update after game:

$$R_A' = R_A + K(S_A - E_A)$$

Where S_A = 1 (win), 0.5 (draw), 0 (loss), and K is the development coefficient.
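A short check that the update conserves total rating points, which is what "the winner takes points from the loser" means. It assumes both players are updated with the same K; with different K values the exchange is no longer exactly zero-sum.

$$E_A + E_B = \frac{1}{1 + 10^{(R_B - R_A)/400}} + \frac{1}{1 + 10^{(R_A - R_B)/400}} = 1, \qquad S_A + S_B = 1$$

$$R_A' + R_B' = R_A + R_B + K\left[(S_A + S_B) - (E_A + E_B)\right] = R_A + R_B$$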

Home advantage adjustment:

$$E_{home} = \frac{1}{1 + 10^{(R_{away} - R_{home} - \text{home\_advantage})/400}}$$

Typical home advantage: 55–100 ELO points (FiveThirtyEight uses ~55 for soccer).

Worked Example

Man City (home, ELO 1850) vs Arsenal (away, ELO 1820), home advantage = 65:

$$E_{ManCity} = \frac{1}{1 + 10^{(1820 - 1850 - 65)/400}} = \frac{1}{1 + 10^{-95/400}} = \frac{1}{1 + 0.579} = 0.633$$

Man City has a 63.3% expected score (its win probability if draws are ignored).

After match (Man City wins 2-1), with K = 32:

$$R_{ManCity}' = 1850 + 32(1 - 0.633) = 1850 + 11.7 = 1861.7$$
$$R_{Arsenal}' = 1820 + 32(0 - 0.367) = 1820 - 11.7 = 1808.3$$

Code Snippet

import math

def expected_score(rating_a, rating_b, home_advantage=0):
    """Expected win probability for player A vs player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a - home_advantage) / 400))

def update_elo(rating, expected, actual, k=32):
    """Update ELO rating after a game. actual = 1/0.5/0."""
    return rating + k * (actual - expected)

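def match_probabilities(rating_home, rating_away, home_advantage=65, draw_rate=0.25):
    """Convert ELO to 1X2 probabilities.

    ELO does not model draws, so draw_rate is an assumed fixed draw share.
    """
    e_home = expected_score(rating_home, rating_away, home_advantage)
    e_away = 1 - e_home  # expected scores of the two sides sum to 1
    # Expected score = P(win) + 0.5 * P(draw), so P(win) = E - P(draw) / 2.
    # Cap the draw share so neither win probability goes negative.
    prob_draw = min(draw_rate, 2 * min(e_home, e_away))
    return {"home": e_home - prob_draw / 2, "draw": prob_draw, "away": e_away - prob_draw / 2}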

# Example
e_city = expected_score(1850, 1820, 65)
print(f"{e_city:.3f}")  # 0.633
new = update_elo(1850, e_city, 1.0, k=32)
print(f"Man City new rating: {new:.1f}")  # 1861.7

Pitfalls

  • K-factor choice is critical: K=32 (chess standard) updates quickly but is noisy; K=10 is too slow for sports. Most sports ELO implementations use K=20–32.
  • No uncertainty measure: A rating based on 5 games is treated with the same confidence as one based on 500. Glicko-2 addresses this with a rating deviation (RD).
  • Home advantage is not constant: It varies by league, team, and tournament. Re-estimate periodically.
  • Rating pool dependency: ELO ratings are only comparable within the same rating pool. Don't compare ratings from different ELO systems directly.
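The speed side of the K-factor trade-off in the first pitfall can be illustrated with a deterministic sketch. It assumes a hypothetical team whose true strength is 1700 but which starts at 1500 and always plays a 1500-rated opponent; every game is replaced by its average outcome, so the noise side of the trade-off is deliberately left out. The `games_to_converge` helper and all its defaults are illustrative assumptions.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic ELO curve."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def games_to_converge(k, true_rating=1700, start=1500, opponent=1500,
                      tol=10, max_games=10_000):
    """Games needed for a rating to get within `tol` points of `true_rating`.

    The actual score of each game is set to the expected score implied by
    the true rating, which removes randomness from the simulation.
    """
    true_score = expected_score(true_rating, opponent)
    rating = start
    for game in range(1, max_games + 1):
        rating += k * (true_score - expected_score(rating, opponent))
        if abs(rating - true_rating) < tol:
            return game
    return None  # did not converge within max_games


for k in (10, 20, 32):
    print(f"K={k}: {games_to_converge(k)} games")
```

With real results each game adds noise proportional to K, so the faster convergence of a large K comes at the price of a rating that keeps fluctuating after it has converged.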

See Also