Elo Rating System

NCAA Men's Basketball Tournament Model — Technical Reference

1. Overview

Our Elo system rates every Division I men's basketball team on a continuous scale centered at 1500. It processes every regular-season game from 1985 to the present, updating both teams' ratings after each result. The key properties:

Elo is the #1 feature by importance (gain) in our LightGBM tournament model, ahead of KenPom rank, AdjEM, and Massey ratings.

2. Core Formulas

Expected Win Probability

Given team A with rating $R_A$ and team B with rating $R_B$, with a home-court adjustment $H$:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A - H) \,/\, 400}}$$

This is the standard logistic Elo curve. A 400-point Elo advantage gives ~91% expected win probability. A 200-point advantage gives ~76%.
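The curve is easy to check numerically. The following is a minimal sketch in Python; the function name `expected_win_prob` and its argument names are illustrative and not taken from the model's codebase.

```python
def expected_win_prob(r_a: float, r_b: float, h: float = 0.0) -> float:
    """Expected win probability for team A.

    h is the home-court adjustment applied in team A's favor
    (positive if A is home, negative if away, 0 on a neutral court).
    """
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a - h) / 400.0))


# Equal ratings on a neutral court give a coin flip.
print(expected_win_prob(1500, 1500))            # 0.5
# The 400- and 200-point advantages quoted in the text.
print(round(expected_win_prob(1900, 1500), 2))  # 0.91
print(round(expected_win_prob(1700, 1500), 2))  # 0.76
```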

[Figure: Expected score function]

Margin of Victory Multiplier

We scale updates by the game's margin to reward dominant wins more than narrow ones:

$$M = 0.8 \cdot \ln(\text{MOV} + 1)$$

The logarithm compresses large margins, preventing a single blowout from distorting ratings. A 10-point win gets a 1.9x multiplier; a 30-point win gets 2.7x — diminishing returns.
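A short sketch of the multiplier, assuming the margin is passed as a positive integer number of points; `mov_multiplier` is a hypothetical helper name.

```python
import math


def mov_multiplier(margin: int) -> float:
    """Margin-of-victory multiplier M = 0.8 * ln(margin + 1)."""
    return 0.8 * math.log(margin + 1)


# Each extra point of margin adds less than the one before it.
for margin in (1, 5, 10, 20, 30):
    print(f"{margin:>2}-point win -> {mov_multiplier(margin):.2f}x")
```

Under this formula a one- or two-point win has a multiplier below 1, so very close games move ratings by less than the base $K$ would on its own.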

[Figure: MOV multiplier]

Rating Update

After each game, both teams' ratings update:

$$\Delta = K \cdot M \cdot (S - E_A)$$

$$R_A' = R_A + \Delta \qquad R_B' = R_B - \Delta$$

Here $S = 1$ if team A wins and $S = 0$ if it loses, and $K = 20$ is the learning rate. The term $(S - E_A)$ is the surprise factor: when a heavy favorite wins as expected, ratings barely move; an upset creates a large swing.

Intuition: If a team with Elo 2000 beats a team with Elo 1800 by 10 points on a neutral court, the expected probability was ~76% and the MOV multiplier is 1.9x. The update is $20 \times 1.9 \times (1 - 0.76) = 9.1$ points. But if the weaker team wins by the same margin, the update is $20 \times 1.9 \times (1 - 0.24) = 28.9$ points, roughly three times larger.
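The same arithmetic can be sketched in code. This is an illustration under the formulas above, not the production implementation; `elo_delta` and its parameters are hypothetical names. Because it uses unrounded intermediate values, it gives about 9.2 and 29.1 rather than the 9.1 and 28.9 obtained from the rounded figures in the text.

```python
import math

K = 20.0  # learning rate from the text


def elo_delta(r_a, r_b, margin, a_won, h=0.0):
    """Rating change for team A; team B receives the negative of this value."""
    e_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a - h) / 400.0))  # expected probability
    m = 0.8 * math.log(margin + 1)                          # MOV multiplier
    s = 1.0 if a_won else 0.0                               # actual result
    return K * m * (s - e_a)


favorite_wins = elo_delta(2000, 1800, margin=10, a_won=True)
upset = elo_delta(2000, 1800, margin=10, a_won=False)
print(round(favorite_wins, 1))  # gain for the 2000-rated team
print(round(upset, 1))          # loss for the 2000-rated team in the upset
```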

3. Parameters

| Parameter | Value | Description |
| --- | --- | --- |
| K | 20 | Learning rate. Controls how quickly ratings respond to new results. Higher K = more reactive, lower K = more stable. |
| HOME_ADV | 3 | Elo points added to the home team's effective rating. Small because college basketball has variable home-court effects. |
| CARRYOVER | 0.75 | Fraction of (Elo − 1500) retained between seasons. At 0.75, a team ending at 2000 starts the next year at 1875. |
| Initial rating | 1500 | Default for any team appearing for the first time. |
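One way to keep these values together is a small immutable config object. The sketch below is an assumption about structure only; the class name `EloParams` and its field names are hypothetical, while the default values are the ones listed above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EloParams:
    k: float = 20.0                 # learning rate
    home_adv: float = 3.0           # Elo points for the home team
    carryover: float = 0.75         # fraction of (Elo - 1500) kept between seasons
    initial_rating: float = 1500.0  # rating for a first-time team


params = EloParams()
# Variants for sensitivity experiments leave the defaults untouched.
more_reactive = replace(params, k=30.0)
print(params)
print(more_reactive)
```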

4. Game-by-Game Pipeline

For each game in chronological order (1985–2026):

(1) Initialize any new team at 1500
(2) Compute home-court adjustment: $H = +3$ if winner is home, $-3$ if away, $0$ if neutral
(3) Compute expected probability $E_A$ using the logistic formula
(4) Compute MOV multiplier $M = 0.8 \cdot \ln(\text{margin} + 1)$
(5) Update: $\Delta = 20 \times M \times (1 - E_A)$ for the winner; subtract same from loser
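The five steps can be sketched as a single loop. This is a minimal illustration, not the production pipeline: the record layout `(winner, loser, margin, winner_loc)` and the function name `process_games` are assumptions, the toy games are made up, and the between-season carryover is omitted.

```python
import math
from collections import defaultdict

K = 20.0          # learning rate
HOME_ADV = 3.0    # Elo points for home court
INITIAL = 1500.0  # rating for a team's first appearance


def process_games(games):
    """Apply the five pipeline steps to chronologically ordered games.

    Each game is a (winner, loser, margin, winner_loc) tuple, where
    winner_loc is "H", "A", or "N". This layout is a hypothetical one.
    """
    ratings = defaultdict(lambda: INITIAL)  # step 1: new teams start at 1500
    for winner, loser, margin, winner_loc in games:
        h = {"H": HOME_ADV, "A": -HOME_ADV, "N": 0.0}[winner_loc]  # step 2
        gap = ratings[loser] - ratings[winner] - h
        e_w = 1.0 / (1.0 + 10.0 ** (gap / 400.0))  # step 3: winner's expected probability
        m = 0.8 * math.log(margin + 1)             # step 4: MOV multiplier
        delta = K * m * (1.0 - e_w)                # step 5: winner gains, loser loses
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Made-up results, only to exercise the loop.
toy_games = [("A", "B", 10, "H"), ("B", "A", 3, "N"), ("C", "A", 7, "A")]
final = process_games(toy_games)
print({team: round(r, 1) for team, r in final.items()})
```

Because every update adds $\Delta$ to one team and subtracts it from the other, the total number of rating points across teams is conserved within a season.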

5. Season-to-Season Regression

At the start of each new season, every team's rating regresses toward 1500:

$$R_{\text{new season}} = 1500 + 0.75 \times (R_{\text{end of prev}} - 1500)$$

This accounts for roster turnover, coaching changes, and mean reversion. A dominant team (Elo 2100) drops to 1950; a weak team (Elo 1300) rises to 1350. Everyone gets pulled toward average, but strong teams retain most of their advantage.
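A one-line sketch of the regression step, reproducing the two examples from the paragraph above; `regress_to_mean` is a hypothetical helper name.

```python
def regress_to_mean(rating: float, carryover: float = 0.75, mean: float = 1500.0) -> float:
    """Start-of-season rating: keep `carryover` of the distance from the mean."""
    return mean + carryover * (rating - mean)


print(regress_to_mean(2100))  # 1950.0
print(regress_to_mean(1300))  # 1350.0
print(regress_to_mean(1500))  # 1500.0 (an average team is unchanged)
```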

[Figure: Season regression]

Key insight: Carryover creates a "reputation" component in Elo. Duke starts the 2025-26 season at 1870, among the higher starting ratings for contenders, because they were strong last year. Michigan starts at just 1719, so they have to earn more of their rating this season. This is why Michigan's Elo gain (+324) is the largest among all contenders despite ending below Duke.

6. 2025-26 Elo Trajectories

[Figure: Elo trajectories]

The trajectory chart reveals distinct team profiles:

7. Elo Composition

We decompose each team's final Elo into its sources: how much comes from last year's carryover, wins against strong/mid/weak opponents, and losses.

[Figure: Elo composition]

| Team | Start | W vs Top 25 | W vs Mid | W vs Weak | L vs Top 25 | L vs Mid | L vs Weak | Final | Record |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Duke | 1870 | +47.0 (4) | +78.3 (7) | +95.9 (21) | 0.0 (0) | -24.8 (2) | 0.0 (0) | 2067 | 32-2 |
| Michigan | 1719 | +145.6 (7) | +110.3 (9) | +99.3 (14) | -14.6 (1) | -16.4 (1) | 0.0 (0) | 2043 | 30-2 |
| Arizona | 1769 | +131.4 (7) | +67.8 (7) | +97.4 (17) | -16.8 (1) | -15.5 (1) | 0.0 (0) | 2033 | 31-2 |
| Florida | 1878 | +21.8 (1) | +124.5 (8) | +99.7 (17) | -4.6 (1) | -81.2 (4) | -34.8 (2) | 2004 | 26-7 |
| UConn | 1772 | +42.6 (2) | +73.6 (6) | +144.7 (23) | -21.2 (1) | -12.8 (1) | -29.8 (1) | 1969 | 31-3 |
| Houston | 1909 | +4.7 (1) | +58.7 (6) | +97.6 (21) | -90.0 (5) | -18.1 (1) | 0.0 (0) | 1962 | 28-6 |
| Illinois | 1725 | +31.9 (2) | +131.7 (8) | +136.7 (17) | -30.0 (2) | -63.4 (5) | 0.0 (0) | 1932 | 27-7 |
| Michigan St | 1825 | +22.1 (2) | +98.0 (7) | +98.0 (17) | -51.7 (3) | -41.8 (2) | -39.1 (2) | 1911 | 26-7 |
| Iowa St | 1767 | +38.6 (2) | +87.0 (6) | +130.4 (19) | -35.1 (2) | -61.8 (2) | -56.6 (2) | 1870 | 27-6 |
| Purdue | 1737 | +0.0 (0) | +158.2 (9) | +127.2 (15) | -53.3 (4) | -106.0 (5) | 0.0 (0) | 1863 | 24-9 |

Reading the table: Each cell shows the Elo points gained/lost, with the number of games in parentheses. "W vs Top 25" = Elo gained from beating opponents rated 1850+. Michigan earned +145.6 Elo from just 7 top-25 wins, the highest quality-win contribution of any contender.
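Since the decomposition is additive, Start plus the six contribution columns should reproduce Final up to rounding. The sketch below checks this for three rows copied from the table; the dictionary layout is only for illustration.

```python
# team: (start, [W top25, W mid, W weak, L top25, L mid, L weak], final)
rows = {
    "Duke": (1870, [47.0, 78.3, 95.9, 0.0, -24.8, 0.0], 2067),
    "Michigan": (1719, [145.6, 110.3, 99.3, -14.6, -16.4, 0.0], 2043),
    "Purdue": (1737, [0.0, 158.2, 127.2, -53.3, -106.0, 0.0], 1863),
}

rebuilt = {}
for team, (start, parts, final) in rows.items():
    rebuilt[team] = start + sum(parts)
    print(f"{team}: start + contributions = {rebuilt[team]:.1f}, table final = {final}")
```

The small differences (under one point) are consistent with the Start and Final columns being rounded to whole Elo points.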

8. Worked Example: Michigan's Last 15 Games

| Opponent | Result | MOV | Opp Elo | Expected | Δ Elo | Elo After |
| --- | --- | --- | --- | --- | --- | --- |
| Indiana | W 86-72 | +14 | 1698 | 78.6% | +9.3 | 1931 |
| Ohio St | W 74-62 | +12 | 1726 | 76.7% | +9.5 | 1940 |
| Nebraska | W 75-72 | +3 | 1853 | 62.7% | +8.3 | 1948 |
| Michigan St | W 83-71 | +12 | 1944 | 50.2% | +20.5 | 1969 |
| Ohio St | W 74-62 | +12 | 1709 | 82.0% | +7.4 | 1976 |
| Nebraska | W 75-72 | +3 | 1838 | 69.3% | +6.8 | 1983 |
| Michigan St | W 83-71 | +12 | 1916 | 59.1% | +16.8 | 2000 |
| Penn St | W 110-69 | +41 | 1504 | 94.7% | +3.2 | 2003 |
| Ohio St | W 82-61 | +21 | 1712 | 84.0% | +7.9 | 2011 |
| Northwestern | W 87-75 | +12 | 1568 | 92.7% | +3.0 | 2014 |
| UCLA | W 86-56 | +30 | 1785 | 79.2% | +11.4 | 2025 |
| Purdue | W 91-80 | +11 | 1888 | 68.5% | +12.5 | 2038 |
| Duke | L 68-63 | -5 | 2028 | 50.9% | -14.6 | 2023 |
| Minnesota | W 77-67 | +10 | 1590 | 92.5% | +2.9 | 2026 |
| Illinois | W 84-70 | +14 | 1943 | 61.3% | +16.8 | 2043 |

Notice how the Δ column reflects the surprise factor: wins against high-Elo opponents (where Expected is lower) produce larger gains. Losses produce negative deltas, and a loss to a weaker opponent (where Expected is higher) costs more.
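Individual rows can be reproduced from the MOV and Expected columns with $\Delta = K \cdot M \cdot (S - E)$. The sketch below does this for three rows; `delta_from_expected` is a hypothetical helper, and small differences from the table are expected because the Expected column is rounded to one decimal place.

```python
import math


def delta_from_expected(mov: int, expected: float, k: float = 20.0) -> float:
    """Elo change for Michigan given signed MOV and pre-game win probability."""
    m = 0.8 * math.log(abs(mov) + 1)  # multiplier uses the size of the margin
    s = 1.0 if mov > 0 else 0.0       # positive MOV means Michigan won
    return k * m * (s - expected)


# (opponent, MOV, Expected, Δ Elo as printed in the table)
table_rows = [
    ("Indiana", 14, 0.786, 9.3),
    ("Michigan St", 12, 0.502, 20.5),
    ("Duke", -5, 0.509, -14.6),
]
for opponent, mov, expected, table_delta in table_rows:
    computed = delta_from_expected(mov, expected)
    print(f"{opponent}: computed {computed:+.1f}, table {table_delta:+.1f}")
```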

9. Role in the Tournament Model

Elo enters the LightGBM model as elo_rating_diff — the difference between team A and team B's Elo ratings. It is the #1 feature by gain importance (3,947), ahead of:

| Rank | Feature | Gain |
| --- | --- | --- |
| 1 | elo_rating_diff | 3,947 |
| 2 | pom_rank_diff | 2,966 |
| 3 | AdjEM_diff | 2,736 |
| 4 | massey_best_rank_diff | 2,720 |
| 5 | bpi_kenpom_divergence_diff | 1,863 |

Why is Elo more important than KenPom's AdjEM? Because Elo captures dimensions that efficiency metrics miss:

The model uses Elo alongside 43 other features. Dropping Elo entirely increases log-loss by 4.5 mLL, the largest single-feature impact. However, because Elo is highly correlated with KenPom rank and Massey ratings, the model can partially compensate when any one of them is removed.
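As a rough illustration of how a difference feature such as elo_rating_diff might be built for a matchup, the sketch below subtracts team B's rating from team A's. The helper function and the dictionary layout are assumptions; the ratings are the Final values from the composition table.

```python
# Final Elo values copied from the composition table.
final_elo = {"Duke": 2067, "Michigan": 2043, "Arizona": 2033}


def elo_rating_diff(team_a: str, team_b: str, ratings: dict) -> float:
    """Team A's Elo minus team B's Elo (hypothetical feature builder)."""
    return ratings[team_a] - ratings[team_b]


print(elo_rating_diff("Duke", "Michigan", final_elo))  # 24
print(elo_rating_diff("Michigan", "Duke", final_elo))  # -24
```

A difference feature is antisymmetric: swapping the two teams flips its sign, so the ordering of team A and team B has to be kept consistent between training and prediction.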