Elo Rating System

NCAA Men's Basketball Tournament Model — Technical Reference

1. Overview

Our Elo system rates every Division I men's basketball team on a continuous scale centered at 1500. It processes every regular-season game from 1985 to the present, updating both teams' ratings after each result. The key properties:

Zero-sum: every point one team gains, the other loses
Self-correcting: upsets produce larger updates than expected outcomes
Margin-aware: blowouts count more than squeakers
Memory-preserving: 75% of a team's rating above or below 1500 carries over between seasons

Elo is the #1 feature by importance (gain) in our LightGBM tournament model, ahead of KenPom rank, AdjEM, and Massey ratings.

2. Core Formulas

Expected Win Probability

Given team A with rating $R_A$ and team B with rating $R_B$, with a home-court adjustment $H$:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A - H) \,/\, 400}}$$

This is the standard logistic Elo curve. A 400-point Elo advantage gives ~91% expected win probability. A 200-point advantage gives ~76%.
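The two reference points above can be checked numerically. This is a minimal sketch; the function name `expected_score` and its argument names are illustrative and are not taken from the project's code.

```python
def expected_score(r_a: float, r_b: float, h: float = 0.0) -> float:
    """Expected win probability for team A under the logistic Elo curve.

    h is the home-court adjustment H applied in team A's favor
    (negative when team A is the visitor, 0 at a neutral site).
    """
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a - h) / 400.0))


# Reproduce the two reference points quoted in the text.
print(round(expected_score(1900, 1500), 3))  # 400-point advantage
print(round(expected_score(1700, 1500), 3))  # 200-point advantage
```

The two printed values come out at about 0.909 and 0.760, matching the ~91% and ~76% figures. Because the curve is symmetric, the opponent's expected probability is always one minus team A's.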

Margin of Victory Multiplier

We scale updates by the game's margin to reward dominant wins more than narrow ones:

$$M = 0.8 \cdot \ln(\text{MOV} + 1)$$

The logarithm compresses large margins, preventing a single blowout from distorting ratings. A 10-point win gets a 1.9x multiplier; a 30-point win gets 2.7x — diminishing returns.
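To see the diminishing returns directly, the multiplier can be tabulated for a few margins. This is a small sketch; `mov_multiplier` is a hypothetical helper name, and rejecting non-positive margins is an assumption made here for safety rather than documented behavior.

```python
import math


def mov_multiplier(margin: int) -> float:
    """Margin-of-victory multiplier M = 0.8 * ln(MOV + 1)."""
    if margin < 1:
        # Assumption: a decided game always has a winning margin of at least 1.
        raise ValueError("margin must be a positive winning margin")
    return 0.8 * math.log(margin + 1)


for margin in (1, 5, 10, 20, 30, 40):
    print(f"{margin:>2}-point win -> {mov_multiplier(margin):.2f}x")
```

Each additional 10 points of margin adds less to the multiplier than the previous 10, which is the compression the logarithm is meant to provide.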

Rating Update

After each game, both teams' ratings update:

$$\Delta = K \cdot M \cdot (S - E_A)$$

$$R_A' = R_A + \Delta \qquad R_B' = R_B - \Delta$$

Where $S = 1$ if team A wins and $S = 0$ if it loses, and $K = 20$ is the learning rate. The term $(S - E_A)$ is the surprise factor: when a heavy favorite wins as expected, the ratings barely move; an upset creates a large swing.

Intuition: Suppose a team rated 2000 beats a team rated 1800 by 10 points on a neutral court ($H = 0$). The favorite's expected probability was ~76% and the MOV multiplier is 1.9x, so the update is $20 \times 1.9 \times (1 - 0.76) = 9.1$ points. If the weaker team wins by the same margin instead, its expected probability was only ~24%, and the update is $20 \times 1.9 \times (1 - 0.24) = 28.9$ points, roughly three times larger.
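The same example can be worked through in code. This is a sketch under the formulas in this section; `elo_delta` is an illustrative name, and the neutral-site default (`h=0.0`) is an assumption of the example.

```python
import math

K = 20  # learning rate from the text


def elo_delta(r_a: float, r_b: float, a_won: bool, margin: int, h: float = 0.0) -> float:
    """Rating change for team A; team B receives the negative of this value."""
    e_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a - h) / 400.0))  # expected probability
    m = 0.8 * math.log(margin + 1)                         # MOV multiplier
    s = 1.0 if a_won else 0.0                              # actual outcome
    return K * m * (s - e_a)


favorite_wins = elo_delta(2000, 1800, a_won=True, margin=10)
upset = elo_delta(2000, 1800, a_won=False, margin=10)
print(f"favorite wins: {favorite_wins:+.1f}")
print(f"underdog wins: {upset:+.1f} for the favorite")
```

With unrounded inputs the swings are about 9.2 and 29.1 points, slightly above the 9.1 and 28.9 obtained from the rounded values 1.9 and 0.76. The ratio between them stays a little over three.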

3. Parameters

Parameter	Value	Description
`K`	20	Learning rate. Controls how quickly ratings respond to new results. Higher K = more reactive, lower K = more stable.
`HOME_ADV`	3	Elo points added to the home team's effective rating. Small because college basketball has variable home-court effects.
`CARRYOVER`	0.75	Fraction of (Elo − 1500) retained between seasons. At 0.75, a team ending at 2000 starts the next year at 1875.
Initial rating	1500	Default for any team appearing for the first time.

4. Game-by-Game Pipeline

For each game in chronological order (1985–2026):

Initialize any new team at 1500

Compute home-court adjustment: $H = +3$ if winner is home, $-3$ if away, $0$ if neutral

Compute the winner's expected probability $E_A$ using the logistic formula (team A is the winner)

Compute MOV multiplier $M = 0.8 \cdot \ln(\text{margin} + 1)$

Update: $\Delta = 20 \times M \times (1 - E_A)$ for the winner; subtract same from loser
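The steps above can be sketched as a single loop. This is not the code in `build_elo_ratings()`; the `Game` record, its field names, and the `"H"`/`"A"`/`"N"` site encoding are assumptions made for illustration.

```python
import math
from typing import NamedTuple

K, HOME_ADV, INITIAL = 20.0, 3.0, 1500.0


class Game(NamedTuple):
    winner: str
    loser: str
    margin: int        # winning margin in points (>= 1)
    winner_site: str   # "H" home, "A" away, "N" neutral (assumed encoding)


def run_games(games, ratings=None):
    """Apply the five pipeline steps to games given in chronological order."""
    ratings = {} if ratings is None else ratings
    for g in games:
        # Step 1: initialize any new team at 1500.
        r_w = ratings.setdefault(g.winner, INITIAL)
        r_l = ratings.setdefault(g.loser, INITIAL)
        # Step 2: home-court adjustment from the winner's point of view.
        h = {"H": HOME_ADV, "A": -HOME_ADV, "N": 0.0}[g.winner_site]
        # Step 3: winner's expected probability.
        e_w = 1.0 / (1.0 + 10.0 ** ((r_l - r_w - h) / 400.0))
        # Step 4: margin-of-victory multiplier.
        m = 0.8 * math.log(g.margin + 1)
        # Step 5: zero-sum update.
        delta = K * m * (1.0 - e_w)
        ratings[g.winner] = r_w + delta
        ratings[g.loser] = r_l - delta
    return ratings


sample = [Game("A", "B", 10, "N"), Game("B", "C", 3, "H"), Game("C", "A", 7, "A")]
for team, rating in sorted(run_games(sample).items()):
    print(f"{team}: {rating:.1f}")
```

Because every update adds to one team exactly what it subtracts from the other, the sum of all ratings in the sample stays at 1500 times the number of teams.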

5. Season-to-Season Regression

At the start of each new season, every team's rating regresses toward 1500:

$$R_{\text{new season}} = 1500 + 0.75 \times (R_{\text{end of prev}} - 1500)$$

This accounts for roster turnover, coaching changes, and mean reversion. A dominant team (Elo 2100) drops to 1950; a weak team (Elo 1300) rises to 1350. Everyone gets pulled toward average, but strong teams retain most of their advantage.
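The regression step is a one-line transformation of the ratings table. A minimal sketch, assuming ratings are held in a plain dictionary keyed by team name (the function name `regress_to_mean` is illustrative):

```python
CARRYOVER = 0.75
MEAN = 1500.0


def regress_to_mean(ratings: dict[str, float]) -> dict[str, float]:
    """Return start-of-season ratings pulled toward 1500 by the carryover factor."""
    return {team: MEAN + CARRYOVER * (r - MEAN) for team, r in ratings.items()}


end_of_season = {"dominant": 2100.0, "weak": 1300.0, "average": 1500.0}
print(regress_to_mean(end_of_season))
```

The dominant and weak teams land on 1950 and 1350, as in the text, while a team sitting exactly at 1500 is unchanged.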

Key insight: Carryover creates a "reputation" component in Elo. Duke starts the 2025-26 season at 1870, one of the higher starting ratings among contenders, because they were strong last year. Michigan starts at just 1719, so they have to earn every point this season. This is why Michigan's Elo gain (+324) is the largest among all contenders despite ending below Duke.

6. 2025-26 Elo Trajectories

The trajectory chart reveals distinct team profiles:

Duke (2067): Steady climber from a high base. Consistent excellence all season.
Michigan (2043): Steepest trajectory. Started lowest among 1-seeds but rocketed up through quality wins.
Houston (1962): Flat line. Started high from carryover (1909) but barely gained, meaning this season's results haven't justified last year's reputation.
Purdue (1863): Choppy with visible dips from losses. Most volatile among contenders.

7. Elo Composition

We decompose each team's final Elo into its sources: how much comes from last year's carryover, wins against strong/mid/weak opponents, and losses.

Team	Start	W vs Top 25	W vs Mid	W vs Weak	L vs Top 25	L vs Mid	L vs Weak	Final	Record
Duke	1870	+47.0 (4)	+78.3 (7)	+95.9 (21)	0.0 (0)	-24.8 (2)	0.0 (0)	2067	32-2
Michigan	1719	+145.6 (7)	+110.3 (9)	+99.3 (14)	-14.6 (1)	-16.4 (1)	0.0 (0)	2043	30-2
Arizona	1769	+131.4 (7)	+67.8 (7)	+97.4 (17)	-16.8 (1)	-15.5 (1)	0.0 (0)	2033	31-2
Florida	1878	+21.8 (1)	+124.5 (8)	+99.7 (17)	-4.6 (1)	-81.2 (4)	-34.8 (2)	2004	26-7
UConn	1772	+42.6 (2)	+73.6 (6)	+144.7 (23)	-21.2 (1)	-12.8 (1)	-29.8 (1)	1969	31-3
Houston	1909	+4.7 (1)	+58.7 (6)	+97.6 (21)	-90.0 (5)	-18.1 (1)	0.0 (0)	1962	28-6
Illinois	1725	+31.9 (2)	+131.7 (8)	+136.7 (17)	-30.0 (2)	-63.4 (5)	0.0 (0)	1932	27-7
Michigan St	1825	+22.1 (2)	+98.0 (7)	+98.0 (17)	-51.7 (3)	-41.8 (2)	-39.1 (2)	1911	26-7
Iowa St	1767	+38.6 (2)	+87.0 (6)	+130.4 (19)	-35.1 (2)	-61.8 (2)	-56.6 (2)	1870	27-6
Purdue	1737	+0.0 (0)	+158.2 (9)	+127.2 (15)	-53.3 (4)	-106.0 (5)	0.0 (0)	1863	24-9

Reading the table: Each cell shows the Elo points gained/lost, with the number of games in parentheses. "W vs Top 25" = Elo gained from beating opponents rated 1850+. Michigan earned +145.6 Elo from just 7 top-25 wins — the highest quality-win contribution of any contender.
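A quick consistency check on the table is that each row should be additive: the starting rating plus the six contribution columns should reproduce the final rating. The sketch below copies three rows from the table; the helper name `rebuild_final` is illustrative.

```python
def rebuild_final(start: float, contributions: list[float]) -> float:
    """Starting Elo plus all win/loss contributions."""
    return start + sum(contributions)


# (start, [W top25, W mid, W weak, L top25, L mid, L weak], published final)
rows = {
    "Michigan": (1719, [145.6, 110.3, 99.3, -14.6, -16.4, 0.0], 2043),
    "Houston": (1909, [4.7, 58.7, 97.6, -90.0, -18.1, 0.0], 1962),
    "Purdue": (1737, [0.0, 158.2, 127.2, -53.3, -106.0, 0.0], 1863),
}

for team, (start, parts, final) in rows.items():
    rebuilt = rebuild_final(start, parts)
    print(f"{team:<9} rebuilt {rebuilt:.1f}  published {final}  gap {rebuilt - final:+.1f}")
```

The rebuilt values agree with the published finals to within a point. The small gaps are presumably due to the Start and Final columns being rounded to whole numbers.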

8. Worked Example: Michigan's Last 15 Games

Opponent	Result	MOV	Opp Elo	Expected	Δ Elo	Elo After
Indiana	W 86-72	+14	1698	78.6%	+9.3	1931
Ohio St	W 74-62	+12	1726	76.7%	+9.5	1940
Nebraska	W 75-72	+3	1853	62.7%	+8.3	1948
Michigan St	W 83-71	+12	1944	50.2%	+20.5	1969
Ohio St	W 74-62	+12	1709	82.0%	+7.4	1976
Nebraska	W 75-72	+3	1838	69.3%	+6.8	1983
Michigan St	W 83-71	+12	1916	59.1%	+16.8	2000
Penn St	W 110-69	+41	1504	94.7%	+3.2	2003
Ohio St	W 82-61	+21	1712	84.0%	+7.9	2011
Northwestern	W 87-75	+12	1568	92.7%	+3.0	2014
UCLA	W 86-56	+30	1785	79.2%	+11.4	2025
Purdue	W 91-80	+11	1888	68.5%	+12.5	2038
Duke	L 68-63	-5	2028	50.9%	-14.6	2023
Minnesota	W 77-67	+10	1590	92.5%	+2.9	2026
Illinois	W 84-70	+14	1943	61.3%	+16.8	2043

Notice how the Δ column reflects the surprise factor: wins against high-Elo opponents (where Expected is lower) produce larger gains. Losses create negative deltas, with upsets against weaker teams costing more.
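Each Δ in the table can be recomputed from just the MOV and Expected columns using the update formula. The sketch below does this for five of the rows; `delta_from_row` is a hypothetical helper, and it treats a negative MOV as a loss.

```python
import math

K = 20


def delta_from_row(mov: int, expected: float) -> float:
    """Recompute the Elo change from a signed margin and the expected probability."""
    s = 1.0 if mov > 0 else 0.0           # win or loss
    m = 0.8 * math.log(abs(mov) + 1)      # multiplier uses the absolute margin
    return K * m * (s - expected)


# (opponent, signed MOV, Expected, published delta) copied from the table
rows = [
    ("Indiana", 14, 0.786, 9.3),
    ("Michigan St", 12, 0.502, 20.5),
    ("Penn St", 41, 0.947, 3.2),
    ("Duke", -5, 0.509, -14.6),
    ("Illinois", 14, 0.613, 16.8),
]

for opponent, mov, expected, published in rows:
    recomputed = delta_from_row(mov, expected)
    print(f"{opponent:<12} recomputed {recomputed:+.2f}  table {published:+.1f}")
```

The recomputed values land within about 0.1 of the table, with the remaining difference attributable to the Expected column being rounded to one decimal place. The Penn St row shows the surprise factor at work: a 41-point win is worth only about 3 points because the result was 94.7% expected.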

9. Role in the Tournament Model

Elo enters the LightGBM model as `elo_rating_diff`, the difference between team A's and team B's Elo ratings. It is the #1 feature by gain importance (3,947). The top five features by gain:

Rank	Feature	Gain
1	`elo_rating_diff`	3,947
2	`pom_rank_diff`	2,966
3	`AdjEM_diff`	2,736
4	`massey_best_rank_diff`	2,720
5	`bpi_kenpom_divergence_diff`	1,863

Why is Elo more important than KenPom's AdjEM? Because Elo captures dimensions that efficiency metrics miss:

Strength of schedule quality: beating good teams is worth more than beating bad teams by large margins
Historical momentum: carryover preserves program strength across seasons
Clutch performance: close wins against strong opponents build Elo efficiently
Consistency: losses to weak teams are heavily penalized, rewarding teams that avoid bad losses

The model uses Elo alongside 43 other features. Dropping Elo entirely increases log-loss by 4.5 mLL, the largest single-feature impact. However, because Elo is highly correlated with KenPom rank and Massey ratings, the model can partially compensate when any one of them is removed.

Generated from build_feature_store.py::build_elo_ratings(). Data source: Kaggle March Machine Learning Mania 2026.
All charts produced with matplotlib. Formulas rendered with MathJax.