Building the New York Times 4th Down Bot
Trey Causey
Data Scientist, ChefSteps
NFL Consultant
CSSS Certificate Holder
@treycausey
trey.causey@gmail.com
Kevin Quealy (@kevinq)
Deputy Editor
The Upshot / The New York Times
Josh Katz (@jshkatz)
Graphics Editor
The Upshot / The New York Times
Brian Burke (@bburkeESPN)
Senior Analytics Specialist
ESPN / Advanced Football Analytics
A Simple Statistical Model...
... deployed to production
... ready for any edge case
... fully automated
... a lesson in statistical decision-making
... "data journalism?"
The model is the easy part.
Romer, David. 2006. "Do Firms Maximize? Evidence from Professional Football." Journal of Political Economy 114(2), pp. 340-365.
Needed to make a decision
P(winning | pre-snap situation)
P(winning | successful conversion)
P(winning | failed conversion)
P(winning | successful field goal)
P(winning | failed field goal)
P(winning | punt)
Logistic regression (L2 penalty determined via cross-validation)
Dependent variable:
- Did the team in this situation win the game? (not expected points)
Inputs:
- Down, Distance, Yards to Go, Seconds Remaining
- Vegas line as a linear function of seconds remaining
- Score Difference, Offensive & Defensive Timeouts
- Quarter, (Score Difference * Quarter)
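A minimal sketch of how the inputs above might be turned into a feature vector. The field names, the 3600-second scaling of the clock, and the exact form of the Vegas-line term (spread multiplied by the fraction of the game remaining) are assumptions for illustration; the slides only say the line enters "as a linear function of seconds remaining".

```python
from dataclasses import dataclass

GAME_SECONDS = 3600  # regulation length; assumed scaling for the clock


@dataclass
class Situation:
    # Hypothetical field names; the slides list the inputs but not their encoding.
    down: int
    yards_to_go: int
    yards_from_own_goal: int
    secs_left: int
    score_diff: int
    timeouts_off: int
    timeouts_def: int
    spread: float
    quarter: int


def build_features(s: Situation) -> list:
    """Return a flat feature list, including the two derived terms."""
    frac_left = s.secs_left / GAME_SECONDS
    return [
        float(s.down),
        float(s.yards_to_go),
        float(s.yards_from_own_goal),
        float(s.secs_left),
        float(s.score_diff),
        float(s.timeouts_off),
        float(s.timeouts_def),
        float(s.quarter),
        # Assumed form: the pregame line matters less as the clock runs down.
        s.spread * frac_left,
        # Interaction term: a given lead is worth more late in the game.
        float(s.score_diff * s.quarter),
    ]
```

A vector like this would then be fed to an L2-penalized logistic regression whose target is whether the team with the ball went on to win.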
Precision: 0.77
Recall: 0.78
F1 Score: 0.77
AUC: 0.86
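For reference, precision, recall, and F1 can be computed from hard win/loss predictions as sketched below. The labels in the usage note are toy values, not the model's data, and the slides do not say which probability threshold was used to turn win probabilities into predictions.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = win)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

With the toy inputs `y_true = [1, 1, 1, 0, 0]` and `y_pred = [1, 1, 0, 1, 0]`, all three metrics come out to 2/3. AUC is computed from the predicted probabilities rather than from hard predictions, so it is not covered here.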
Wrinkle: kickers growing more accurate over time
Logistic regression controlling for
distance, kicker, weather, and stadium
(Katz)
Punting: expected net return from current
field position; not punter-specific (doesn't
allow for muffed punts)
For each game situation, estimate win
probability for each possible outcome.
Determine the breakeven point: the probability
of a successful 4th down conversion at which
the coach should be indifferent between options.
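The breakeven point follows from setting the expected win probability of going for it equal to that of the best alternative (kick or punt). A sketch, with function names chosen for illustration:

```python
def wp_go_for_it(p_convert, wp_success, wp_fail):
    """Expected win probability of going for it on 4th down."""
    return p_convert * wp_success + (1 - p_convert) * wp_fail


def wp_field_goal(p_make, wp_make, wp_miss):
    """Expected win probability of attempting a field goal."""
    return p_make * wp_make + (1 - p_make) * wp_miss


def breakeven(wp_success, wp_fail, wp_alternative):
    """Conversion probability at which going for it equals the alternative.

    Solves p * wp_success + (1 - p) * wp_fail = wp_alternative for p,
    then clamps the result to [0, 1].
    """
    if wp_success <= wp_fail:
        raise ValueError("converting must be better than failing")
    p = (wp_alternative - wp_fail) / (wp_success - wp_fail)
    return min(1.0, max(0.0, p))
```

For example, with made-up values `wp_success = 0.60`, `wp_fail = 0.40`, and a punt worth `0.50`, the breakeven is a 50% conversion chance. If the estimated conversion probability is above the breakeven, going for it has the higher expected win probability.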
Lots of manual smoothing required for
situations that 'look' wrong.
End of game situations have
extremely high leverage &
visibility
Examples:
1) In the fourth quarter, weight the win probabilities of not going for it by the probability of getting the ball back.
2) With fewer than 40 seconds left in the game and within FG range,
set WP to P(successful FG).
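One literal reading of the two example rules, written as a post-processing step on the model's output. The function name, the argument names, and the order in which the rules are checked are assumptions; the slides do not specify how the rules interact or how "FG range" is defined.

```python
def adjust_end_of_game(wp_no_go, quarter, secs_left, in_fg_range,
                       p_ball_back, p_fg):
    """Apply the two hand-written end-of-game overrides to a win probability.

    wp_no_go    : model win probability for kicking or punting
    p_ball_back : estimated probability the team gets the ball back
    p_fg        : estimated probability of a successful field goal
    """
    # Rule 2 (checked first here, by assumption): very late and in range,
    # the game effectively comes down to the kick itself.
    if secs_left < 40 and in_fg_range:
        return p_fg
    # Rule 1: late in the game, giving the ball away only helps
    # if the team gets another possession.
    if quarter == 4:
        return wp_no_go * p_ball_back
    return wp_no_go
```

Outside the fourth quarter the model's estimate passes through unchanged.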
nyt4thdownbot.com
Every 4th down is analyzed
and explained with the
rationale for the call and
the breakeven point.
These charts ended up being
very popular!
The bot (@NYT4thDownBot)
has over 26,000 Twitter followers.
http://someurl.com/predict?dwn=4&ytg=27&yfog=73&secs_left=3005&score_diff=7&timo=3&timd=0&spread=0&ou_offense=40&key=APIKEY
(API code is not open source for security purposes)
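A request URL of the shape shown above can be assembled with the standard library. The host is the placeholder from the slide, the parameter names are copied from it, and `build_predict_url` is a hypothetical helper, not part of the actual API code. The parameter names appear to stand for down, yards to go, yards from own goal, and so on, but the slides do not define them.

```python
from urllib.parse import urlencode

BASE_URL = "http://someurl.com/predict"  # placeholder host from the slide


def build_predict_url(params, api_key):
    """Build a prediction request URL from a dict of game-state parameters."""
    query = dict(params)      # copy so the caller's dict is not modified
    query["key"] = api_key    # the API key goes last, as on the slide
    return BASE_URL + "?" + urlencode(query)


example = {
    "dwn": 4, "ytg": 27, "yfog": 73, "secs_left": 3005, "score_diff": 7,
    "timo": 3, "timd": 0, "spread": 0, "ou_offense": 40,
}
url = build_predict_url(example, "APIKEY")
```

Using `urlencode` rather than string concatenation keeps the query string correctly escaped if a value ever contains special characters.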