Dissertation expanding upon 538's tennis prediction Elo model, focusing solely on women's tennis.
TLDR:
- Uses ITF data as well as WTA data (source: Jeff Sackmann), but weights ITF matches lower.
- Considers the game score when deciding how much a player's ability changes after a match.
- Considers a player's historical as well as current ability.
- Considers a player's form in an attempt to preempt upsets.
Full write up: SOURCE
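As a rough illustration of the first two TLDR points, here is a minimal sketch of an Elo update whose step size is scaled by a tier weight and by the share of games won. It is not the dissertation's actual implementation: the function names, the K-factor of 32, the tier_weight values and the margin multiplier are all assumptions made for this example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo win probability of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(
    winner: float,
    loser: float,
    games_won: int,
    games_lost: int,
    k: float = 32.0,           # assumed K-factor, not taken from the write up
    tier_weight: float = 1.0,  # assumed: 1.0 for WTA, something lower for ITF
) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one match."""
    # Share of games taken by the match winner, e.g. 6-4 6-4 gives 12 / 20 = 0.6.
    margin = games_won / (games_won + games_lost)
    # Hypothetical multiplier: 1.0 for an even split of games, 2.0 for a whitewash.
    multiplier = 2.0 * margin
    delta = k * tier_weight * multiplier * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


if __name__ == "__main__":
    # Same result at WTA level and at a down-weighted (assumed 0.5) ITF level.
    print(update_ratings(1500.0, 1500.0, 12, 8))
    print(update_ratings(1500.0, 1500.0, 12, 8, tier_weight=0.5))
```

With this form, a down-weighted ITF match moves both ratings by a proportionally smaller amount than the same result at WTA level.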
Install dependencies:
poetry install
Activate environment:
poetry shell
Arguments (all required):
- yf - year from (WARNING: do not use a year before 2010)
- yt - year to
- ts - test size (in years)
Sample:
Gets data from 2010 to 2020, fits to 2010 -> 2018 and predicts on 2019 and 2020:
python run --yf 2010 --yt 2020 --ts 2
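The way the three arguments interact can be sketched as follows, assuming the last ts years of the requested range are held out for testing and everything before them is used for fitting. split_years is a hypothetical helper written for this example, not a function from this repository.

```python
def split_years(year_from: int, year_to: int, test_size: int) -> tuple[list[int], list[int]]:
    """Return (train_years, test_years), holding out the final test_size years."""
    if not 0 < test_size <= year_to - year_from:
        raise ValueError("test_size must leave at least one year for training")
    years = list(range(year_from, year_to + 1))
    return years[:-test_size], years[-test_size:]


if __name__ == "__main__":
    train, test = split_years(2010, 2020, 2)
    print(train[0], train[-1])  # first and last training year
    print(test)                 # held-out years
```

For the sample command, this gives training years 2010 to 2018 and test years 2019 and 2020.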
All model outputs are saved to data/03_output/.
These include:
- Model performance (accuracy and Brier scores)
- Rankings (as of end of test data)
- Model calibration
- Predictions appended to test data
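For readers unfamiliar with the two performance metrics, the sketch below shows how accuracy and the Brier score can be computed from predicted win probabilities. It is a plain-Python illustration under assumed inputs (a list of probabilities that the first-listed player wins, and 1/0 outcomes), not the code that produces the files in data/03_output/.

```python
def accuracy(probs: list[float], outcomes: list[int]) -> float:
    """Share of matches where the favoured player (p >= 0.5) actually won."""
    correct = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return correct / len(outcomes)


def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


if __name__ == "__main__":
    # Illustrative values only, not results from the model.
    probs = [0.8, 0.6, 0.3]
    outcomes = [1, 0, 0]
    print(accuracy(probs, outcomes))
    print(brier_score(probs, outcomes))
```

Accuracy only checks whether the favourite won, while the Brier score also penalises overconfident probabilities, which is why the two are reported together.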