Boosting Markovian tennis prediction: Ensembling and point-specific methods for match outcome and duration
Point-based hierarchical Markov models for tennis provide transparency and flexibility in predicting the outcome and duration of matches, but have been shown to lag behind other methods, such as regression, in predictive power. In this paper, a preliminary study is first conducted which highlights the point-based model's preference for data quantity over quality with respect to time horizons and surface filtering. Fixing these factors, the main study then ensembles point-based methods from the literature along with two novel methods, (i) using exclusively head-to-head (H2H) data and (ii) point-specific modifications, to significantly boost match outcome prediction accuracy. Consensus model ensembles, in particular, boost average prediction accuracies to around 70%, on par with machine learning models. Progress is also made in duration prediction with point-based models. Point-specific modifications and rudimentary mean duration ensembles show promise in lowering root mean squared error (RMSE), with the latter's performance strongly correlated with the outcome prediction strength of the corresponding outcome prediction ensembles. Overall, the positive results speak to the continued relevance of more traditional probabilistic prediction methods and add to the literature on the considerable potential of ensembling. Studies are conducted with data from professional men's and women's matches played during 2011-2022, with the years 2014, 2018, and 2022 set aside for testing.
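As a rough illustration of the two ideas the abstract names, the sketch below computes the lowest level of a point-based hierarchical Markov model (the probability of winning a standard game from a fixed point-win probability) and combines several models by a simple majority vote. The function names `game_win_prob` and `consensus_pick` are hypothetical, the points are assumed to be independent and identically distributed, and the majority-vote rule is an assumption: the abstract does not specify how the paper's consensus ensembles are formed.

```python
def game_win_prob(p: float) -> float:
    """Probability of winning a standard game, given the probability p of
    winning any single point (points assumed i.i.d.; an assumption here)."""
    q = 1.0 - p
    # Win the game before deuce: 4-0, 4-1 (4 orderings), or 4-2 (10 orderings).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce at 3-3 (20 orderings), then win two points in a row
    # before the opponent does: p^2 / (1 - 2pq).
    from_deuce = 20 * p**3 * q**3 * (p**2 / (1 - 2 * p * q))
    return before_deuce + from_deuce


def consensus_pick(match_win_probs: list[float]) -> int:
    """Majority vote over several models' match-win probabilities for
    player A. Returns 1 if a strict majority favour A, else 0.
    This voting rule is illustrative, not the paper's stated method."""
    votes_for_a = sum(1 for prob in match_win_probs if prob > 0.5)
    return 1 if 2 * votes_for_a > len(match_win_probs) else 0


if __name__ == "__main__":
    print(game_win_prob(0.6))
    print(consensus_pick([0.62, 0.55, 0.48]))
```

The same recursion extends upward: game-win probabilities feed set-win probabilities, which in turn feed match-win probabilities, which is what makes the model hierarchical.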
© Copyright 2026 Journal of Sports Analytics. IOS Press. All rights reserved.
| Subjects: | |
|---|---|
| Notations: | sport games technical and natural sciences |
| Published in: | Journal of Sports Analytics |
| Language: | English |
| Published: | 2026 |
| Volume: | 12 |
| Document types: | article |
| Level: | advanced |