Abstract:This paper proposes a new way of evaluating the accuracy and validity of probabilistic forecasts that change over time (such as an in-game win probability model, or an election forecast). Under this approach, each model to be evaluated is treated as a canonical Kelly bettor, and the models are pitted against each other in an iterative betting contest. The growth or decline of each model's bankroll serves as the evaluation metric. Under this approach, market consensus probabilities and implied model credibilities can be updated real time as each model updates, and do not require one to wait for the final outcome. Using a simulation model, it will be shown that this method is in general more accurate than traditional average log-loss and Brier score methods at distinguishing a correct model from an incorrect model. This Kelly approach is shown to have a direct mathematical and conceptual analogue to Bayesian inference, with bankroll serving as a proxy for Bayesian credibility.
Submission history
From: Michael Beuoy [view email]
[v1]
Tue, 10 Feb 2026 17:08:35 UTC (2,372 KB)