Bayesian inference on baseball batting metrics using Nested Dirichlet Multinomial models
Abstract
Bayesian methods and sports analytics have both become popular areas of research, but their intersection remains under-explored in Statistics. While the Nested Dirichlet distribution has been proposed as a conjugate prior for the Multinomial data commonly produced in sports settings, many properties of this distribution, including its generalizability, have not been explored. In this thesis, we propose a different parameterization of the Nested Dirichlet distribution and explain how it can be used to derive the posterior and posterior predictive distributions for the Nested Dirichlet Multinomial model. We also demonstrate the generalizability of our parameterization, an important improvement over the approaches found in the literature. Finally, we use this model to analyze 2023 MLB batting outcome data, producing model-based batting metrics with appropriate uncertainty quantification, and compare the results to those produced by the traditional Dirichlet Multinomial model. We found that each model works well under different circumstances, which leads us to suggest directions for improving them in future research, such as mixture models that better capture differences in player abilities and styles.
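For readers unfamiliar with the baseline model, the following is a minimal sketch of the conjugate update in the traditional (non-nested) Dirichlet Multinomial model mentioned above. It does not implement the Nested Dirichlet parameterization proposed in the thesis. The outcome categories, the flat prior, the counts, and all function names are hypothetical and chosen purely for illustration; they are not taken from the 2023 MLB data.

```python
import random

def posterior_alpha(prior_alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior plus Multinomial counts
    gives a Dirichlet(alpha + counts) posterior."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior and counts must have the same length")
    return [a + n for a, n in zip(prior_alpha, counts)]

def predictive_probs(alpha):
    """Posterior predictive probability of each outcome on the next
    single trial, which equals the posterior mean alpha_k / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) by normalizing
    independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

# Hypothetical outcome categories and plate-appearance counts for one batter.
outcomes = ["out", "walk", "single", "double", "triple", "home_run"]
prior = [1.0] * len(outcomes)          # flat prior, an assumption
counts = [300, 50, 90, 30, 5, 25]      # made-up counts, not real data

post = posterior_alpha(prior, counts)
probs = predictive_probs(post)

# Monte Carlo 95% credible interval for the home-run probability.
rng = random.Random(0)
samples = [sample_dirichlet(post, rng) for _ in range(2000)]
hr_draws = sorted(s[outcomes.index("home_run")] for s in samples)
hr_interval = (hr_draws[50], hr_draws[1949])  # approx. 2.5% and 97.5% quantiles

for name, p in zip(outcomes, probs):
    print(f"{name}: {p:.3f}")
print("home_run 95% interval:", hr_interval)
```

Posterior draws like `samples` can be pushed through any batting metric that is a function of the outcome probabilities, which is one way to attach uncertainty to a model-based metric.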