March Madness: Presenting the analytics.b(rack)et!

By Matthew Buchalter, PlusEV Analytics

First, two HUGE disclaimers.

  • My domain knowledge as it relates to NCAA basketball is pretty much zero. It’s just not something I’ve ever gotten myself into. College sports just don’t have the same cultural resonance in Canada as pro sports do. I remember attending a school football game as part of my university’s freshman orientation. There were a few hundred fans in the stands, and very little media coverage. It just wasn’t something many people cared about. The best part of it was my Waterloo Warriors getting crushed, as math/engineering heavy schools tend to do, by our rival Wilfrid Laurier Golden Hawks and the Waterloo student section chanting “it’s alright, it’s OK, you’re gonna work for us some day!” All this is to say, this is a 100% data driven, team-level model that does not consider injuries, roster changes, coaching changes, or any other reason why a team’s ability may have changed during the season. So, I don’t recommend using these results blindly.
  • The goal of this exercise is to build a probability distribution for the results of the tournament. If you’re playing in a bracket pool, this only gives you half of what you need to be successful. Estimating the distribution of your fellow competitors’ picks and using that, along with this, to build a “Game Theory Optimal” (#GTO) bracket is important in bracket pools but is not part of what I’m doing. These results are more directly applicable to betting; more specifically, to tournament futures and props.

OK, now that we’ve talked about what I can’t do, let’s get into what I can. An NCAA basketball team plays a relatively small number of games in a season, and several of those games are against “cupcake” opponents where the outcome is known ahead of time with near certainty and thus provides little new information. So, we have a great example of a “small data” problem.

As I teach in Bayesian Sports Betting, a great way to handle “small data” problems is by using the idea of latent variables. A latent variable is something that has predictive value but is not directly observable and can be measured only by methods that are useful but imperfect. My favourite example to describe latent variables is skill in driving a car. We know that driving skill is a real thing, we know that different people have different levels of driving skill and we know that less skilled drivers have a higher probability of accidents than more skilled drivers, all else equal. Unfortunately, assessing an individual’s driving skill is difficult – there is no universal report card that all drivers get. (We’re getting closer to that reality with the advent of vehicle telematics devices, but that’s a different topic for a different day!) What we do know is that better drivers TEND to have cleaner records, both in terms of accident history and in terms of tickets for speeding or other violations. But these measures are imperfect – it’s possible for a terrible driver to have a clean record, or for a great driver to have a tainted one, through random good or bad luck. We’re still better off using the information than not using it, but we need to be aware of its limitations as it pertains to allowing us to “learn” the true nature of this latent variable.

Which brings me back to NCAA basketball. There are elite teams, there are awful teams, and there’s everything in between. But the true ability of a basketball team is a latent variable – we can measure it indirectly through a team’s win-loss record, but that record is a mixture of the team’s true nature and the random variance that makes sports so awesomely unpredictable. For this model, I’m going to represent this latent variable by assigning each team a rating between 0 and 1, representing that team’s win probability on a neutral court against an average-rated opponent. Again, each team’s rating can be estimated from data and models but can never be known with certainty.

The Bayesian approach to latent variables is to start with an estimate, known as the “prior”, and then as new information comes in, combine the prior and the new information into an updated estimate called the “posterior”.
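To make the prior-to-posterior step concrete, here is a toy sketch. It is not the model used in this post (which also uses margin of victory and strength of schedule); it assumes a Beta prior on a team's rating and pretends every game is against an average-rated opponent, so wins and losses update the prior directly. The function names and the numbers are made up for illustration.

```python
def update_beta(prior_alpha, prior_beta, wins, losses):
    """Conjugate update: Beta prior + win/loss record -> Beta posterior."""
    return prior_alpha + wins, prior_beta + losses


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior: "about an 0.80 team", carrying the weight of 20 games.
prior = (16.0, 4.0)

# Hypothetical new information: a 9-1 start against average opposition.
posterior = update_beta(*prior, wins=9, losses=1)

print("prior mean:    ", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

The posterior lands between the prior and the raw 9-1 record, and the more games the prior is "worth", the less a short hot streak moves it.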

“A team plays to a range of numbers that isn’t that big, and they’ll do that year in and year out if they have the same coach. Because that program will recruit very similar players, the coach will run very similar stuff year in and year out and the team will be plus or minus 2 from some number, and not much more than that, and that’s pretty much etched in stone” – Handicapper Alan Boston, interviewed on Be Better Bettors Podcast, Jan 5 2022 (12:57 mark)

Translated into Bayesian terms, what Alan is saying is that any given team’s prior for any given season should be close to that team’s posterior from the end of the previous season. I took that idea and added preseason rankings from a couple of public sources to formulate my prior rating estimates:

The priors were then updated using each team’s 2021-22 record for wins/losses, margin of victory and strength of schedule to result in the following estimates for the posterior rating estimates:

We can trace the path of each of our four #1 seeds from the pre-season priors, through a season’s worth of results and into the posteriors:

The nice thing about this rating system is that it’s simple to calculate the probability of any team beating any other team on a neutral court:

Probability of Team A beating Team B = Team A rating x (1 – Team B rating) / [Team A rating x (1 – Team B rating) + Team B rating x (1 – Team A rating)]
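The formula above translates directly into a few lines of Python. The function name `win_prob` is my own choice, and the 0.80 opponent is a made-up rating; ratings are assumed to be strictly between 0 and 1.

```python
def win_prob(rating_a, rating_b):
    """Probability that Team A beats Team B on a neutral court."""
    a_edge = rating_a * (1 - rating_b)
    b_edge = rating_b * (1 - rating_a)
    return a_edge / (a_edge + b_edge)


# Sanity check: against an average (0.50) opponent, a team's win
# probability is simply its own rating.
print(win_prob(0.90, 0.50))

# A 0.923-rated team against a hypothetical 0.80-rated opponent.
print(win_prob(0.923, 0.80))
```

Note that the two teams' probabilities always add up to 1, and two equally rated teams are a coin flip.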

So we have the bracket, we have a posterior estimated rating for each team and we have a formula to convert those into win probabilities. It would seem like all we have to do is run 100,000 simulated tournaments and compile the results to get each team’s probability of winning. Here’s what we would get if we did that:

I imagine this is how pretty much all of the top analysts are figuring out their brackets, but with their own team ratings in place of mine. And yeah, I think my ratings are more mathematically sound than theirs, but they have access to more detailed data as well as domain knowledge that I don’t have, so maybe it’s a wash.
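The fixed-rating simulation described above can be sketched roughly as follows. This is an illustration, not my production code: the four-team field and its ratings are invented, the helper names are hypothetical, and the bracket is assumed to have a power-of-two number of teams listed in bracket order (adjacent teams meet in the first round).

```python
import random
from collections import Counter


def win_prob(rating_a, rating_b):
    """Neutral-court probability that A beats B."""
    a_edge = rating_a * (1 - rating_b)
    return a_edge / (a_edge + rating_b * (1 - rating_a))


def play_bracket(ratings, rng):
    """Play one single-elimination tournament and return the champion."""
    alive = list(ratings)  # bracket order: neighbours meet first
    while len(alive) > 1:
        alive = [
            a if rng.random() < win_prob(ratings[a], ratings[b]) else b
            for a, b in zip(alive[::2], alive[1::2])
        ]
    return alive[0]


def title_odds(ratings, n_sims=100_000, seed=0):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    titles = Counter(play_bracket(ratings, rng) for _ in range(n_sims))
    return {team: titles[team] / n_sims for team in ratings}


# Hypothetical four-team field, ratings treated as known exactly.
field = {"A": 0.90, "B": 0.50, "C": 0.80, "D": 0.60}
odds = title_odds(field, n_sims=20_000)
print(odds)
```

Every simulated tournament here uses the same ratings, which is exactly the simplification the rest of this post argues against.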

But, there’s more to the story.

When I say that Duke’s posterior rating estimate is 0.923, I am choosing my words very carefully. It’s not a rating, it’s a rating estimate. What’s the difference? In a tournament like this, a lot. Duke’s posterior rating COULD BE 0.923. It also could be something else. A strength of Bayesian modeling is that it is able to estimate not only parameter values but also ranges called credible intervals. My 80% credible interval for Duke is from 0.869 to 0.962, meaning that if we simulate 100 parallel universes, in about 20 of those universes the team’s true rating will fall outside of that range. What we really need to inject into our model is a double shot of uncertainty – first we simulate what universe we’re in (i.e. each team’s true rating), then we simulate the outcome of the tournament in that universe. Parameter variance and process variance – the distinction between these two concepts and their importance in modeling for sports betting is one of the topics covered in detail in my Bayesian Sports Betting course.
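Here is a minimal sketch of that "double shot" of uncertainty. The shape of the uncertainty is an assumption made purely for illustration: each team's true rating is drawn from a normal distribution on the log-odds scale, centred on its estimate, with one shared spread `logit_sd`. The real posteriors are team-specific, and the field, ratings and spread below are all hypothetical.

```python
import math
import random
from collections import Counter


def win_prob(rating_a, rating_b):
    """Neutral-court probability that A beats B."""
    a_edge = rating_a * (1 - rating_b)
    return a_edge / (a_edge + rating_b * (1 - rating_a))


def draw_rating(estimate, logit_sd, rng):
    """Draw one plausible 'true' rating around the point estimate."""
    log_odds = math.log(estimate / (1 - estimate))
    return 1 / (1 + math.exp(-rng.gauss(log_odds, logit_sd)))


def play_bracket(ratings, rng):
    """Play one single-elimination tournament and return the champion."""
    alive = list(ratings)
    while len(alive) > 1:
        alive = [
            a if rng.random() < win_prob(ratings[a], ratings[b]) else b
            for a, b in zip(alive[::2], alive[1::2])
        ]
    return alive[0]


def title_odds_two_stage(estimates, logit_sd, n_sims=100_000, seed=0):
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        # Stage 1, parameter variance: which universe are we in?
        universe = {t: draw_rating(r, logit_sd, rng) for t, r in estimates.items()}
        # Stage 2, process variance: play the tournament in that universe.
        titles[play_bracket(universe, rng)] += 1
    return {team: titles[team] / n_sims for team in estimates}


estimates = {"A": 0.90, "B": 0.50, "C": 0.80, "D": 0.60}  # hypothetical
odds = title_odds_two_stage(estimates, logit_sd=0.5, n_sims=20_000)
print(odds)
```

Setting `logit_sd=0.0` collapses this back to the fixed-rating simulation, which makes it easy to compare the two approaches on the same field.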

To understand why parameter variance is especially important to March Madness, consider Team X, whose rating is known to be 0.90. This team would rank #16 and would have a high likelihood of making it to the Sweet 16 or Elite 8 but no further. Now, consider Team Y, whose rating is estimated to be 0.90 but is equally likely in reality to be 0.85 or 0.95. Compared to Team X, Team Y will have a higher likelihood of winning the tournament AND a higher likelihood of not surviving the first weekend. If I’m betting on the tournament winner or a regional winner and I treat Team Y as if they were Team X, I would underestimate their win probability. This is why it’s important to treat each team’s rating as variable (“stochastic” if you’re fancy) rather than fixed.
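The title-run half of the Team X versus Team Y comparison can be checked with simple arithmetic. The sketch below makes one big simplifying assumption that is not how the real bracket works: winning the tournament means beating six straight opponents who are each rated exactly 0.90. It does not attempt to show the early-exit half of the claim.

```python
def win_prob(rating_a, rating_b):
    """Neutral-court probability that A beats B."""
    a_edge = rating_a * (1 - rating_b)
    return a_edge / (a_edge + rating_b * (1 - rating_a))


def p_run(rating, opponents):
    """Probability of beating every opponent in the list, in order."""
    p = 1.0
    for opp in opponents:
        p *= win_prob(rating, opp)
    return p


# Hypothetical path: six games, every opponent rated 0.90.
path = [0.90] * 6

# Team X: rating known to be 0.90.
team_x = p_run(0.90, path)

# Team Y: 50/50 between a true rating of 0.85 and 0.95.
team_y = 0.5 * p_run(0.85, path) + 0.5 * p_run(0.95, path)

print(f"Team X title probability: {team_x:.4f}")
print(f"Team Y title probability: {team_y:.4f}")
```

On this made-up path Team Y's title chance comes out around three times Team X's. Almost all of it comes from the universe where Y is really a 0.95 team, because a six-game run rewards being great far more than it punishes being merely good.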

Incorporating the posterior rating estimates WITH parameter uncertainty, here is the analytics.b(rack)et:

In addition to regional and tournament winners, we can use this bracket to price some common tournament props involving seeds:

Good luck, enjoy the games and may the variance be in your favour!

2 thoughts on “March Madness: Presenting the analytics.b(rack)et!”

  1. Can you go out to more decimals, so that teams are not tied in your power rating estimates?
    I do consider your Gonzaga/Kansas final much more likely than the Gonzaga/Arizona final, that many other analysts project.

  2. Interesting analysis. Shows both the power and limitations of analytics – in that this model is not performing particularly well. How would you adjust this to account for momentum through year end and during the tournament itself from rounds 64, 32, etc.? Thank you for sharing.
