A Method to the Madness: Predicting NCAA Tournament Success

By Frank Zhu

It’s time for March Madness. Every year since 1985, tens of millions of brackets attempting to predict the results of 63 college basketball games are submitted, and every year, none has stayed perfect. According to the NCAA, this year was the first year that a bracket has been perfect entering the Sweet Sixteen.

With the stakes in the millions of dollars (or occasionally, a billion dollars), it’s no wonder an online search for “march madness bracket tips” brings up hundreds of thousands of results. This article tests three myths of March Madness to help you build your bracket, using NCAA regular season data from 2002 to 2017 gathered from KenPom and Kaggle.

Since teams with better seeds (a 1 seed, for example) tend to be stronger than teams with worse seeds, this article uses a metric called performance to adjust for this difference.

Performance is the actual number of games a team won in a tournament minus the average number of games teams with that seeding have historically won in a tournament. On average, each 1st seed has won 3.3 games every tournament. In the case of Virginia last year, they lost in the first round, failing to win a single game. As a result, their performance would be 0 (the number of games won) – 3.3 (games they were expected to win), or -3.3.
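The calculation above can be sketched in a few lines of Python. This is only an illustration: the `EXPECTED_WINS` table and the `performance` function are hypothetical names, and the table holds just the one seed average quoted in this paragraph.

```python
# Hypothetical lookup of average tournament wins by seed.
# Only the 1-seed value quoted in the article is filled in.
EXPECTED_WINS = {1: 3.3}


def performance(seed, games_won):
    """Actual wins minus the historical average wins for that seed."""
    return games_won - EXPECTED_WINS[seed]


# The Virginia example from the text: a 1 seed that won zero games.
virginia = performance(seed=1, games_won=0)
print(round(virginia, 1))  # -3.3
```

A positive value means a team won more games than its seed usually does, and a negative value means it fell short.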

Every team that played in the NCAA tournament from 2002 to 2017 was given a performance rating, and each team’s performance was compared to stats such as points scored, points allowed, and team experience to determine which factors predicted tournament success. This article uses R² as a measure of how well each factor correlates with performance. R² falls between 0 and 1 and represents the proportion of the variance in performance explained by the model. The closer it is to 1, the more closely the factor is related to a team under- or over-performing its seeding.
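For readers who want to see how such a number is produced, here is a minimal sketch. It assumes each factor is fit against performance with a one-variable least-squares line, which the article does not state explicitly. The function name `r_squared` and the sample numbers are invented for illustration and are not the article's data.

```python
def r_squared(xs, ys):
    """R² of a least-squares line fit of ys on a single predictor xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variance left over after the fit, versus total variance in ys.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented data: a perfect linear relationship, and no linear relationship.
perfect = r_squared([1, 2, 3, 4], [3, 5, 7, 9])
unrelated = r_squared([1, 2, 3, 4], [1, -1, -1, 1])
print(perfect, unrelated)  # 1.0 0.0
```

The values reported in this article sit very close to the second case.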

Myth #1: Offense/Defense wins championships.

As seen in the graphs above, neither a great offense nor a great defense during the regular season is a great predictor of success. The regression of defense vs. performance had an R² value of .006, and the regression of offense vs. performance had an R² value of .013. Both are close to zero and indicate little correlation.

Furthermore, there is a wide range of performance values for each X value. But what if the key to success is having both a good offense and a good defense? Below is a graph of performance against a team’s combined offensive and defensive rank. Lower is better: a team ranked first in offense had the best offense in the country, and teams with larger rank numbers had worse offenses.

On closer inspection, the R² value for this regression is 0.019, indicating very little correlation. However, the trendline appears to slope upwards, paradoxically suggesting that a worse offense and defense lead to success.

To resolve this paradox, let’s look at four points (highlighted in green) that stand out on the right of the graph – 2012 Norfolk State, 2015 UAB, 2008 San Diego, and 2005 Bucknell. What these schools have in common is that they pulled off upsets – 15th seeded Norfolk State knocked off 2nd seeded Missouri, UAB and Bucknell won as 14th seeds, and 13th seeded San Diego won its first-round matchup against 4th seeded Connecticut.

These four teams are indicative of a wider trend in the data: since expectations for teams with worse seeds are minimal, winning a single game causes them to overperform. On average, a 13th, 14th, and 15th seed win .234, .125, and .078 games per tournament, respectively. This graph shows that being the better team overall is no guarantee of success – upsets do happen. Let’s take a look at two more myths.
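To make the size of this effect concrete, the short sketch below applies the performance formula to the seed averages quoted in this article. The list of (seed, wins) cases is made up for comparison, and the variable names are hypothetical.

```python
# Average wins per tournament by seed, as quoted in the article.
EXPECTED_WINS = {1: 3.3, 13: 0.234, 14: 0.125, 15: 0.078}

# Made-up comparison cases: (seed, games won in the tournament).
cases = [(15, 1), (14, 1), (13, 1), (1, 3)]

performances = []
for seed, wins in cases:
    perf = wins - EXPECTED_WINS[seed]
    performances.append(perf)
    print(f"seed {seed:>2}, {wins} win(s): performance {perf:+.3f}")
```

Under these numbers, a 15th seed with a single win scores +0.922, while a 1st seed that wins three games and reaches the Elite Eight scores -0.300.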


Myth #2: 3-Pointers are the key to success

Like the NBA, college basketball teams are shooting more threes than ever before. Does launching more threes or being accurate from deep lead to more success?

The answer is no. The spread of performance is highly variable. Teams reliant on the three have overperformed, and they have also crashed out in the first round. Teams that shoot fewer threes have likewise both overperformed and lost early. Neither the 3-point rate (R² = 0.0009), which is the share of possessions that end with a three-point attempt, nor the percentage of three-pointers made (R² = 0.0009) correlated with a team’s performance.
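The two shooting stats can be written out as simple ratios. The sketch below follows the definitions given in this paragraph. The function names and the season totals are invented for illustration, and the source data may compute the rate slightly differently.

```python
def three_point_rate(three_attempts, possessions):
    """Share of possessions that end with a three-point attempt."""
    return three_attempts / possessions


def three_point_pct(three_made, three_attempts):
    """Share of three-point attempts that go in."""
    return three_made / three_attempts


# Invented season totals for one team.
rate = three_point_rate(three_attempts=800, possessions=2400)
pct = three_point_pct(three_made=280, three_attempts=800)
print(round(rate, 3), round(pct, 3))  # 0.333 0.35
```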

Myth #3: Experience Matters

Experience is one of those “intangibles” announcers love – and when March Madness pits an underdog team with veteran players versus a team of “one-and-dones” and five-star recruits, it makes for a compelling storyline. But does having a team with more “basketball IQ” and more games under their belts lead to outperforming expectations?

To determine whether experience affects performance, KenPom’s experience metric was used. Experience (in years) was calculated by taking each team’s players and weighting their class year (0 for a freshman, 1 for a sophomore, etc.) by regular season minutes played. A team with an experience closer to 0 relies more on freshmen, and a team with a higher experience would have more upperclassmen.
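As a rough sketch of that description, the experience figure is a minutes-weighted average of class year. The code below follows the wording in this article rather than KenPom’s exact formula, which may differ. The function name and both rosters are invented.

```python
def team_experience(players):
    """Minutes-weighted average class year (0 = freshman, 3 = senior).

    `players` is a list of (class_year, regular_season_minutes) pairs.
    """
    total_minutes = sum(minutes for _, minutes in players)
    weighted_years = sum(year * minutes for year, minutes in players)
    return weighted_years / total_minutes


# Invented rosters: one leaning on freshmen, one leaning on seniors.
young_team = [(0, 900), (0, 850), (1, 700), (3, 150)]
veteran_team = [(3, 900), (3, 850), (2, 700), (0, 150)]

young = team_experience(young_team)
veteran = team_experience(veteran_team)
print(round(young, 2), round(veteran, 2))  # 0.44 2.56
```

Because the average is weighted by minutes, a senior who rarely plays barely moves the number, which matches the idea that the metric reflects who is actually on the floor.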

The graph of average team experience (R² = .0002) shows no correlation between having more experienced players and outperforming expectations. This could be because freshmen who declare for the NBA draft after one year – the “one-and-dones” – are typically extremely talented, and their talent could make up for a possible lack of experience.

As a caveat, however, this measure of experience does not take into account two factors: redshirting, which means a player has spent a year longer in college than their class year indicates, and regular season injuries, which could mean the regular season distribution of minutes does not reflect who plays during the tournament.

So we’ve busted several bracket-building myths. Having a great offense or defense (or both) doesn’t predict overperformance. Looking for teams that resemble the Houston Rockets and hoist dozens of threes a game won’t help, and even though cheering for a team with veteran players might be fantastic, there is no discernible effect of player experience on success.

However, at its core, the unpredictability of March Madness is what makes it so entertaining. Regardless of whether your bracket nails the first 20 games or gets none of them right, good luck and enjoy watching!

If you have any questions for Frank about this article, please feel free to reach out to him at frank_zhu@college.harvard.edu


1 Comment

  • I don’t know of a single person claiming that raw pts scored or allowed would predict tourney out-performance. Same with shooting three pointers.

    How about things like pace and adjusted metrics?
