By John Ezekowitz and Andrew Cohen

The allure of March Madness is due in no small part to the complementary joys of filling out a bracket and watching first- and second-round upsets. America loves an underdog, and the first four days of the NCAA Tournament are the perfect forum for underdogs and upsets. America also loves to be right; there is nothing quite so satisfying as picking the correct 12-5 upset. While in our fallible recollections it may seem as if there are plenty of upsets (defined as a victory by a team five or more seed lines below its opposition), fewer than a quarter of first- and second-round games are upsets.

What, then, separates the underdogs that are successful in their upset bids from those that are not? Join us after the jump as we analyze a theory put forward by no less an authority than Dean Oliver, the father of modern tempo-free basketball statistics.

In his book “What Wins Basketball Games,” Oliver hypothesizes that underdogs should engage in “risky strategies,” those that increase the variance of the score. One of these strategies would be slowing down the tempo of the game, leaving both teams with fewer possessions. The logic goes that the favorite would have fewer chances to press its per-possession advantage, increasing the variance of the score around the natural score, which obviously favors the more talented team.

On the surface, this logic makes sense; allowing the superior team fewer opportunities to score seems like a good, if risky, strategy. Luckily, we can empirically test this theory.

Using tempo data from Ken Pomeroy’s incredible website, we have analyzed every first and second round NCAA tournament game that could produce an upset from 2004 to 2009. We made the decision to exclude 1 vs 16 and 2 vs 15 matchups, since over the last decade those games have produced no upsets. That left us with a sample of 144 games to analyze.

The results were, well, stunning. There were 35 upsets in the data set, and the average tempo of those games was 67.77 possessions, while the average tempo of the games in which the underdog lost was only 64.93. Using a two-sample t-test, we **reject the null hypothesis that there was no difference between the tempos of successful and unsuccessful underdogs**, and conclude that successful NCAA tournament underdogs played at a faster tempo, using more possessions. The P-value for this test statistic was 0.0134, which is well below the conventional 0.05 threshold for significance. This result directly conflicts with Oliver’s theory, and warrants further study.
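For readers who want to run the same kind of comparison, here is a minimal sketch of a two-sample t-test using only the Python standard library. It is not our actual analysis: the unpooled (Welch) form, the normal approximation to the p-value, the function name `two_sample_t`, and the tempo lists are all assumptions made for illustration, and the lists are made-up numbers rather than tournament data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_t(sample_a, sample_b):
    """Return (t statistic, approximate two-sided p-value), unpooled variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Standard error of the difference in means, without assuming equal variances.
    se = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    t = (mean(sample_a) - mean(sample_b)) / se
    # Assumption: approximate the t distribution with a standard normal.
    # That is reasonable for groups of 35 and 109 games, rough for tiny samples.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical tempos, for illustration only (NOT the tournament data).
upset_tempos = [70.0, 66.5, 68.0, 71.5, 64.0, 69.0]
non_upset_tempos = [63.0, 66.0, 61.5, 65.0, 67.5, 62.0]

t_stat, p_value = two_sample_t(upset_tempos, non_upset_tempos)
print(f"t = {t_stat:.2f}, approximate p = {p_value:.4f}")
```

A positive t statistic here means the first group (the upsets) played faster on average; a small p-value means a gap that large would be unlikely if the two groups truly had the same mean tempo.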

Having seen that successful underdogs play faster, we attempted to determine how much of an advantage playing faster was. We ran a logistic regression on the dataset, with an underdog win coded as 1 and a loss as 0. The results were slightly weaker than those of the t-test, with a P-value of 0.021, but were still statistically significant. **Each additional possession in a game increased the underdog’s odds of an upset by 7.7 percent.**
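To make the 7.7 percent figure concrete, the sketch below converts it into a change in upset probability. It assumes the figure is an odds ratio of 1.077 per possession (which is how a logistic regression coefficient is usually reported) and, purely as an illustrative baseline, uses the sample upset rate of 35 in 144. The function names are ours and hypothetical.

```python
from math import log

# From the post: each extra possession multiplies the upset odds by 1.077.
ODDS_RATIO_PER_POSSESSION = 1.077


def prob_to_odds(p):
    """Convert a probability to odds, e.g. 0.25 -> 1/3."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def upset_probability(baseline_prob, extra_possessions):
    """Scale the baseline odds by the per-possession odds ratio."""
    odds = prob_to_odds(baseline_prob) * ODDS_RATIO_PER_POSSESSION ** extra_possessions
    return odds_to_prob(odds)


# Implied logistic coefficient on tempo: the log of the odds ratio.
beta = log(ODDS_RATIO_PER_POSSESSION)

# Assumption: treat the sample upset rate (35 of 144 games) as the baseline.
baseline = 35 / 144
print(f"implied coefficient = {beta:.4f}")
print(f"baseline = {baseline:.3f}, one more possession = {upset_probability(baseline, 1):.3f}")
```

Note that a 7.7 percent increase in odds is not a 7.7 percentage point increase in probability; from a baseline near one in four, one extra possession moves the probability by only a point or so.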

What could be the reasons behind this rejection of the theory? One potential objection would be that teams that play faster in the tournament play faster over the course of the year, and that perhaps their faster tempo is in fact slower in the NCAA tournament than during the season. However, the average adjusted year-long tempos of the successful and unsuccessful underdogs are practically the same (successful underdogs played on average 0.3 possessions per game faster), and successful underdogs played, on average, 1.4 possessions faster in their upsets than over the course of the season.

Perhaps playing faster helps level the playing field for smaller, mid-major teams who often cannot match up with the frontcourts of the favored teams. Another reason could be that the tempo of close games, which upsets often are, is artificially inflated by the fouling that occurs late in games. Similarly, the tempo of blowout games, which favorites tend to win, could be artificially deflated by favorites using up the shot clock down the stretch.

We will definitely investigate this possibility in future posts. Frankly, however, we cannot think of a satisfactory explanation for why this data contradicts Oliver’s theory. Any thoughts or explanations would be appreciated in the comments.

OK, here’s my sort of stream-of-consciousness response.

– One thing jumps out at me: 67ish possessions isn’t exactly fast — it’s merely average. Wonder if that has something to do with it, since it still could be significantly slower than what their opponent wants to play.

– Not to quibble too much with your methodology, because it seems like a great starting spot, given that you were working from Oliver’s hypothesis that slow tempo alone should increase upsets. But I don’t think it gets to the real heart of the question.

How far below or above their opponents’ typical pace were the upsets played? Are upsets typically played at a faster or slower pace than what the favored team likes? That would seem more relevant, since I think we all agree that sports are as much about matchups as anything else.

– Also, it seems that lumping in all potential upsets together isn’t the way to go. An 8/9 is basically a non-upset because those teams are essentially viewed as interchangeable by the committee. Oliver’s theory seems to be predicated on games where there’s a wide talent gap — or, at the very least, you’d expect the variance to have more of an impact when there was a wide talent gap. Perhaps it might be more effective to analyze by seeding line?

I dunno. Just some ideas.

Nuss:

Thanks for the thoughts. I agree that the opponent’s average tempo is something I’d like to have. However there is a reason that 66.5 is roughly the average tempo in college basketball. I have a feeling that on average, favorites that win and lose will play at about the same pace.

With regard to the seeding issue: perhaps I didn’t make this clear enough in the post, but the 8/9 and 7/10 games were not included. What was included was 2nd round games with upset potential, so every 1 vs. 8/9 game and 2 vs. 7/10 game was included.

John and Andrew,

Fantastic post; really thought provoking stuff. Is there any data on possessions in the first 30 minutes of the game? That could eliminate the issues that come from running the clock or fouling at the end of games. If the first 30 mins isn’t available, even examining just the first half might lead to some useful insight.

Since the upsets are (presumably) a smaller margin than the wins, you’re right that we would expect many more fouling sequences during the upsets. The real question is whether this is driving the result or is it something else? By looking at just the first half, we may get an idea.

The theory about better teams winning the low post battles is interesting. Maybe the upset winners are those who play a very contrasting style (possibly more uptempo) and don’t try to beat the favorites at their own game.

Once again, nice work.

This is interesting. Three questions:

1) Are both distributions normal with similar variance? My guess would be that highly unusual tempos favor the upset. In particular, you might see a handful of outlying high-tempo games influencing the average.

2) Is it possible to break it down by number of possessions in the first half or in the first 30 minutes, to compensate for the effect of late fouling?

3) More random: Pomeroy has correlations between tempo and OE and DE under “game plan”. Did you look at all at whether there’s any difference in the ability to handle different tempos among teams who are upset and those who aren’t?

Jeff:

Thanks for the comment. In answer to 1), yes both distributions can be considered normal. Neither showed any pronounced skew nor did they have outliers. Additionally, the variances were almost identical; so close that I would’ve been justified in using a pooled variance t-test if I so chose.

As for 2), I think such numbers would be very informative, but short of combing through individual play-by-plays, the numbers are not available. Stay tuned for our next post about Margin of Victory and tempo (see Daniel’s comment above). We got some very interesting results.

Finally, I will take a closer look at KenPom’s game plan numbers, but I’m not convinced they are germane in a 1-game underdog/favorite situation.

I’d be interested in a far more narrow analysis here. I’m not sure what your stipulation for an upset was (given that the average for the six-year span was 24 games a year, I’d hazard a guess that you used a three-seed or four-seed gap as the lower bound for defining an upset potential game), but it would be interesting to know the definition you used.

I think it’s only worthwhile to consider the performance of Automatic Qualifiers seeded 12 or worse. At-larges in the 10-12 seed range typically come from major conferences and aren’t going to be concerned with the slow pace-high variance connection. They also happen to be the lower seeds doing more of the upsetting, so you’ve got some nasty reverse causality there.

Even without Pomeroy, you can go back through the years and use the thumbnail possessions calculation FGA - OREB + TO + 0.44 × FTA (calculate for both teams and divide by 2) to gather more data for your sample.

I think those two changes would yield a more interesting and more convincing result.
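The thumbnail possessions calculation mentioned above can be sketched in a few lines. The function names and the box score numbers below are hypothetical, chosen only to show the arithmetic; the formula itself is the one quoted in the comment.

```python
def team_possessions(fga, oreb, to, fta, ft_weight=0.44):
    """Estimate one team's possessions: FGA - OREB + TO + 0.44 * FTA."""
    return fga - oreb + to + ft_weight * fta


def game_possessions(team_a, team_b):
    """Average the two teams' estimates to get a single tempo figure for the game."""
    return (team_possessions(**team_a) + team_possessions(**team_b)) / 2


# Hypothetical box score lines, for illustration only.
team_a = {"fga": 55, "oreb": 10, "to": 12, "fta": 20}
team_b = {"fga": 58, "oreb": 12, "to": 11, "fta": 15}

print(f"estimated possessions: {game_possessions(team_a, team_b):.1f}")
```

Averaging the two teams smooths out the estimate, since each side should have nearly the same number of possessions in a game.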