After a poor start to the season relative to lofty preseason expectations, the Boston Celtics walked into the United Center on Saturday night riding a four-game winning streak and feeling confident about beating the woeful 6-20 Chicago Bulls. However, they could not have expected to leave the Windy City on such a high, having recorded a franchise-record 56-point victory, 133-77.
This got me wondering: how likely was this victory? To find out, I fit a model using the 538 dataset, which includes every game in NBA history along with the home and away teams' pre-game Elo ratings. The data used was regular season, non-neutral site games from the 2012/13 to 2017/18 seasons. My first idea was to build a Double Poisson model to simulate a score distribution for the game, which is very similar to some models used to forecast English Premier League soccer matches. For those unfamiliar with the Poisson distribution, it is used to model "count variables" (such as the number of goals in a soccer game, the number of points in a basketball game, or the number of defective products created by a company), which can only take on non-negative integer values. However, a Double Poisson model treats the two teams' scores as independent, and the scores in an NBA game are strongly correlated with one another. The correlation between the home and away team scores in our 7,367 games was a rather large .38 (it's around .1 for soccer). Thus, we needed a more sophisticated model.
The model we settled on was a bivariate Poisson. This model, described in a 2014 research paper by Ke Shen, allows for correlation between the two Poisson-distributed variables. The only features we considered in this model were the two teams' Elo ratings and home-court advantage.
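To make the idea concrete, here is a minimal Python sketch of one standard way to build a bivariate Poisson: each team's score is its own independent Poisson count plus a shared Poisson count, and the shared part is what creates the correlation. We are assuming this "common shock" construction for illustration; it is not taken from the fitted R model, and the function names and the rates 65, 62 and 40 are made up, chosen only so the implied correlation lands near the .38 seen in the data.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_game(lam_home_only, lam_away_only, lam_shared, rng):
    """Return (home_score, away_score) from a common-shock bivariate Poisson."""
    shared = poisson_sample(lam_shared, rng)  # component both teams share
    home = poisson_sample(lam_home_only, rng) + shared
    away = poisson_sample(lam_away_only, rng) + shared
    return home, away


def theoretical_correlation(lam_home_only, lam_away_only, lam_shared):
    """Correlation implied by the construction (covariance equals lam_shared)."""
    var_home = lam_home_only + lam_shared
    var_away = lam_away_only + lam_shared
    return lam_shared / math.sqrt(var_home * var_away)


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    rng = random.Random(2018)
    # Hypothetical rates, NOT the fitted coefficients from the article.
    games = [simulate_game(65, 62, 40, rng) for _ in range(5000)]
    home_scores = [h for h, _ in games]
    away_scores = [a for _, a in games]
    print("simulated correlation:", round(pearson(home_scores, away_scores), 3))
    print("implied correlation:  ", round(theoretical_correlation(65, 62, 40), 3))
```

Setting `lam_shared` to 0 recovers the Double Poisson case, where the two scores are independent. In a fitted model the rates would depend on the Elo ratings and home-court advantage rather than being fixed numbers.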
After fitting the model (R output at end of article), we simulated the game 1 million times to see how often the Celtics came out on top by at least 56 points. The pre-game Elo ratings were 1322 for the Bulls and 1591 for the Celtics. Our 1 million simulations produced the following distribution of game scores:
Based on this analysis, we find that a 56+ point margin in this particular game, played between the Chicago Bulls and Boston Celtics on December 8, 2018 at the United Center, was incredibly unlikely. However, there are 1,230 regular season games in an NBA season, which means that, in theory, there are many opportunities for a game to be as lopsided as this. As a result, we decided to simulate the last 6 NBA seasons 100,000 times to see how often a team won by 56+ points and to map out how many times per season we should expect to see margins of various sizes.
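The single-game simulation step can be sketched in a few lines of Python: draw many games and count the share in which the favorite wins by at least a given margin. This is only an illustration under stated assumptions. It uses a common-shock bivariate Poisson, the function names are ours, and the rates 70, 55 and 40 are hypothetical stand-ins rather than the values the fitted model produced for the Celtics and Bulls.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def blowout_probability(lam_fav_only, lam_dog_only, lam_shared,
                        margin, n_sims, seed=0):
    """Estimate P(favorite wins by >= margin) by Monte Carlo simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        shared = poisson_sample(lam_shared, rng)  # shared by both teams
        favorite = poisson_sample(lam_fav_only, rng) + shared
        underdog = poisson_sample(lam_dog_only, rng) + shared
        if favorite - underdog >= margin:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical rates, NOT the article's fitted values.
    for margin in (10, 20, 30, 40):
        p = blowout_probability(70, 55, 40, margin, n_sims=20000)
        print(f"P(win by {margin}+) is roughly {p:.4f}")
```

One thing worth noticing in this construction is that the shared component cancels when you subtract the two scores, so the margin depends only on the two team-specific rates. Very rare events such as a 56+ point win also need a very large number of simulations before the estimate settles down, which is why a run of 1 million games is a sensible scale.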
We found that in our 100,000 simulations of the last 6 NBA seasons, there were a total of 14,753 games decided by 56 points or more. We also found the following expected number of games per season at each margin of victory:
Here, we find much more reasonable probabilities. Based on this, we should expect to see a game similar to the one we saw on Saturday (50+ point differential) about once every 5 years.
In reality, from the 2012-13 season to the 2017-18 season, there were 4 games that were decided by a 50+ point differential (three by 50-55, one by 61), about 2.8 more than our model predicted. It makes sense that our model would underpredict extreme outcomes, because it does not take into account the way a game changes during garbage time, which is generally when these ludicrous final scores are reached.
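For readers who want to check the arithmetic behind these rates, the short Python snippet below converts the figures quoted in this article into per-season numbers. It assumes that 100,000 simulations of 6 seasons amounts to 600,000 simulated seasons, and that "about 2.8 more than our model predicted" means the model expected roughly 4 - 2.8 = 1.2 such games over the six seasons. The variable names are ours.

```python
# All inputs below are figures quoted in the article.
n_simulations = 100_000
seasons_per_simulation = 6
games_56_plus = 14_753          # simulated games decided by 56+ points

simulated_seasons = n_simulations * seasons_per_simulation
rate_56_plus = games_56_plus / simulated_seasons   # expected games per season
seasons_between_56_plus = 1 / rate_56_plus

observed_50_plus = 4            # actual 50+ point games, 2012-13 to 2017-18
excess_over_model = 2.8         # "about 2.8 more than our model predicted"
predicted_50_plus = observed_50_plus - excess_over_model  # over six seasons
rate_50_plus = predicted_50_plus / seasons_per_simulation
seasons_between_50_plus = 1 / rate_50_plus

if __name__ == "__main__":
    print(f"56+ point games per season: {rate_56_plus:.4f}")
    print(f"seasons between 56+ point games: {seasons_between_56_plus:.1f}")
    print(f"50+ point games per season: {rate_50_plus:.2f}")
    print(f"seasons between 50+ point games: {seasons_between_50_plus:.1f}")
```

Under those assumptions, a 56+ point margin works out to roughly one game every 40 or so seasons in the model, while a 50+ point margin comes out to about one every 5 seasons, which matches the once-every-5-years figure above.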
In summary, the Celtics' victory over the Bulls on Saturday was astounding, and a blowout of this kind is something we should expect to see about once or twice a decade in the NBA.
If you have any questions or comments about this article, please feel free to reach out to Andrew at andrewpuopolo@college.harvard.edu.
R Output For Bivariate Poisson: