Earlier this week, we used data from Basketball-Reference to examine the NBA’s trend toward older coaches. Now, let’s look at whether coaches of different ages see different outcomes. To do this, we can create a model that predicts the winning percentage of a given team based on its coach’s age. The results of this model are as follows:

Coach age is correlated with winning percentage in a way that is almost significant at the 0.05 level. Because the estimated coefficient is positive, the model tells us that older coaches fare better and that a one-year increase in coach age is associated with a 0.09% increase in winning percentage. Given the model’s low R^{2} of 0.002, coach age is not a terribly informative piece of data, but the relationship is almost significant nonetheless.

There is, however, a major flaw in the model: survivorship bias. Old coaches who are bad are harder to come by because they’re typically fired before they can get old. As a result, the older NBA coaches who are left are disproportionately successful. For this reason, we should control for years of experience. When we do so, our model yields the following results:

Suddenly, we get a different story. Once you control for years of experience (which is a strong predictor of performance), the coefficient on coach age is actually *negative*. This relationship is significant, and it suggests that younger coaches fare better. Accounting for years of experience, each one-year increase in coach age is associated with a 0.135% decrease in winning percentage.

The optimal coaching candidate for NBA teams is someone with a blend of youth and experience. This model says that we’d expect the most successful coach in the NBA to be Doc Rivers, who is not exactly a kid at age 55 but has amassed 17 years of experience in a relatively short amount of time.

Looking only at coaches in their first year—a subset that excludes successful, long-tenured coaches like Gregg Popovich—experience is still a significantly positive predictor of success, and age is still a significantly negative one:

One area of future analysis would be to see how teams led by young coaches differ stylistically from the norm. If, for instance, younger coaches are more likely to embrace analytics, we might expect their teams to shoot more threes.

For now, this much is clear: it’s good to be young, but it’s even better to be experienced.

]]>

What’s remarkable about Phoenix Suns coach Earl Watson isn’t simply how young he is. Yes, Watson was hired last February at the tender age of 36, and yes, he played in the league as recently as 2014. But what’s most notable is this: Watson isn’t even the NBA’s youngest coach.

That title goes to Los Angeles Lakers coach Luke Walton, who will be 36 until late March. In fact, young coaches seem to be sprouting up throughout the league, from Celtics wunderkind Brad Stevens (40 years old) to Cavs’ championship winner Tyronn Lue (39 years old). So young is Lue that he was once the victim of a ferocious block by his current player LeBron James. Thankfully, James has since blocked a shot or two in Lue’s favor.

With this in mind, let’s look into whether these cases are the exception or the rule. Are NBA coaches getting younger? To answer this question, we can use data from Basketball-Reference, which has information on coaching hires over time.

As the chart above shows, the average age of head coaches has generally increased since the ABA-NBA merger in 1976. But some of this trend is driven by the fact that most coaches are retained from year to year, and as those coaches each get one year older, the league average increases.

The better ages to look at are those of *new* coaches. Within this subset, “new” can take on two meanings. There are coaches who are new to the league entirely, like Earl Watson. But there are also new hires for a given year, a category that includes Watson as well as coaches like Tom Thibodeau, who was hired by the Timberwolves this year after a previous run with the Bulls.

When we check the first group—coaches who are new to the league altogether—we see an overall upward trend, though this time it’s not quite as strong or consistent as the relationship we saw for coaches in general. Let’s check whether this correlation holds for new hires, the group that includes experienced coaches in new positions.

While the relationship between season and average coach age isn’t as strong for this subset as it is for the overall coaching pool, it’s still positive and significant. With varying degrees of emphasis, these three charts point to the same conclusion: NBA coaches are getting older.

In the second half of this post, we will try to answer the more intriguing question of what effect, if any, a coach’s age has on his team’s success. Fans of Earl Watson’s Suns will have to wait in the meantime.

]]>You often hear television commentators compare a team to its bitter rival to illustrate the team’s current status. Over time, the nature of rivalries changes as teams move between divisions. Some teams (like Arsenal with St. Totteringham’s Day) often directly compare themselves to their bitter rivals.

I wondered if there was a correlation between the league position of a team and their rival from year to year (positive or negative). To do this, I compiled a list of the 20 biggest rivalries in English Football. I chose 16 of these rivalries from a 2015 Telegraph article and discarded 4 on the basis that one or both of the teams involved had spent considerable time out of the Football League. I then picked 4 additional derbies (Manchester United-Liverpool, Tottenham-Chelsea, Manchester United-Leeds, Preston-Blackpool) out of my own personal curiosity.

I then calculated the correlation between the two rivals’ league positions from the 1958/59 season through the 2015/16 season. If a team finished 5th in the Third Division, its league placing is defined as

*5 + # teams in First Division + # teams in Second Division*
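The placing rule can be sketched as a small function. This is only an illustration: the function name and the division sizes below are assumptions, since the number of teams in each division varied over the period studied.

```python
def overall_position(division_rank, position, division_sizes):
    """Convert a finish within a division into a single league-wide placing.

    division_rank: 1 for the top division, 2 for the second, and so on.
    position: finishing position within that division (1 = champion).
    division_sizes: number of teams in each division, top division first.
    """
    # Every team in a higher division is counted as finishing above this team.
    teams_above = sum(division_sizes[: division_rank - 1])
    return position + teams_above


# Hypothetical season with 22 teams in each of the top two divisions.
sizes = [22, 22, 24, 24]
fifth_in_third = overall_position(3, 5, sizes)  # 5 + 22 + 22
```

Computing this placing for both clubs in every season gives two aligned series per rivalry, which is what the correlation is then taken over.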

The 1958/59 season was chosen as the start date because that was the first season after the Third Division split from being North and South into a Third and Fourth Division. The correlations between each of our rivalries are below.

A few things stand out:

**There was no strong correlation to be found, suggesting that neither a positive nor negative relationship exists**

Across the 20 derbies, there was an average correlation of .08 with a standard deviation of .28. Some derbies showed a small level of positive correlation while others showed a small level of negative correlation. In the end, there isn’t enough evidence to conclude that there is a level of correlation on the whole.

**The Media have got it slightly wrong when portraying the Chelsea-Tottenham rivalry**

In recent years, whenever Chelsea have played Tottenham Hotspur, the media often spin the familiar narrative along the lines of “Spurs used to be the more successful team, winning with stylish football in the 1960s, but in recent years Chelsea have overtaken them with spending power and won many trophies of their own in the last 15 years, sometimes at Tottenham’s expense.” This would imply that there is a negative correlation between Tottenham and Chelsea’s league positions, but the data suggests otherwise.

**The same narrative also doesn’t hold up with the Liverpool-Manchester United rivalry**

The media also have a similar party line when referring to the North West derby between Liverpool and Manchester United. Whenever the two meet, the media often reminds everyone that Liverpool were the team of the 1970s and 80s, winning 11 titles in 15 years in addition to 4 European Cups, but Sir Alex Ferguson at Manchester United was determined to knock “Liverpool off their perch” and since Liverpool’s last title in 1990, Manchester United have won 13 titles. This also would imply a negative correlation, but like with Chelsea and Tottenham, our data shows that there is almost no correlation between the two clubs’ league positions.

**The correlation between Swansea and Cardiff is likely due to their unique position in English Football**

The only derby we had that showed a medium level of correlation was the South Wales Derby between Swansea and Cardiff. The two clubs are the only 2 clubs from Wales that have regularly played in the English Football League system and are therefore subject to the same challenges with regards to assembling their teams. Any change to the Anglo-Welsh relationship affects both of them equally and no one else, so it makes perfect sense that this rivalry would have the highest level of correlation.

]]>There’s nearly nothing America loves more than its annual 5-month dosage of football, topped off by the most-watched television event of the year: the Super Bowl. The Super Bowl offers a unique flavor to sports that most professional athletic finales don’t: whereas in many American sports, a best-of-seven series usually undercuts an underdog’s chances at pulling the rug from under the favorite, in football, only 60 minutes separates each team from everlasting glory. The frequent irreplaceability of star players in critical positions and the game’s fast-paced momentum swings make an upset much more likely. Typically, however, these notions are built off people’s intuition regarding the sport and its biggest stage; what exactly does data surrounding the Super Bowl have to say about underdogs’ chances of pulling an upset? What does it say about the chances of this year’s 3-point underdogs, the Atlanta Falcons, in their quest to overcome the perennial contention of Belichick’s Patriots?

Betting-line data (per VegasInsider.com) can be used to estimate the perception and existence of a favorite in each of the 50 Super Bowls. This data reveals that a respectable 30% of underdogs have won the Super Bowl; in fact, it shows that while games won by favorites are decided on average by 7.83 points, underdogs win by a much larger average margin of victory of 20.17 points. Delving more deeply into the 15 underdog victories reveals that 9 of the victories have come in games with a favorite of 5 or fewer points. Indeed, in the 21 games with a favorite of 5 or fewer points, 8 have ended with the underdogs climbing the podium to accept the Lombardi Trophy. That 38.10% bodes very well for the Falcons, as they have a fairly significant shot at winning the league.

Recent data gives them an even better shot at winning the Super Bowl. Starting with Super Bowl XXXIV in 2000, underdogs have won 7 out of 17 Super Bowls, and in the 9 games with favorites of five or fewer points, the underdog has pulled out a win in 5. In addition, the NFL/NFC teams in the Super Bowl have emerged victorious 26 times, while AFL/AFC teams have won 24 times. In essence, none of this data provides any significant support for a Patriots victory; the game truly is up for grabs. The plotted data for betting line vs. final score difference is fairly uncorrelated, for even though there is a general tendency for favorites to (quite fittingly) win the Super Bowl, the relationship between predicted score differential and actual differential is incredibly scattered. In the case of games expected to be close, the relationship is even fuzzier.

The Super Bowl then takes on the feel of a “who wants it more” finale as opposed to a single chapter of a grueling 7-game series that almost seems fated to end with the “more qualified” team holding the trophy. Perhaps it’s not that simple; there certainly is the chance of less visible trends and factors under the surface-level data that tend to affect the outcomes of the games. However, the relative shortage of any apparent trends in the Super Bowl when contrasted with the finales of other sports helps to explain the game’s unmatched popularity.

]]>Transition buckets are a huge element of success in college hoops. About a quarter of shot opportunities from 2014-2016 in college basketball came on the break, with eFG% in those opportunities rising from 50.2 percent in half-court sets to 54.2 percent in the first ten seconds of possessions. It makes sense, then, that teams should have an incentive to get back on defense once shots are launched. As a Syracuse fan, I know that every time the Orange can stop the ball in transition and force the opponent to play half-court is a small victory.

There’s obviously the other side of this argument—that an offensive board has some sort of inherent value. Offensive boards often lead to easy putbacks, and at the very worst, a reset on the possession if the big is forced to pass the ball back up top.

I have data from the past three full NCAAB seasons, courtesy of Hoop Math’s Jeff Haley, of teams’ offensive and defensive rebounding stats on shots at the rim, jumpers, and three-pointers. I also have data on how often teams shot in transition off of defensive rebounds, which is available on the site itself.

I wanted to see if there was a correlation between a team’s aggressiveness on the offensive boards and a team’s proclivity for giving up easier shots in transition. Measuring a team’s aggressiveness on the boards is tough to do quantitatively, given that it is a relatively subjective measure. However, the way I imagined this (partially inspired by John Gasaway’s article on a similar topic a few years back) was to use defensive rebounding as a measure of a team’s rebounding strength, and the difference between a team’s success on the offensive boards versus defensive boards as a measure of a team’s aggressiveness on the offensive glass. If a team has an NCAA-average DRB% but kills it on the offensive boards, it’s likely that they’re placing a bigger emphasis on it than other teams are. Essentially, I treat offensive rebounding as if it requires the exact same skillset as defensive rebounding. This probably is not 100 percent true, but without any tracking data, it’s tough to say otherwise.

I first calculated a team’s “real” ORB% and DRB%. Because the data Haley provided me was stratified by shot type, I calculated a team’s expected ORB% and DRB% (which I hereby refer to as eDRB% and eORB%) based on the shot type that they took and let up, respectively, compared to NCAA-average. This is motivated by the fact that ORB% and DRB% on different shot types are vastly different. The average on the aforementioned three shot types over these three years was as follows:

I then found the difference between a team’s actual ORB% / DRB% and its eORB% / eDRB% to see how the team compares to NCAA-average. This is to say that if a team’s ORB% was 30%, and its eORB% was 29%, it would receive a +1% rating. I then standardized these quantities across all teams for all three seasons (I’ll refer to the standardized values as addORB% and addDRB%).
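A rough sketch of the offensive-rebounding side of that calculation follows. The league-average rates are placeholder numbers rather than the real averages, and weighting by each team's own mix of missed shots is one assumed reading of "expected" here. The function names are hypothetical.

```python
from statistics import mean, pstdev

# Placeholder league-average offensive rebounding rates by shot type (not real data).
LEAGUE_ORB = {"rim": 0.36, "jumper": 0.28, "three": 0.30}


def expected_orb(missed_shots):
    """League-average ORB% weighted by a team's own mix of missed shots."""
    total = sum(missed_shots.values())
    return sum(LEAGUE_ORB[kind] * n for kind, n in missed_shots.items()) / total


def standardize(values):
    """Convert raw differences into z-scores across all team-seasons."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up team: 300 misses at the rim, 400 on jumpers, 300 on threes.
misses = {"rim": 300, "jumper": 400, "three": 300}
e_orb = expected_orb(misses)
actual_orb = 0.33
raw_diff = actual_orb - e_orb  # positive means better than the shot mix predicts
```

Running `standardize` over the raw differences for every team-season would give the addORB% values. The defensive side works the same way, using the shots a team allows.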

Next, I created a generalized additive model (GAM) that modeled the percentage of fast-break opportunities that came off a shot against a team’s addDRB% (its rebounding skill) and its addORB% (its tendency to crash the offensive boards).

The output of the GAM showed that both variables were significant in predicting transition opportunities for opponents. The plots of the fitted effects for addORB% and addDRB% are as follows (addORB% on left, addDRB% on right):

As you can see on the right, the better a team is on the defensive boards (the measure of skill), the less likely they are to allow transition baskets. However, the more interesting graph is on the left. The graph says that teams that retreat and teams that aggressively crash the glass tend to give up fewer transition opportunities, whereas teams that moderately try on the boards give up about 1 percent more opportunities in transition. This, in a way, makes sense—retreating obviously leads to less transition, but being really good may force defensive teams to allocate more guys to the glass, limiting outlet opportunities for teams in transition. The drop for higher addORB% could also arise as a function of those teams being better at offensive rebounding than defensive rebounding. However, if it’s not, it seems best to be aggressive—not only will you increase your odds of getting a board, but you won’t necessarily be risking too many easy transition baskets.

This doesn’t definitively answer the question of whether or not to crash the glass; it’s tough to answer that without tracking data. A similar question for the NBA was asked at the Sloan Sports Analytics Conference, and it was also suggested that it may be worth it to focus more on offensive boards than immediate retreat. But it seems like in the NCAA, aggression may be the best option.

]]>Over the past few years, Isaiah Thomas has transformed from a role player into an obvious All-Star candidate. He is having a transcendent offensive year, and he has been particularly impressive in the 4th quarter. But in our analysis of the point guards, we came across something very interesting. Isaiah Thomas’s strong offensive season is being partially offset by his subpar defense. To be more specific, Thomas has a 9.1 offensive BPM, which ranks second among players who have played 15 or more games, and a -4.0 defensive BPM, which ranks dead last. This is a remarkable spread, but just how unusual is it?

To quantify this disparity, we can use BPM, a box score statistic that dates back into the 70s on basketball-reference. Looking at all the seasons since the ABA/NBA merger (1976-1977), we can rank all player seasons (minimum of 20 games played) by the absolute value of the difference between their offensive BPM (OBPM) and defensive BPM (DBPM). Here are the top 15:

As we can see, Isaiah Thomas in 2017 is second only to Darko Milicic under this metric. But, since OBPM and DBPM have different means and standard deviations, this may not be the best way to quantify this disparity. Alternatively, we can account for this by taking the z score of every player’s seasons (using every OBPM/DBPM season with more than 20 games as a reference distribution), and ranking seasons by absolute value of the difference between the offensive and defensive z scores. Here are the top 15 seasons under this new metric:
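The z-score version of the ranking could be sketched like this. The player-seasons are invented labels and values, not real BPM data, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of values against its own mean and standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def rank_by_z_gap(seasons):
    """seasons: list of (label, obpm, dbpm). Returns (label, gap) pairs, largest gap first."""
    off_z = z_scores([s[1] for s in seasons])
    def_z = z_scores([s[2] for s in seasons])
    gaps = [(s[0], abs(o - d)) for s, o, d in zip(seasons, off_z, def_z)]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


# Made-up player-seasons used only to exercise the function.
sample = [("A", 9.0, -4.0), ("B", 1.0, 1.0), ("C", -3.0, 3.0), ("D", 0.0, 0.0)]
ranking = rank_by_z_gap(sample)
```

Standardizing each column separately is what lets an offensive point and a defensive point be compared on the same scale, even though the two stats have different spreads.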

Now, Isaiah Thomas has moved to the top of the list, as his OBPM and DBPM are both more extreme outliers relative to their respective means. It is interesting to note that most of the players in a similar stratosphere under these two metrics are centers, who are solid defensively but are a massive negative offensively. In the final graphic here, we can see OBPM graphed versus DBPM:

Here we can see just how far Thomas is from every other player. There seems to be no comparable season with remotely similar attributes. While he is playing very well offensively, it seems unfair to not also mention his defensive struggles when evaluating his overall impact as a player.

]]>It is Saturday, January 7th, and the Minnesota Timberwolves are playing the Utah Jazz in a less than marquee matchup between two small market teams. The Wolves lead for most of the second half, but as time winds down at the end of the game, they find themselves trailing by 2 going into the final possession. With 28 seconds left, Minnesota calls a timeout and draws up a play to get the ball to Karl-Anthony Towns, who launches an 11 foot 2-point attempt that bounces out. Luckily, after a tie up for the rebound, the Wolves recover the jump ball in the backcourt with 15 seconds left on the clock. Here Zach LaVine dribbles the ball up slowly and then waits until the final seconds of the game, when he drives in and attempts a contested 19 footer to tie the game. He misses, and the Wolves let another one slip away.

You can hear Wolves fans groan throughout the arena. This type of shot rarely goes in, and it seems like Zach LaVine has made another immature decision. But, upon further inspection, there may be more than one thing wrong with this decision. Following the game, famed basketball gambler Haralabos Voulgaris expresses annoyance with the Wolves’ game plan.

If you are gonna run the clock out and take the last shot when down 2, at least take a 3. Taking a 2 was absurd.

— Haralabos Voulgaris (@haralabob) January 8, 2017

The obvious inefficiency here is the 20 seconds that Zach LaVine wastes; taking a quick shot allows time to foul if he misses. But Voulgaris is picking up on a more subtle critique here: the Wolves should be looking for a three, not a two. While a two is more likely to go in and thus extend the game, a simple expected value calculation suggests that a three point attempt may be more favorable. With time expiring in the game, shooting a three at a league average rate of 36% yields roughly a 36% chance of winning the game. However, shooting a two at the league average rate of 50% yields only about a 25% chance of winning, because the team must then also win overtime, which is probably close to a toss up.
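The arithmetic behind that comparison is short enough to write out. This is a back-of-the-envelope sketch that uses the league-average rates quoted above and assumes overtime is a coin flip. The function names are illustrative.

```python
def win_prob_three(three_pct):
    """A made three while down two at the buzzer wins the game outright."""
    return three_pct


def win_prob_two(two_pct, overtime_win_pct=0.5):
    """A made two only ties the game, so the team must also win overtime."""
    return two_pct * overtime_win_pct


three = win_prob_three(0.36)  # roughly a 36% chance to win
two = win_prob_two(0.50)      # 50% to tie, then a toss-up in overtime
```

Under these assumptions a two only breaks even with a 36% three if the two goes in about 72% of the time, since 0.72 × 0.5 = 0.36.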

This is a nice theory, but we can put it to the test using actual game data.

To fully understand this scenario, we can scrape all the instances of teams shooting while down two points in the last 20 seconds of a game, going back 10 years, from Basketball Reference. For each shot, we are interested in seconds remaining in the game, shot distance, point value of the shot, whether or not the shot is made, and whether or not that team won the game. There is one thing to be careful of here: the heaves taken in the last few seconds of the game do not fit the criteria we are interested in (as a heave is the team’s only option). To best combat this, we will limit the analysis to shots within 27 feet of the basket. This gives us a data set of 1536 shots since the 2006-2007 season. This analysis does not include fouls, as Basketball Reference does not record where they take place. Because of this, perhaps shots with a high foul frequency, such as layups, are slightly undervalued (although refs probably call significantly fewer fouls in the final twenty seconds of a game).
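As a sketch of the filtering step, assuming each scraped shot has already been parsed into a record (the `Shot` fields and the function name are hypothetical, not the scraper's actual schema):

```python
from dataclasses import dataclass


@dataclass
class Shot:
    seconds_left: float  # seconds remaining in the game
    distance_ft: float   # shot distance in feet
    points: int          # 2 or 3
    made: bool
    team_won: bool
    margin: int          # shooting team's score minus opponent's, before the shot


def late_down_two(shots, max_seconds=20, max_distance=27):
    """Keep shots by a team trailing by two, late in the game, excluding long heaves."""
    return [
        s for s in shots
        if s.margin == -2
        and s.seconds_left <= max_seconds
        and s.distance_ft <= max_distance
    ]


# Made-up records: a long two, a 45-foot heave, a three, and a shot while down four.
shots = [
    Shot(5.0, 19, 2, False, False, -2),
    Shot(1.0, 45, 3, False, False, -2),
    Shot(12.0, 24, 3, True, True, -2),
    Shot(8.0, 3, 2, True, False, -4),
]
kept = late_down_two(shots)
```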

The first hypothesis to put to the test is the idea that holding onto the ball negatively impacts a team’s chance of winning. Below is a graph of two point and three point field goal percentage versus time left in the game. The graphs in this article are made using moving averages along the x axis, since the data set is small and very noisy.
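One simple way to build such a smoothed curve is a centered window over seconds remaining. The window width here is an arbitrary choice for illustration, not necessarily the one used for the graphs, and the function name is hypothetical.

```python
def moving_fg_pct(shots, half_window=2, max_seconds=20):
    """shots: list of (seconds_left, made) pairs.

    Returns {second: FG% among shots within half_window seconds of it},
    with None where no shots fall in the window.
    """
    result = {}
    for t in range(max_seconds + 1):
        window = [made for sec, made in shots if abs(sec - t) <= half_window]
        result[t] = sum(window) / len(window) if window else None
    return result


# Made-up attempts used only to exercise the function.
sample_shots = [(0, False), (1, False), (2, True), (10, True), (11, False)]
curve = moving_fg_pct(sample_shots, half_window=1)
```

A wider window gives a smoother but less responsive curve, which is the usual trade-off with a small, noisy sample.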

From this graph, we can see the negative effects of holding on to the ball. While this analysis does not consider the time that each team has to set up their play, it seems abundantly clear that field goal percentage plummets as the game comes to a close. There are a few things to note here, though. First, field goal percentage for twos and threes is not significantly different with about 10-20 seconds left on the clock. Since a three seems so much more useful here, there may be some hidden value. Second, three point percentage plummets much more quickly than two point percentage between nine seconds and three seconds remaining in the game. Finally, field goal percentage bottoms out with two or fewer seconds left in the game, representing hurried and highly difficult shot attempts. For this reason, we will split the data up into three time frames.

**Time Frame 1: The Last Ditch Shot Attempt (0-2 Seconds Remaining in the Game)**

Below is a graph representing how shot attempts with less than two seconds on the clock vary by distance. The green line represents the league average here from 2001-2015 for reference (big thanks to Matt Goldberg for help with retrieving this data).

The takeaway here seems to be that unless you are getting a shot at the rim, efficiency is incredibly low. It is interesting to note that teams are hitting threes at about the same rate as most mid range shots, so if there is no way to get to the rim it may be best to settle for a three pointer. This would support the argument that Zach LaVine’s 19 foot jumper was a flawed decision. It seems that NBA teams know this for the most part though. As we can see, most of the shot density is concentrated at the rim and the three point line.

We can also look at the splits between the two point and three point attempts, breaking it down by shooting percentage, winning percentage for the team that attempts the shot, win percentage given that the shot goes in, and number of shots taken since the 2006-2007 season.

| | Shooting % | Win % | Win % Given Make | Sample Size |
|---|---|---|---|---|
| 3 Pointer Attempted | 18.75% | 19.23% | 94.87% | 208 |
| 2 Pointer Attempted | 30.19% | 17.86% | 48.38% | 308 |

As seems intuitive, teams almost always win after making a three, and win about fifty percent of the time after making a two. It seems that three pointers are slightly favored, but it is probably advantageous to steer away from this situation if possible.

**Time Frame 2: When The Defense is Hedging Against a Three (3-9 Seconds Remaining in the Game)**

Here we can run the exact same analysis, switching the time frame. Below is the shot selection graph.

Here, we start to see a little bit of value in the midrange. From the field goal percentage breakdowns, it seems likely that defenses are very scared of a three point attempt in this situation, and as a result are allowing midrange and long twos at a roughly league average rate. Again, we can look at the splits between two and three point attempts.

| | Shooting % | Win % | Win % Given Make | Sample Size |
|---|---|---|---|---|
| 3 Pointer Attempted | 21.97% | 18.94% | 65.52% | 132 |
| 2 Pointer Attempted | 41.40% | 21.87% | 37.32% | 343 |

While the three point percentage has not increased much in this time frame, teams are hitting two point attempts over ten percentage points more frequently. Here we see that the two point attempt is favored by roughly three percentage points in win probability. We can also see that NBA teams are generally aware of this, shooting almost three times as many two point attempts as three point attempts in this time frame.

**Time Frame 3: Everything is About Equal (10-20 Seconds Remaining in the Game)**

Again, we can look at the graph of shot selection and tease out a few patterns.

Here, we see the three point percentage rebound to normal. Teams are not as scared of a three if they have ten or more seconds to go down on the other end and score a game winner. Breaking it down by two and three point splits yields the following chart:

| | Shooting % | Win % | Win % Given Make | Sample Size |
|---|---|---|---|---|
| 3 Pointer Attempted | 32.10% | 27.16% | 65.38% | 81 |
| 2 Pointer Attempted | 40.86% | 26.86% | 37.76% | 350 |

This chart may seem a little odd at first. Teams hit three pointers only a bit less frequently than two pointers, and win almost twice as frequently when they go in. So why do teams only win slightly more off of a three point attempt? The explanation for this is offensive rebounding rates. Teams that miss their two point attempt go on to win the game 28% of the time, while teams that miss a three go on to win 19% of the time. This is a drastic difference, given that most of the shots are misses. The one caveat here is that better shooting teams will see an outsized positive impact of shooting a three versus shooting a two, as rebounding rates become less important.

So, what is the takeaway?

As you probably already knew before reading this article, there is a lot of value in getting to the rim in the last few seconds of the game. Not only do these shots have a much higher field goal percentage, but the teams rebound these shots at a much higher rate. It is also interesting to note how obviously we can see game planning against the three in the second time frame. Still, there is a lot of analysis left to be done on late game strategy, and hopefully the rest of the season and playoffs will add a few more events to the data set.

]]>

On Saturday, the Houston Texans entered Gillette Stadium as massive underdogs. As 16-point ‘dogs, the Texans were the fourth-biggest underdogs in NFL playoff history. Every trend and stat suggested that the Patriots were an unconquerable behemoth, from the rather simple “The Patriots are 9-1 in the Divisional Round with Tom Brady” to the more complex, Aaron Schatz’ level of the Patriots being the best team per DVOA, with the Texans between Jaguars and Niners at sixth-worst in the NFL. That, and the fact that Brock Osweiler was a historically bad quarterback who averaged 5.8 yards per attempt—last in the league among QBs with 300 attempts by 0.43 yards per attempt.

Yet, to an unaware observer of the game—although, I’m not sure how long it would take someone who hasn’t watched football before to figure out Brock is an awful quarterback—it didn’t seem as if the Texans thought themselves underdogs. As Bill Barnwell puts forth very eloquently in his Sunday column, Bill O’Brien had many chances to take a shot on fourth down and try to squeeze the most out of the field position they had. They kicked a FG on 4th-and-4 on the NE 15, 4th-and-4 on the NE 28, and on 4th-and-3 on the NE 9, and punted three times in New England territory. Houston seemed like a team content to play a low-risk strategy and hope that Tom Brady would make mistakes at home. I’m not an NFL coach by any stretch of the imagination, but I’ve watched enough football to know that strategy doesn’t bear fruit very often.

In March Madness, it’s pretty widely accepted that teams who are underdogs alter their strategy somewhat to increase their odds of beating a Goliath. From slowing down the tempo to limit the number of possessions and thus increase variance in the outcome of the game, to shooting more three-pointers to do the same, NCAAB underdogs often realize that they are such. Yet, despite having been blown out by Jacoby Brissett in Week 3 by 27 points, Bill O’Brien acted as if he was the favorite in this game: no altering of strategy and no special teams trickery.

But as much fun as it is to pin blame on Bill O’Brien—like how he stayed in Waltham for the Week 3 matchup despite knowing that Patriots’ pre-game traffic is god-awful—this is a plight of all NFL coaches. Rarely do they admit that they don’t expect to win, and rarely do they act like an underdog. Another example of this is Ben McAdoo’s coaching last week. Coming in as 5-point underdogs, McAdoo chose to kick a FG on 4th-and-3 on the GB 8 in the first to open the scoring for both teams, and on 4th-and-4 on the GB 22 up 3-0 in the second quarter. Early leads often cloud the minds of coaches, while the Giants probably should have been aggressive in those situations or just looked over at the other sideline and seen Aaron Rodgers there and thought, “Hey, we might not win this game 12-7.”

As a side-note, I think the one coach good at admitting that his team is the underdog is Mike Tomlin. As I mentioned in an interview with the Pittsburgh Post-Gazette, despite being up 6-0 and 12-0 to the Cowboys in November, Tomlin went for two in both of those situations against the NFC regular-season champions, not letting early success in a game lead to overconfidence. However, not going for a touchdown on 4th-and-1 on the KC 4 yesterday was slightly questionable.

I wanted to see if a team’s status as underdog influenced play-calling early in the game before the game situation starts to dictate all calls. So, I looked at all fourth-down decisions in the first three quarters of the 2015 season (since I had line data for 2015) with the hope of creating models that would predict a team’s decision on fourth down. I split the data into two separate frames—all plays from OWN 30 to OPP 45 and calls from OPP 30 to OPP 0. I did this to make two separate models to predict whether a team would a) go for it or punt, and b) go for it or kick a FG. I ignored plays from the OPP 45 to OPP 30 because play-calling there is conflated with kicker strength and weather, data which I was not able to compile.
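The split into two data frames might look like the following sketch. Field position is encoded as yards from the opponent’s goal line (OWN 30 = 70, OPP 45 = 45, OPP 30 = 30). Both that encoding and the handling of the exact boundary yard lines are assumptions, and the function name is hypothetical.

```python
def model_for_play(yards_to_goal, quarter):
    """Assign a fourth-down play to one of the two models, or None if excluded."""
    if quarter > 3:
        return None              # only the first three quarters are modeled
    if 45 <= yards_to_goal <= 70:
        return "punt_or_go"      # OWN 30 through OPP 45
    if 0 < yards_to_goal <= 30:
        return "fg_or_go"        # OPP 30 down to the goal line
    return None                  # OPP 45 to OPP 30 and deep in own territory are excluded


# A 4th down at the opponent's 15 in the second quarter goes to the FG model.
example = model_for_play(15, 2)
```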

Field Goal or Go For It?

The variables I considered in this model were field position, yards to go to a first down, time remaining in the game, the score differential, and the betting line as a proxy for relative team strength. I tried two different models: an additive model (stepwise logistic regression) and a non-additive one (decision tree).

The logistic regression yielded the following coefficients:

As you can see, line was not a significant predictor of a team’s aggressiveness in play-calling when choosing whether or not to kick a FG.

The decision tree (after pruning based on cross-validation error) yielded the following classification system:

The only distinction the tree was able to make (after pruning) was that it would predict that teams go for it on 4th-and-1 and kick a FG otherwise. As you can see here as well, the line didn’t seem to add any predictive power to modeling fourth-down play-calling.
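To make the pruned tree concrete, it boils down to a single rule on yards to go. The sketch below is a hand-written stand-in for that rule, not output from the actual tree-fitting software; the function names and the sample plays are made up for illustration. Note that the betting line is not even an argument.

```python
def pruned_tree_predict(yards_to_go):
    """Single-split rule from the pruned tree: go on 4th-and-1, kick otherwise."""
    return "go" if yards_to_go <= 1 else "fg"

def accuracy(plays):
    """plays: list of (yards_to_go, actual_decision) pairs."""
    hits = sum(pruned_tree_predict(ytg) == actual for ytg, actual in plays)
    return hits / len(plays)

# Made-up decisions, purely for illustration.
made_up = [(1, "go"), (1, "fg"), (3, "fg"), (7, "fg")]
print(accuracy(made_up))  # 0.75
```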

Punt or Go For It?

Here is the logistic regression model, which, as you can see, does not include the line as a predictor after stepwise reduction.

Here is the decision tree after pruning, which also does not include the line as a predictor.

It’s pretty apparent that, across the NFL as a whole, a team’s standing relative to its opponent does not factor into play-calling early in games.

There’s a popular mantra that “the odds are 50-50, it either happens or it doesn’t,” and, besides Tomlin, it doesn’t seem like many coaches are willing to question conventional wisdom.

]]>Basketball is considered by many to be a streaky and psychological game. Analysts and fans use terms such as “hot” and “cold” to describe shooting performances. But do players get in their own head? After they miss a shot, are they inclined to defer for a few possessions to rebuild their confidence? Conversely, do players build confidence when they make a shot, and look to continue their hot streak on the next possession? A logical way to test this is to look at what a player does the possession directly after a shot attempt. If there was some sort of psychological effect from a miss, players would be less likely to call for the ball and shoot again after a miss than after a make. In this blog post, we will explore this phenomenon in depth in the NBA.

The first step in this analysis is to gather play-by-play data and box scores from Basketball Reference for every game played in the 2016-2017 season up until January 9. After that, we can narrow the play-by-play data down to just field goal attempts. Finally, we can create a matrix indexed by player name and team, and chart the result of each shot, make or miss, recording whether or not the same player took the team’s next shot. This gives us the data set that we will analyze.
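As a rough sketch of that last step, we can walk through one game's field goal attempts in order and tally, for each shooter, how often he also took his team's next shot. The input format (a chronological list of `(team, player, made)` tuples) and the function name are assumptions for illustration, and the sample shots are made up.

```python
from collections import defaultdict

def tally_next_shots(shots):
    """shots: chronological (team, player, made) tuples for a single game.

    Returns {(player, team): {"make": [same, total], "miss": [same, total]}},
    where `same` counts how often the shooter also took his team's next shot.
    """
    tallies = defaultdict(lambda: {"make": [0, 0], "miss": [0, 0]})
    last_by_team = {}  # team -> (player, made) for that team's previous shot
    for team, player, made in shots:
        prev = last_by_team.get(team)
        if prev is not None:
            prev_player, prev_made = prev
            cell = tallies[(prev_player, team)]["make" if prev_made else "miss"]
            cell[1] += 1
            if player == prev_player:
                cell[0] += 1
        last_by_team[team] = (player, made)
    return tallies

# Made-up shots, purely for illustration.
game = [
    ("HOME", "Player A", True),
    ("AWAY", "Player X", False),
    ("HOME", "Player A", False),   # A shoots again after his make
    ("AWAY", "Player X", True),    # X shoots again after his miss
    ("HOME", "Player B", True),    # A defers after his miss
]
result = tally_next_shots(game)
print(result[("Player A", "HOME")])  # {'make': [1, 1], 'miss': [0, 1]}
```

Running this over every game and summing the tallies would give the league-wide and per-player rates used below. As written, it does not reset at quarter breaks or substitutions.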

The first question is whether or not we can see significant evidence of a league-wide difference between the chance that the shooter stays the same after a make and after a miss. The null hypothesis here is that players have the same chance of shooting again after either a miss or a make, and the alternative hypothesis is that the chance increases or decreases based on the result of the previous shot. The league average comes out so that the same player shoots again 22% of the time after a make and 16.5% of the time after a miss. A two-sample t-test yields a p-value of 2.2e-16, which means this difference is highly significant, and there is a very low probability that it is due to random variation.
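For readers who want to try a comparison like this themselves, here is a minimal sketch using a two-proportion z-test. To be clear, this is a swapped-in stand-in, not the t-test reported above, though for 0/1 outcomes and samples this large the two should behave very similarly. The shot counts in the example are hypothetical and were chosen only to match the 22% and 16.5% rates.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis that the rates are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 2,200 of 10,000 after a make, 1,650 of 10,000 after a miss.
z, p = two_proportion_z(2200, 10000, 1650, 10000)
print(round(z, 2), p < 0.001)
```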

It is worth noting here that this specific conclusion is not original analysis; it has already been demonstrated here by HSAC alumnus John Ezekowitz. For the rest of this article, we will try to further his analysis by considering how this effect manifests in individual players and teams.

So, how does this apply to individual players?

This graph gives us insight into four measured variables. On the x and y axes, we have the chance that a player takes the next shot after he misses and after he makes, respectively. The color of the bubble is the difference, or “spread,” between the two values (make minus miss). And finally, the size of the bubble is the number of shots that the player has taken in the season up until this point. This graph is restricted to players who have both made and missed more than 50 shots up until this point in the season.

A few things to note. Most of the players in this chart sit above the y = x line, which means that they are more inclined to shoot after a make than after a miss. We also have a group of players that sits pretty far below the line, including players such as Willie Reed, Rudy Gobert and DeAndre Jordan. This group highlights one slight flaw in this analysis. Tall centers are fairly likely to get their own rebound on a close layup and try to put it up again. This does not play into our intuition of the hot hand, because they are nearly forced to put the shot up again as it is obviously the best team option. Still, despite this anomaly, the overwhelming majority of players are subject to this “hot hand” effect.

In addition to this chart, we can look at a list of players ranked by the difference between their probability of taking a shot after a make and probability after a miss, or “spread” for short.
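The spread and the 50-shot filter can be computed along these lines. This is a sketch under assumed inputs: the `counts` layout and the function name are made up for illustration, and the sample numbers are invented rather than taken from the real data.

```python
def player_spreads(counts, min_shots=50):
    """counts: {player: (same_after_make, makes, same_after_miss, misses)}.

    Returns (player, p_after_make, p_after_miss, spread) rows, sorted by
    spread from largest to smallest.
    """
    rows = []
    for player, (same_make, makes, same_miss, misses) in counts.items():
        # Keep only players with more than `min_shots` makes AND misses.
        if makes <= min_shots or misses <= min_shots:
            continue
        p_make = same_make / makes
        p_miss = same_miss / misses
        rows.append((player, p_make, p_miss, p_make - p_miss))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

# Invented counts, purely for illustration.
counts = {
    "Guard A": (30, 100, 15, 100),
    "Center B": (10, 80, 20, 80),
    "Bench C": (5, 20, 2, 30),   # too few shots, filtered out
}
for row in player_spreads(counts):
    print(row[0], round(row[3], 3))
```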

When organized by spread, the point guards and shooting guards are more likely to be on the top half of the list, and the power forwards and centers are more likely to be on the bottom half of the list. One possible explanation was mentioned above, that taller players are more likely to get their own rebound and put it directly back up without restarting the play. The other theory is that smaller players shoot a higher number of jump shots, which could potentially have more of a psychological effect due to higher variance.

The player analysis is clearly going to be subject to a fair amount of random variance. An interesting extension would be to check and see if players were highly correlated year to year under this metric. Also, we are not accounting for quarter changes and when a player subs out, so this data could be cleaned up further.

Once we have established that this is a significant phenomenon, and broken it down by individual players, the final point that we will explore is the distribution of teams.

Every single team is positive under this metric, but to varying degrees. Teams like Cleveland and Oklahoma City have a high chance of the same player shooting after both a miss and a make, probably because a small number of players dominate the ball. The full team results are as follows:

It is interesting to note that the top half of this list is composed of stronger teams than the bottom half. It is not a perfect correlation by any stretch, but it is strong enough that there is probably some underlying cause.

From this analysis, we get an interesting insight into how widespread this tendency is across the league, and which players and teams suffer from it the most. With the current literature suggesting that the hot hand does not exist or exists only to a small degree, this analysis highlights two possible inefficiencies in the NBA. First, a team may be forcing the ball to someone they think is “hot” over someone who has missed a few shots, even when the “hot” player is guarded more heavily. Conversely, teams on defense could be shifting toward players like John Wall after a make, and shifting away after a miss, as his shot selection is greatly influenced by his previous shot. Regardless, it seems that NBA players are not perfectly rational actors.

]]>The two-minute drill is one of the most exciting sequences in the NFL. Success in crunch-time can launch Hall of Fame careers, while failure to convert can aid in your becoming one of the most maligned figures in all of sports. Whether it’s fair or not, quarterbacks are heavily judged by their ability to lead offenses with little time on the clock. Every second, therefore, is crucial to a quarterback’s reputation.

It makes sense, then, that teams sometimes spike the ball to stop the clock. This may be one of the least exciting plays in the game, even if its fakes are some of the most exciting. But at its core, it’s a trade: a down for some extra time on the clock and less chaos when lining up at the line of scrimmage. With less time on the clock, the marginal utility of an additional down dwindles, so having two or three downs as opposed to three or four, respectively, is not that big of a deal.

However, there’s an alternative argument: the defensive chaos may outweigh the offensive chaos, and, by extension, an offense’s next play might be more effective in the no-huddle than it would be if both teams got to regroup.

To look at this problem, I used NFL Savant play-by-play data from last season and this season through the end of November. To avoid meaningless drives at the end of blowout games, I only looked at the last two minutes of the second quarter. I then filtered the data to find instances where teams may be inclined to stop the clock, looking at plays on 1st or 2nd down between 2:00 and 0:10 (to filter out spikes to set up field goals) where teams opted not to huddle. In total, I found 421 times where teams ran a no-huddle play, and only 29 times where teams forfeited a down and spiked the ball.
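The filter described above looks roughly like this in code. Treat it as a sketch: the dictionary keys (`quarter`, `down`, `seconds_left_in_quarter`, `is_spike`, `is_no_huddle`) are hypothetical names rather than the actual NFL Savant column names, the inclusive time bounds are an assumption, and the sample plays are made up.

```python
def two_minute_candidates(plays):
    """Keep spikes and no-huddle plays on 1st/2nd down late in the second quarter."""
    keep = []
    for play in plays:
        # Between 2:00 and 0:10 left in the second quarter (assumed inclusive).
        in_window = (play["quarter"] == 2
                     and 10 <= play["seconds_left_in_quarter"] <= 120)
        early_down = play["down"] in (1, 2)
        if in_window and early_down and (play["is_spike"] or play["is_no_huddle"]):
            keep.append(play)
    return keep

def make_play(quarter, down, seconds, spike=False, no_huddle=False):
    """Build a made-up play record for the example below."""
    return {"quarter": quarter, "down": down, "seconds_left_in_quarter": seconds,
            "is_spike": spike, "is_no_huddle": no_huddle}

plays = [
    make_play(2, 1, 95, no_huddle=True),   # kept
    make_play(2, 2, 40, spike=True),       # kept
    make_play(2, 1, 5, spike=True),        # too late: likely setting up a FG
    make_play(4, 1, 60, no_huddle=True),   # wrong quarter
    make_play(2, 3, 60, no_huddle=True),   # 3rd down
]
kept = two_minute_candidates(plays)
print(len(kept))  # 2
```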

There are two main questions when it comes to spiking: how much time are you saving by spiking, and is it more effective to reset or keep the defense on its heels?

The first question is simple enough. It’s just a matter of comparing the average time that elapses from the previous snap to a spike with the average time from the previous snap to a no-huddle play.

| Time From Previous Snap to Spike | Time From Previous Snap to No-Huddle Play |
| 15.7 seconds | 19.3 seconds |

It turns out that this is indeed a statistically significant difference. So, essentially, a team is saving about 3.6 seconds of game time, at the cost of a down, by spiking the ball. Is that worth losing a down?

To look at this particular part of the problem, I ran a relatively simple regression. For spikes, I looked at the yards gained on the subsequent play (the spike itself gains nothing), while for no-huddle plays I looked at the yards gained on the play itself. For my predictors, I looked at the yard-line on which the play took place, how many seconds were left in the half (as a gauge for aggressiveness), whether or not there was a spike beforehand, and all three possible pairwise interactions. I then performed stepwise regression, which yielded the following model:
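To spell out what the full model looks like before stepwise selection, here is a sketch of how the response and one row of predictors could be built. The function names and dictionary keys are made up for illustration, and the fitting and stepwise steps themselves are not shown.

```python
def response_yards(play, next_play):
    """Yards credited to a situation: the next play after a spike, else the play itself."""
    return next_play["yards"] if play["is_spike"] else play["yards"]

def design_row(yard_line, seconds_left, after_spike):
    """Intercept, three main effects, and the three pairwise interactions."""
    spike = 1 if after_spike else 0
    return [
        1.0,                        # intercept
        yard_line,                  # field position
        seconds_left,               # time left in the half
        spike,                      # 1 if the play followed a spike
        yard_line * seconds_left,   # interaction: position x time
        yard_line * spike,          # interaction: position x spike
        seconds_left * spike,       # interaction: time x spike
    ]

print(design_row(40, 60, True))   # [1.0, 40, 60, 1, 2400, 40, 60]
```

Stepwise selection then drops terms from this full set. In my case, only the yard-line and spike terms survived.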

As you can see, the remaining variables are the yard-line and whether or not the play came after a spike. While the two variables don’t explain a lot of the variance in yardage, they are still significant. And, interestingly enough, the spike term is negative, which is to say that a team, on average, gains more field position-adjusted yards in the no-huddle than it does after a spike. The shortcoming of this analysis is that there haven’t been a lot of spikes to stop the clock that don’t come before setting up field goals. So, the magnitude of the coefficients is not very informative. However, it is nonetheless noteworthy that, even with a small sample, teams tend to do better when they catch the defense retreating than when the offense takes its time and is able to line up more precisely. Saving three or four seconds, then, may not be worth it when the alternative gives you both more effective plays and an extra down.

While Gronk should keep on spiking, maybe teams shouldn’t let the defense off the hook and stop the clock that often.

]]>