by Bill Lotter
It’s a simple question. Does the combine actually matter? Or is it just an over-hyped meat market? In answering these questions, there are two ways we could look at things: 1. Does performance at the combine relate to future performance in the NFL? 2. Even if the first answer is no, does performance at the combine correlate with where a player will be drafted? While it’s probably not too surprising that the answer to the second question is yes, believe it or not, the answer to the first question is yes as well. In a series of articles, I’ll show you why. And I’ll do so in the best, most unbiased way possible: using data and math. But don’t click away yet! There will be plenty of pretty graphs and I’ll save most of the nitty-gritty details for separate sections. Before I get into it, here is a list of the major points I’ll be making:
- From combine stats alone, you can predict how well a player will do in the NFL to a statistically significant degree. It’s not an exact science, but to be able to say anything about how good a player will be from 8 numbers, in such a complex game as football, is pretty amazing.
- The 40 yard dash is the most important drill at the combine. Really.
- The 40 isn’t the only thing that matters though. Weight and the 3 cone drill follow next in overall importance. In general, weight is more important than height.
- The bench press is highly overrated and is the least significant factor. That being said, it’s not completely useless. Besides DTs, the bench press is most important for CBs. And WRs can expect the biggest movement up in the draft by benching well.
- In terms of predicting success, the combine is most important for DE, OLB, and CB. It’s least important for WR and FS. WR was the only position for which the combine didn’t have significant predictive power.
- In terms of having an effect on draft position, the combine is most important for C, OLB, QB, and CB. It’s least important for ILB and WR.
- NFL teams do a remarkable job in determining the importance of the different events per position. The factors that lead to actual success are the same that will lead to one getting drafted higher. But, NFL teams aren’t perfect. They tend to overvalue the forty and the bench.
Alright, now let’s start to dig in. The goal is: given a player’s combine performance, predict how good he will be in the NFL, as well as where he will go in the draft. How do we quantify NFL success? Ideally we’d just have one number that combines all aspects of a player’s performance. The Approximate Value (AV), created by Pro Football Reference founder Doug Drinen, is a stat that attempts to do this. It consists of a formula for each position and is calculated on a yearly basis. The next question becomes, over what period should we calculate the AV? On one hand, you want to take into account the entire career of a player to fully reflect how valuable he was. But this leads to some obvious problems. First off, it would make it very hard to compare older players, who have had more years to accumulate AV, to younger players. Second, it can be counterproductive in terms of our main goal. We don’t want a journeyman who happens to bounce around different teams and accumulate AV over time to be deemed better than a star who happens to have a freak injury and retires after 3 years (e.g. Bo Jackson). What we really want to know from the combine is, “Does he have what it takes to make it in the NFL?” or “Will his freakish abilities actually translate to the field?” With this in mind, I propose that we look at the total AV over a player’s first three years. This is what we hope to be able to predict when looking at combine numbers.
Now, obviously looking at the first three years has some drawbacks. Particularly for QBs, you might argue that players need time to develop and you won’t know their value until later down the line. While this is a valid argument, the average career length is 3.3 years, and by the end of the third year you are generally able to say if a player has ‘made it’ or not. Extending the evaluation period only adds more noise, which you can’t predict, and gives you fewer and fewer players to look at. As a GM, you might as well stick with what you can predict and leave the freak injuries and career epiphanies to Vegas.
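For the curious, the three-year total described above can be sketched in a few lines of Python. This is only an illustration, not the script used for the analysis: the function name, the input layout, and the toy numbers are all made up. It also assumes that "first three years" means the three calendar seasons starting with the draft year, so a missed season simply contributes nothing.

```python
from collections import defaultdict


def three_year_av(seasons, draft_year, n_years=3):
    """Sum each player's AV over his first n_years seasons.

    seasons    : iterable of (player, season_year, av) tuples (hypothetical layout)
    draft_year : dict mapping player -> draft/combine year (hypothetical layout)

    Assumption: the window is draft_year .. draft_year + n_years - 1,
    so seasons a player missed count as zero.
    """
    totals = defaultdict(int)
    for player, year, av in seasons:
        start = draft_year.get(player)
        if start is None:
            continue  # no combine record for this player, skip him
        if start <= year < start + n_years:
            totals[player] += av
    # Players with no recorded seasons in the window get a total of 0.
    return {player: totals.get(player, 0) for player in draft_year}


# Toy data, invented purely for illustration.
seasons = [
    ("Player A", 2005, 4),
    ("Player A", 2006, 7),
    ("Player A", 2007, 9),
    ("Player A", 2008, 12),  # fourth year, outside the window
    ("Player B", 2005, 2),
    ("Player B", 2007, 3),   # missed 2006 entirely
]
draft_year = {"Player A": 2005, "Player B": 2005, "Player C": 2006}

print(three_year_av(seasons, draft_year))
# {'Player A': 20, 'Player B': 5, 'Player C': 0}
```

Counting from the draft year rather than from the first season played is a judgment call. It penalizes players who sit out a year, which fits the "did he make it?" spirit of the stat, but the other convention would be just as easy to code.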
Okay, all that to say that I’ll be trying to predict the 3 year Approximate Value (3YAV). I’ll look at combine data from 2000 to 2011, which I got from nflcombineresults.com. The first step in any data trek should be to look at the raw data. Below I show some scatter plots of the 3YAV vs. combine measurements, grouped by position. The first plot is for the forty yard dash for offensive guards. Each dot represents a player and the coordinate gives his forty time and subsequent 3YAV.
While there are tons of variation, it’s cool to see a pattern that faster guards tended to have more NFL success. Compare this to the bench press:
Any trend looks pretty minimal to me. When you think of a big, beefy OG, the last thing you probably think is that his speed is more important than his strength (although it’s very debatable whether max reps at 225 actually measures strength at all…). All this is very qualitative now, but we’ll quantify it later.
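As a rough first pass before any real modeling, one common way to put a single number on a trend like this is the Pearson correlation coefficient. The sketch below is not the model from this series, and the numbers are invented for illustration rather than taken from the actual combine data. The function and variable names are hypothetical too.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up forty times (seconds) and 3YAV values for five imaginary guards.
forty_times = [5.0, 5.1, 5.2, 5.3, 5.4]
three_yav = [20, 16, 15, 9, 6]

r = pearson_r(forty_times, three_yav)
print(round(r, 2))  # negative here: slower times go with lower 3YAV in the toy data
```

A value near -1 or +1 means a tight linear trend and a value near 0 means little to none. Keep in mind that a lower forty time is better, so a negative number is the "good" direction for that drill.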
Below are a bunch of other interesting relationships.
We’re just scratching the surface of what is in this data, but already, there are some interesting things. Tomorrow, I’ll show how we can build a quantitative model to answer our questions.
I know the AV comes from pro-football-reference dotcom but how did you get the 3YAV? I can’t imagine you manually tallied up the first three years’ AV.
Thanks.
I just downloaded the AV per year and then wrote a script to calculate it for each player in their first three years.
Cool, that makes more sense. R or Python (or something else)?
I used Matlab for this. But I generally work in either Python or Matlab.
Where do you get the year-by-year AV? I’m somehow missing that.
Same question as another poster. Where do you get the year-by-year AV? Since you looked at combine data from 2000 to 2011, I assume you then looked at the year-by-year AV for every person in that combine data set for their 1st three years. It appears you also looked at a lot of positions. That is a lot of players to look up by hand…I’m guessing 11 years * 100+ players per year for over 1100 players, each for 3 years, so 3300 data points. How did you gather the AV information?
I’m not sure this shows that the combine matters. Your post shows very clearly that performance (and height/weight) at the combine is predictive of success in the NFL, but for it to matter, it would have to provide added value to the teams, on top of the film and interviews. It may be (and some would say it’s likely – I’m not sure where I stand on it) that NFL teams would draft at least as effectively as they do without the combine.
I’m not sure this is something that you can prove, unfortunately. This piece of work goes as far as you can with it, and does it well.
Yeah I hear ya. I talked a bit along those lines in the second and third posts. Overall, it’s related to correlation vs. causation, which is always difficult to ascertain. One way you could possibly test it is by looking at people who were invited to the combine but skipped out on a bunch of the drills and see if teams were better/worse at drafting them. But most certainly, this group would be biased and the data would be limited as well. Another question is whether or not it should matter. It could contain extra predicting power, but teams might not be using it correctly.