by Bill Lotter
Yesterday, we saw that we can predict future NFL success from the combine. We compared the prediction accuracy of our model to the implicit predictions made by the draft. Today we ask a different question: can we use the combine to predict the draft itself? Given our previous model, are teams evaluating the combine correctly?
I’ll give you a spoiler again. Does the graph below look similar to anything? If not, go ahead and read yesterday’s article; we’ll wait.
Today’s model is nearly identical to yesterday’s, except that instead of predicting 3 Year Approximate Value (3YAV, our proxy for NFL success), we’ll predict a player’s actual pick in the draft. We will use a linear model, fit using ridge regression. The key concept is that we will have a coefficient for each combine measure and position, which says how that measure relates to a player’s draft pick. We will again be predicting a player’s percentile: this time, a player’s percentile in draft pick, instead of 3YAV. So, players drafted first overall will have the highest possible percentile, while undrafted players will have the lowest. Percentile is calculated separately for each position. While not a 100% apples-to-apples comparison, this formulation allows us to reasonably compare the two models.
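To make the target variable concrete, here is a minimal sketch of one way to compute a within-position draft percentile. The function name, the tuple layout, and the use of `None` for undrafted players are all assumptions made for illustration; the exact percentile formula used in the analysis is not spelled out above, so treat this as one reasonable reading rather than the actual implementation.

```python
from collections import defaultdict


def draft_percentiles(players):
    """Return {name: percentile} computed separately for each position.

    `players` is a list of (name, position, pick) tuples, where `pick` is the
    overall pick number, or None for an undrafted player (an assumed
    convention). A player's percentile is the share of the *other* players at
    his position who were picked later or not picked at all, scaled to 0-100.
    """
    by_position = defaultdict(list)
    for name, position, pick in players:
        by_position[position].append((name, pick))

    result = {}
    for group in by_position.values():
        n = len(group)
        for name, pick in group:
            if pick is None:
                # Undrafted players sit at the bottom: nobody ranks below them.
                worse = 0
            else:
                worse = sum(
                    1 for _, other in group if other is None or other > pick
                )
            # With a single player at a position the percentile is undefined;
            # 100.0 is an arbitrary placeholder for that edge case.
            result[name] = 100.0 * worse / (n - 1) if n > 1 else 100.0
    return result


# Made-up players, purely for illustration.
example = [
    ("QB A", "QB", 1),
    ("QB B", "QB", 50),
    ("QB C", "QB", None),
    ("WR A", "WR", 10),
    ("WR B", "WR", None),
]
print(draft_percentiles(example))
```

The earliest pick at each position gets the highest value and undrafted players get the lowest, matching the description above.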
The coefficients of the draft model are shown below, side-by-side with the coefficients for predicting 3YAV. A big difference between a pair of coefficients would suggest that NFL teams are, implicitly or explicitly, valuing this measurement differently than they should.
- The first thing that jumps out is the similarity in the betas for both predictions. Any differences are fairly small compared to the uncertainty in the estimates (as shown by the error bars).
- The biggest differences, although not statistically significant, are for OT: the draft has more emphasis on the bench press and less on the broad jump.
Offensive Skill Positions
Note: Again, the bench press isn’t used in the model for QB’s because not many QB’s do it.
- Again, notice the striking similarity. There are no statistically significant differences.
- The biggest differences in the group are for QB’s, where height is somewhat overvalued, and for WR’s, where the forty and bench are also valued more in the draft. Take this with a grain of salt given the error bars, but WR is the position with the highest draft bench coefficient. This could suggest that WR’s can expect the biggest jump in the draft by improving their bench.
Defensive Linemen and Linebackers
- Guess what? Similar again.
- Yup. Still similar.
At the individual coefficient level there are no statistically significant differences between the draft and 3YAV models. This is partly due to the high uncertainty in the estimates, but it’s still remarkable. Like yesterday, let’s summarize the importance of the combine in one plot.
The first plot is the importance for predicting the draft and the second is for predicting 3YAV. Can we say similar one more time? The forty and bench columns do seem a bit brighter for the draft, though. We can quantify this by summing up over the columns.
The differences in the overall importance for forty and bench, obtained by summing over positions, are actually statistically significant. Weight is the only factor that has a higher importance in the 3YAV model than the draft.
Here’s the plot of overall combine importance by position.
The importance as determined by the draft tends to be higher than that for 3YAV. I’ll explain why in the next section. The ordering across positions, however, is pretty similar. Positions for which the combine is important for predicting 3YAV tend to be the positions where the combine is important for predicting the draft as well.
For completeness, here is the graph showing which variables are statistically significant in predicting draft percentile. Notice that the forty is significant for all positions.
As before, we want to go beyond statistical significance and estimate the real impact of the combine on the draft. The largest coefficients have values of about 5 percentile points per standard deviation. Although it varies a bit by position and round, a 10-15 percentile jump is equivalent to about one round in the draft. So a CB who runs the forty a tenth of a second faster can expect to be chosen about half a round earlier. And that is for one measurement alone. For someone on the border between the first and second rounds, this could mean millions of dollars.
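The back-of-the-envelope conversion from coefficient to draft rounds can be sketched as below. The function name is made up for illustration. The one real assumption is that a tenth of a second in the forty is roughly one standard deviation for CBs; the numbers above imply that, but it is not stated explicitly.

```python
def rounds_gained(coef_pct_per_sd, improvement_in_sd, pct_per_round):
    """Convert a combine improvement into an approximate number of draft rounds.

    coef_pct_per_sd:   model coefficient, in percentile points per standard deviation
    improvement_in_sd: size of the improvement, in standard deviations
    pct_per_round:     percentile points that correspond to one draft round
    """
    return coef_pct_per_sd * improvement_in_sd / pct_per_round


# About 5 percentile points per SD, and 10-15 percentile points per round.
# Assumption: 0.1 s in the forty is about one SD for CBs.
low = rounds_gained(5.0, 1.0, 15.0)
high = rounds_gained(5.0, 1.0, 10.0)
print(f"roughly {low:.2f} to {high:.2f} rounds earlier")
```

Under those inputs the range works out to between a third and a half of a round, which is where the "about half a round" figure comes from.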
We will assess model accuracy by using the Spearman rank correlation again. This measure tells us how correlated the actual draft order is to the predicted draft order based on the model.
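For readers who have not seen it before, the Spearman rank correlation is just the ordinary Pearson correlation applied to ranks instead of raw values. Below is a minimal, dependency-free sketch; the function names are made up for illustration, and in practice a library routine such as `scipy.stats.spearmanr` would do the same job and also report a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation between two equal-length sequences."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up example: actual vs. predicted draft percentiles for five players.
actual = [90.0, 70.0, 50.0, 30.0, 10.0]
predicted = [80.0, 85.0, 40.0, 45.0, 5.0]
print(spearman(actual, predicted))
```

Because only the ordering matters, the measure does not care whether the predicted percentiles are on the right scale, only whether the model puts players in the right order.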
The model can significantly predict draft order for all positions except for WR, which is very close to significance (p = 0.06). This agrees with what we saw previously: we can’t really predict the success of WR’s on the field or in the draft using the combine alone. The model does the best at predicting the draft for CB’s, followed by OLB’s, TE’s, and SS’s.
Overall, the prediction accuracy for the draft model is higher than that for the 3YAV model. This means that it is easier to predict where players will be drafted than how they will actually fare in the NFL. We would expect this to be the case. Many different things can happen once a player reaches the NFL. Draft pick reflects expected success, whereas the 3YAV measures actual success, which has many sources of unpredictability. The overall higher prediction accuracy of the draft model can at least partly explain why the draft coefficients are often larger than those for 3YAV. The particular regression method we are using, ridge regression, tends to shrink the coefficients when the prediction power is lower and the data is noisier. (Note: this makes it even more significant that weight has a higher importance in the 3YAV model than in the draft model.)
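To see the shrinkage mechanically, here is a minimal sketch of ridge regression with a single centered predictor, where the fitted slope has the closed form `sum(x*y) / (sum(x*x) + penalty)`. The function name and the toy data are made up for illustration. How the penalty was chosen for the real models is not described here; if it is tuned by something like cross-validation (an assumption), noisier targets tend to favor a larger penalty, which is what produces the smaller coefficients.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form ridge slope for one centered predictor and no intercept.

    With penalty = 0 this is ordinary least squares; a larger penalty pulls
    the slope toward zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Toy data: a standardized combine measure vs. centered percentile,
# constructed so the unpenalized slope is exactly 5 points per unit.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-10.0, -5.0, 0.0, 5.0, 10.0]

for penalty in (0.0, 5.0, 10.0):
    print(penalty, ridge_slope(xs, ys, penalty))
```

The same underlying relationship yields a smaller fitted coefficient as the penalty grows, so a lower coefficient in the noisier 3YAV model does not by itself mean teams are overvaluing that measure.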
Yay, we did it. We can successfully predict future performance and draft pick from the combine alone. The comparable coefficients for the models suggest that NFL teams do a great job of incorporating combine data into their draft decisions. We need to be a bit careful about correlation versus causation, however. Teams might not be explicitly using the combine as our model does, but may be relying more on sources such as game film. It just might happen that the skills that are reflected in a player’s stellar combine performance are the same skills that led to success in his college playing days. So if a player happens to have a lucky day and perform uncharacteristically well at the combine, it doesn’t necessarily mean he will be drafted higher, because this luck won’t be reflected in his game film. And although the models are similar, there are some differences. Teams seem to be overvaluing the bench and forty to some extent, as well as undervaluing weight.
Whelp, this concludes the combine analysis for now. I hope you got something out of it and feel free to contact me at: lotter [at] fas [dot] harvard [dot] edu.
It was a really enjoyable series. Will you use the system to predict this year’s draft? An attempt at the first two rounds would be fun, and it’s likely most of those players worked out. Go ahead and pick WRs with a dart.
Thanks! Yeah I’m going to try to do something like that this weekend. At least say who quantitatively had the best combine both in terms of predicting future performance and draft pick.
This is a really interesting set of articles. In addition to a couple of rounds of a predicted draft, I would be interested in seeing a predicted top five or ten at each position.
Has anyone looked into reviewing the accuracy of the forty times that the NFL uses to evaluate these players? I was a bit surprised by some of the forty times coming in for this year’s running back class, especially since some of the simulcasts did not show the same result the times would indicate. Langford’s simulcast against Demarco Murray showed Langford finishing first, even though his reported time was slower. A couple of simulcasts of Melvin Gordon against Murray/Bell and Murray/Hill showed Gordon closer to Murray than his time indicated. A review of the forty splits from this year showed that this year’s class had slow first ten-yard splits, even though their times from 10 yards through 40 yards were actually fast. Looking into this, the forty times are started by hand timing, while lasers are triggered at the splits and the finish. I tend to trust laser timing and wonder what the accuracy really is for the NFL forty time.
Did you end up plugging the Combine Results into your formulas? I’d be very curious about the results if you’re willing to share. Thanks either way.
Great analysis on this. Read something similar from Jonathan Bales (Rotoworld) in the past. He found draft position contributed more than combine measurables, which intuitively makes sense (higher draft pick, team will give more opportunities in the immediate future). Does draft position factor into your 3YAV prediction, or is it derived just from combine and stats?