By Anthony Zonfrelli
Having previously studied the predictive value of NFL combine events on the draft position of offensive players, I now examine which combine tests NFL front offices tend to focus on when selecting defensive prospects. This analysis, coupled with Kevin Meers’s study of the predictive value of combine measurements on eventual Career Approximate Value (CAV), will help show whether teams emphasize the combine events that best translate into on-field performance. Like Kevin’s article, this study includes all players drafted from 1999 to 2010. Also like the first post, this study produced rather weak models: the most accurate one explains only 23% of the variation in draft selection. Still, the defensive models are more predictive than their offensive counterparts. Here they are:
Defensive End: The model for predicting DE production was one of the best-fitting models in Kevin’s article, containing transformations of weight, 40-yard dash, and 3-cone drill. Judging by this regression, NFL scouts seem to have figured out the keys to finding successful defensive ends (as far as the combine goes, at least).
Pick = 181.50*40-yard dash(seconds) -2.12*weight(lbs.) +64.80*cone drill(seconds) -667.54
This model does a fair job predicting draft pick, with an adjusted R^2 of 0.17, an MSE of 68.85, and respective p-values of 0.000, 0.000, 0.006, and 0.004. As obvious as it sounds, big, fast, and quick defensive ends tend to be the best ones, and NFL teams have figured that out.
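To make the coefficients concrete, here is a minimal sketch that plugs a prospect's measurements into the defensive end equation above. The coefficients are copied from the post; the function name and the example prospect (4.70 s, 270 lbs., 7.10 s) are made up for illustration and do not describe a real player.

```python
# Coefficients copied from the defensive end equation in the post.
DE_COEF = {"forty": 181.50, "weight": -2.12, "cone": 64.80}
DE_INTERCEPT = -667.54


def predict_de_pick(forty, weight, cone):
    """Return the model's predicted draft pick for a defensive end.

    forty and cone are times in seconds; weight is in pounds.
    """
    return (DE_COEF["forty"] * forty
            + DE_COEF["weight"] * weight
            + DE_COEF["cone"] * cone
            + DE_INTERCEPT)


# Hypothetical prospect (illustrative numbers, not a real player).
base = predict_de_pick(forty=4.70, weight=270, cone=7.10)
faster = predict_de_pick(forty=4.60, weight=270, cone=7.10)

print(round(base, 2))           # predicted pick for the baseline prospect
print(round(base - faster, 2))  # picks gained by running 0.10 s faster
```

For this made-up prospect the equation lands around pick 73, and a 40 time one tenth of a second faster moves the prediction up by roughly 18 picks with weight and cone time held fixed. Given the adjusted R^2 of 0.17, any single prediction like this should be read as a rough tendency rather than a forecast.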
Defensive Tackle: Weight, shuttle time, and bench press repetitions were each significant predictors of on-field success, but front offices tended to focus on only one of those three.
Pick = 121.84*40-yard dash -2.45*bench press(repetitions) -443.41
With p-values of 0.004, 0.010, and 0.040 respectively, an adjusted R^2 of 0.09, and an MSE of 67.88, this model is rather weak. This result makes sense because the model for finding CAV only showed a weak relationship between the combine and career outcomes. In an effort to properly evaluate defensive tackles by more than their strength, teams tend to differentiate them with what appears to be their default measurement of athleticism (the 40-yard dash) instead of the combination of weight and shuttle drill. NFL teams may want to focus more on their defensive tackles’ size and quickness rather than their speed.
Outside Linebacker: The model for outside linebacker success was relatively weak, but it found that 3-cone drill and 40-yard dash were significant. NFL teams picked up on those two predictors, but they also tended to favor weight and broad jump.
Pick = 123.84*40-yard dash -2.06*weight -2.53*broad jump(inches) +61.24*cone drill
The p-values for these are 0.016, 0.000, 0.012, and 0.014 respectively. This regression finds a relatively strong relationship, with an adjusted R^2 of 0.23 and an MSE of 58.94. While NFL teams tended to reward speed and quickness, as they should, they tended to overestimate the need for size and explosiveness through their favoritism of weight and the broad jump. It appears that, holding all else constant, NFL executives should prioritize agility over size and power in their outside linebackers.
Inside Linebacker: The CAV model for inside linebackers also did a poor job of predicting linebacker success, yet somehow it found the 40-yard dash to be significant, as did the bizarre model predicting draft pick.
Pick = -35304.72*(40-yard dash) +9873.35*(40-yard dash)^2 -775.74*(40-yard dash)^3 +133435.6
It isn’t the friendliest looking model, but there it is, with respective p-values of 0.024, 0.025, 0.025, and 0.023, an adjusted R^2 of 0.11, and an MSE of 60.48. Kevin said it best in his original post with, “These combine measurements simply do not do a good job of predicting performance for linebackers.” Effectively, the only thing that can be taken from this model is that there is much more that makes a successful linebacker than the combine is able to measure, though speed may have a small say in it.
Cornerback: The model for success at cornerback included the 40-yard dash, weight, and the 3-cone drill. In this particularly accurate model for draft pick (adjusted R^2 = 0.22, MSE = 54.78), scouts tended to see the benefits of these measurements, yet substituted explosiveness for quickness.
Pick = 256.10*40-yard dash -1.51*weight -1.87*broad jump -525.25
P-values here are 0.000, 0.001, 0.014, and 0.047. NFL teams appear to evaluate cornerbacks rather well, though there’s evidence that they should favor more agile corners over the explosive or powerful ones.
Free Safety: The model predicting the CAV of free safeties is quite poor, explaining only 4% of the variation in free safety success. Essentially, this tells us that the combine has no good way of evaluating free safeties for what they are really worth. This makes it interesting that certain combine events can explain about 10% of the variation in draft pick (adjusted R^2 = 0.10, MSE = 60.28).
Pick = 166.14*40-yard dash +80.75*cone drill -1202.97
With p-values of 0.041, 0.018, and 0.006, 40-yard dash and 3-cone drill times are significant predictors of draft pick. Since no metric indicating speed showed up in the model for free safety CAV, perhaps things that the combine can’t measure, such as instincts, awareness, and tackling ability, should get the emphasis that these speed and agility metrics previously received.
Strong Safety: In another odd model, strong safety draft pick was significantly predicted by 20-yard shuttle time with surprising strength (adjusted R^2 = 0.14, MSE = 66.16), rather than by weight and 40-yard dash time, as the CAV model suggests it should be.
Pick = 180.83*shuttle(seconds) -617.67
P-values of 0.003 and 0.013 make these terms significant. Whether they should be or not is a different question entirely. While NFL teams tend to look for quickness and agility in their strong safeties, they should focus more on speed. That said, the weak relationship between CAV and any combine events here suggests that there is probably no effective way to evaluate strong safeties from the combine.
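One way to compare how heavily teams weigh straight-line speed across positions is to convert each 40-yard dash coefficient into picks per tenth of a second. The sketch below does that for the linear models reported above. The comparison is only rough, since each model controls for a different set of other measurements, and the helper name is made up for illustration.

```python
# 40-yard dash coefficients (picks per second) copied from the linear
# position models in the post. Inside linebacker is omitted because its
# model is cubic in the 40 time, and strong safety because its model has
# no 40-yard dash term.
FORTY_COEF = {
    "DE": 181.50,
    "DT": 121.84,
    "OLB": 123.84,
    "CB": 256.10,
    "FS": 166.14,
}


def picks_per_tenth(coef_per_second):
    """Change in predicted pick for a 0.10 s slower 40, other terms fixed."""
    return coef_per_second * 0.10


effects = {pos: round(picks_per_tenth(c), 2) for pos, c in FORTY_COEF.items()}

# Sort from the position where a slow 40 costs the most picks to the least.
ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)

for pos, delta in ranked:
    print(f"{pos}: {delta} picks per 0.10 s")
```

By this measure cornerbacks pay the steepest price for a slow 40 (about 26 picks per tenth of a second) and defensive tackles the smallest (about 12).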
Conclusion
As Kevin found, most NFL combine measurements are poor indicators of future success in the NFL. The models were consistently pretty weak. The models for draft pick, although still weak, tended to fit better than those predicting CAV. This finding implies that NFL teams are placing an emphasis on certain combine events that they believe to be proper evaluations of prospective players. Our results indicate that front offices use the combine most effectively to assess defensive ends, but otherwise tend to reward players with better scores and times in events that don’t end up predicting future success.
Not all of it is their fault. The major takeaway from all of this combine analysis is that combine measurables just don’t have a substantial impact on either draft pick or career outcome, regardless of which side of the ball a player is on.
Again, for the second time YOU ARE DOING THIS WRONG.
I’m confused by your MSE interpretations – don’t you need to take the square root?
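For readers following the point about square roots, here is a small sketch of the difference between MSE and its square root (RMSE). The post does not say which of the two its reported "MSE" figures are, so this only illustrates the definitions. The picks below are made up and are not data from the study, and regression software often divides by the residual degrees of freedom rather than n, which this sketch ignores.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average squared residual (units: picks squared)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(r * r for r in residuals) / len(residuals)


def rmse(actual, predicted):
    """Root mean squared error: square root of MSE (units: picks)."""
    return math.sqrt(mse(actual, predicted))


# Made-up picks purely for illustration; not data from the study.
actual_picks = [10, 50, 120, 200]
predicted_picks = [40, 30, 150, 140]

print(mse(actual_picks, predicted_picks))   # in squared picks
print(rmse(actual_picks, predicted_picks))  # in picks, comparable to pick numbers
```

Only the RMSE is on the same scale as the draft pick itself, which is why a figure meant to be read as a "typical miss in picks" should be a root mean squared error.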
In a couple of your models you include the variable “weight.” You need to include an additional variable, weight_squared (lbs^2), in order to capture the true relationship. I say this only because it is logical to assume that there is an optimum value for weight, where weighing any less would lower the dependent variable and weighing any more would also lower the dependent variable (think of a parabola-like curve).
Simply ONLY including “weight” would likely produce inconsistent results.
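The suggestion above can be sketched as a least-squares fit with both a weight term and a weight-squared term. The data here are synthetic, generated from a known parabola so the fit can be checked, and say nothing about what the real draft data would show. Centering weight at 270 lbs. is an assumption made here for numerical stability, and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


CENTER = 270.0  # assumed centering weight in lbs., chosen for stability

# Synthetic data: pick = 60 + 0.1 * (weight - 275)^2, so the turning point
# is at 275 lbs. by construction. Not data from the study.
weights = [250.0, 260.0, 270.0, 280.0, 290.0]
picks = [60.0 + 0.1 * (w - 275.0) ** 2 for w in weights]

b0, b1, b2 = fit_quadratic([w - CENTER for w in weights], picks)

# Vertex of the fitted parabola, converted back to pounds.
turning_point = CENTER - b1 / (2.0 * b2)
print(round(turning_point, 2))
```

If the fitted weight-squared coefficient in a real model were indistinguishable from zero, the weight-only specification would be adequate over the observed range; if not, the vertex formula above gives the weight at which the predicted pick turns around.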