by Sam Waters
Last week, the Houston Texans signed running back Arian Foster to a five-year, $43.5 million deal. Foster, an undrafted free agent out of Tennessee in 2009, was toiling in obscurity heading into the 2010 season. Five months later, he was the NFL’s leading rusher and a representative of the AFC in the Pro Bowl. Foster’s surprising rise to prominence begs for an examination of how efficient NFL teams are at evaluating running back talent. Of course, it is easy to cherry-pick examples like Foster. Instead, we can take a more systematic approach to analyzing running back evaluation by looking at the relationship between pick number and running back production in the NFL draft.
To do this, we look at all running backs drafted from 1994 to 2003 and measure their production in terms of Defense-Adjusted Yards Above Replacement (DYAR). This study reveals that draft position is a statistically significant predictor of future performance; if it weren’t, there would be no advantage to holding higher draft picks as opposed to lower picks. We also find, however, that pick number explains only about 16.5 percent of the variation in performance based on DYAR (R^2 = 0.1651). This R-squared value shows that where a running back is drafted does very little to predict his future performance. In turn, since NFL teams choose where a player is drafted, the teams are doing a very poor job of predicting the players’ future performance.
Before looking at the draft data, it is important to understand how DYAR works and what it measures. DYAR is a stat developed by Football Outsiders that calculates the value of a running back relative to a theoretical “replacement player.” Recognizable examples of such replacement players include mediocre backs such as Lee Suggs, Ki-Jana Carter and DeShaun Foster. These players were essentially interchangeable assets over the course of their careers and added little value to their teams.
It is best to illustrate this concept with a simplified example. In 2011, Arian Foster amassed 278 DYAR. If Foster were not on the Texans, they would have had to give his opportunities to the “replacement level” players who would have supplanted him in the lineup. If, for instance, Foster can gain 500 yards in the same number of opportunities (in terms of touches, opponent, situation, etc.) that it takes these generic “replacements” to gain 400, Foster is 100 yards above replacement. Since a given player cannot control who his backups are, and is therefore not responsible for their quality, Football Outsiders generates a constant replacement level for all backs. It is also important to note that they adjust DYAR performance for both situation and opponent.
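The arithmetic behind "yards above replacement" can be sketched in a few lines. This is only a toy version of the idea, not Football Outsiders' actual DYAR calculation: it skips the opponent and situation adjustments, and the function name and the per-play baseline figures are hypothetical values made up for illustration.

```python
def yards_above_replacement(plays):
    """Sum of (actual yards - replacement baseline) over a list of plays.

    Each play is a tuple (actual_yards, baseline_yards), where baseline_yards
    is what a generic replacement back would be expected to gain in the same
    situation. The baselines used below are invented, not real data.
    """
    return sum(actual - baseline for actual, baseline in plays)


# The article's aggregate example: 500 yards where replacements gain 400.
season_total = yards_above_replacement([(500, 400)])

# A hypothetical play-by-play sample: a good run, a long run, a stuff, a short gain.
sample_plays = [(5, 4), (12, 3), (0, 2), (3, 4)]
sample_total = yards_above_replacement(sample_plays)

print(f"season example: {season_total} yards above replacement")
print(f"play-by-play sample: {sample_total} yards above replacement")
```

Note that the total can go negative: a back who gains less than the baseline on most of his touches ends up below replacement no matter how many carries he receives.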
This system also credits running backs for both quality and quantity of carries. A player who can handle a large workload with average efficiency is worth more than one of the same quality who cannot tolerate as many carries. On the other hand, if a player receives a huge workload and performs just as poorly as any generic replacement player would have, DYAR does not credit him with any positive contribution. William Green, a 2002 first round pick of the Browns, is a great example. Green amassed 568 carries over four years in Cleveland, gaining 2,109 yards for a meager 3.7 yards per carry average. Instead of getting credit for simply getting playing time, DYAR penalizes Green for performing so poorly in the opportunities he was given, putting him at 270 yards below replacement (the single worst total for any back drafted over the ten year period examined). In this way DYAR reflects more accurately the contributions of a running back to his team’s chances of winning.
Despite all of the benefits of DYAR, it is important to remember that DYAR is not a perfect stat. It does not account for pass blocking, indirect benefits to other players, or differences in offensive lines. For now though, DYAR is one of the most accurate representations we have of running back performance, and is more than serviceable in comparing the careers of different players. Now that we understand DYAR we can look at the graph of DYAR and pick number:
A logarithmic model is the best fit for the data:
Expected DYAR = 1196.40 – 217.57 * ln(Pick)
(305.45)  (64.04)
R^2=0.1651, Root MSE= 492.44 (robust standard errors in parentheses above)
This is a descriptive model that tells us the value we can expect from a given running back based on where he was selected in the draft. Essentially, it tells us the price (in terms of pick number) that the market sets for any level of running back. If we take DYAR to be an accurate measure of running back performance, then pick number explains only 16.51% of the variation in NFL running back performance. This finding indicates that even though a higher pick is likely to perform better than a lower pick, in general NFL teams do a very poor job of evaluating running backs at the draft and predicting future performance.
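To get a feel for what the fitted curve implies, here is a small sketch that plugs pick numbers into the equation above. The coefficients are copied from the model as reported; the function names and the choice of sample picks are my own, and the output is only the curve's point estimate, which the large root MSE (492.44) says individual backs will often miss by a wide margin.

```python
import math

# Coefficients copied from the fitted model reported in the article.
INTERCEPT = 1196.40
SLOPE = -217.57


def expected_dyar(pick):
    """Point estimate of DYAR for a back taken at overall pick number `pick`."""
    if pick < 1:
        raise ValueError("pick must be 1 or greater")
    return INTERCEPT + SLOPE * math.log(pick)


def break_even_pick():
    """Pick number at which the fitted curve crosses zero (replacement level)."""
    return math.exp(INTERCEPT / -SLOPE)


# Sample picks chosen arbitrarily for illustration.
for pick in (1, 10, 32, 100, 200):
    print(f"pick {pick:>3}: {expected_dyar(pick):8.1f} expected DYAR")

print(f"curve reaches replacement level near pick {break_even_pick():.0f}")
```

Because the model is logarithmic, the expected drop from one pick to the next is steepest at the very top of the draft and flattens out in the later rounds.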
When confronted with this sentiment, a skeptic might point to high draft picks like Marshall Faulk, LaDainian Tomlinson, or Edgerrin James who achieved overwhelming success in the NFL. For every LT, however, there is a Ki-Jana Carter, a Curtis Enis, or a Lawrence Phillips (all major first-round busts). In fact, the careers of solid players like Rudi Johnson (614 DYAR) and Domanick Davis (592 DYAR) are more representative of what we can expect from a first-round running back than the stars we have come to anticipate. The average DYAR is actually only 614 for first-round running backs. Even more surprisingly, a third of the backs drafted in the first round are at or around replacement level, generating essentially no value for their teams. These figures for the top running backs suggest that a team should wait to select a running back, but a look at the round-by-round data shows that this is not necessarily the case:
While the proportion of replacement level players in the first round seems large on its own, when we see that a whopping two thirds fall into this category over the entire draft, the first round no longer looks so bad. Sixty-eight percent of running backs drafted are approximately replacement level. Even in the early to middle rounds (rounds 2-4), about two thirds of backs fall into this replacement range. By the time we get to round seven, a staggering 87% of backs are replacement level. Relatively few players manage to separate themselves from the pack and generate value for their employers over the course of their careers.
This produces an interesting juxtaposition. The percentage of replacement level players from round two to round five is very similar, with few managing to differentiate themselves, but the average DYAR is clearly higher for the earlier rounds. These two trends indicate that a similar number of players “make it” in these middle rounds, but that the earlier picks who do “make it” have more upside. All in all, most backs are interchangeable, but the ones with higher pedigrees appear to have a higher ceiling.
So far we have established the approximate price teams pay in terms of picks for a running back at a given level of performance. We have also found that teams do a poor job of predicting future performance when drafting running backs and that the majority of running backs are largely interchangeable. These facts are interesting, but on their own they do not tell us where we should optimally select a running back in the draft, or where the inefficiencies in this market arise. In order to figure out the best spot to select a running back, we need to compare the expected value of each draft slot to the salary assigned to each draft slot. We will examine this issue in a separate installment in the near future.
*All DYAR statistics are from Football Outsiders
I’m not sure that I agree with your conclusions given this data. The data clearly shows that DYAR decreases and the percentage of replacement level players increases as the round number increases, and does so very predictably (.97 and .91 R^2 respectively with a logarithmic regression using only the round number of the pick). So teams are clearly evaluating something right.
I also think that your claim that most running backs are interchangeable is unmerited given the data. It only shows that 68% performed around replacement level. It does not explain why they did so. Some weren’t good enough to take their game to the next level. They were indistinguishable from their peers at the college level, but it was their absolute physical peak. Some were injured and never played again. Some were injured and played at replacement level (like Ki-Jana Carter and Curtis Enis, who had major knee surgery in their rookie seasons). Some had major personal problems (Lawrence Phillips, who was considered a high-risk, high-reward pick at the time and who is currently serving a 31-year prison sentence). So there is a wide array of talent represented in those “interchangeable” running backs, who failed to perform for predictable (but considered worth the risk) and unpredictable reasons.
So it’s possible that every pick was a good one considering the information available at the time. It may be the case that (and I’m just making these numbers up) one-third of picks are good picks and produce at the pro level, another third are good picks and fail to perform due to an unexpected injury or a known issue considered worth the risk, and another third were good picks who were similar to their peers in college but could not perform at the pro level because they were performing at their absolute peak in college.
Hey Pete,
The reason that you got such high r-squared values when you looked at the relationship between round number and DYAR/percent at replacement level is that you are using the average totals for each round. This artificially smooths out the data and gives you an artificially high r-squared that doesn’t say anything useful about the relationship between the variables being examined. Saying that round number explains 90% of variation in *average* DYAR is completely different from saying that round number explains 90% of variation in DYAR.
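A quick simulation shows the averaging effect. Everything here is synthetic: the seed, the noise level, the 30 players per round, and the made-up relationship between round and value are assumptions chosen for illustration, not the draft data from the article. The sketch fits the same log-of-round regression twice, once on individual players and once on the seven round averages.

```python
import math
import random


def r_squared(xs, ys):
    """R^2 of a simple least-squares line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Synthetic players: a weak log trend by round buried in large player-level noise.
rng = random.Random(0)
rounds, values = [], []
for rnd in range(1, 8):
    for _ in range(30):  # assumed 30 players per round
        rounds.append(rnd)
        values.append(700 - 300 * math.log(rnd) + rng.gauss(0, 500))

# Regression on individual players.
individual_r2 = r_squared([math.log(r) for r in rounds], values)

# Regression on the seven per-round averages.
round_means = []
for rnd in range(1, 8):
    in_round = [v for r, v in zip(rounds, values) if r == rnd]
    round_means.append(sum(in_round) / len(in_round))
averaged_r2 = r_squared([math.log(r) for r in range(1, 8)], round_means)

print(f"R^2 on individual players: {individual_r2:.3f}")
print(f"R^2 on round averages:     {averaged_r2:.3f}")
```

With equal group sizes the two fits share the same line, but averaging throws away all of the within-round variation, so the R^2 on averages can never be lower than the R^2 on individuals and is usually far higher.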
As for the issue of running backs being interchangeable, the model is a descriptive one. It describes what NFL teams can be expected to get in terms of production at each pick based on what they have gotten in the past. Whether a player fails to produce above replacement level because of injury, attitude, legal troubles or for any other reason independent of skill, he is still failing to contribute value to his team. For a team, it doesn’t matter if internal or external factors were the cause of the lack of production. Just as injuries caused players to be replaceable in the past, a certain percentage of players in the future will fail to be productive because of injuries. Let’s say 68% of players were replaceable in the past, but only 50% were replaceable in terms of skill and 18% as a result of injuries. It’s very unlikely that in the future we will suddenly go to 50% replaceable because of skill and 0% replaceable because of injury. Barring miraculous advances in medical technology, injuries will continue to prevent a certain number of backs from producing for the foreseeable future. So this factor should not affect the percentage of “replacement level” players drafted in the future. Replacement level career production does not necessarily describe a running back with marginal skill. It describes any player whose production is at a certain level of efficiency for whatever reason, be it talent, injury, or mental makeup.
I’m a big fan of advanced statistics and I think what you have is very interesting. It’s something I suspected for many positions in the NFL. What I think is key is that you identified draft position as accounting for only 16.5% of the variation among backs. The data would lead me to believe that in the 1st round more talent is available, but it is also easier to spot. In the later rounds, there are still many highly talented people left, but for whatever reason they are not so recognizable. It shows that drafting is still part art form, part science, and we still need to value a good scout. I look forward to seeing your next article, though, tying it to salary. Then all we would need is the same thing for each position and it would give the optimal strategy for any team on draft day.
I thought people from Harvard would write better.
How about looking at the performance and the % allocation of salary cap to the OLine? I would imagine there is a better correlation between a RB’s success and the allocation of capital to the offensive line. Of course, that makes a dangerous assumption that NFL teams are capable of effectively evaluating the value of OL by salary, and apparently they aren’t as good at evaluating talent as the public may think. However, such teams will also be more committed to focusing on the run.