By Kurt Bullard

Now that it’s late July, football fans everywhere are looking forward to the opening of training camp, preseason, and the not-so-distant start of the regular season. Even though the so-called football experts struggle to forecast the outcomes of the 16-game season, I’ll try to put together a prediction model for the NFL season using a more quantitative method than the likes of Trent Dilfer.

The biggest challenge, obviously, is to come up with a sound way to estimate team strength, a demanding endeavor considering the amount of personnel turnover each offseason and the lack of advanced statistics to evaluate player interactions. The method that I came up with uses Pro Football Reference’s Approximate Value statistic, the site’s best attempt at teasing out individual talent. Then, using ESPN’s NFL depth charts, I aggregated each team’s per-game Approximate Value (AV) for what I considered to be the “core” makeup of an NFL team: QB, RB, 2 WR, TE, the top 2 OL, the top 4 “Front Seven” defensive players, and the top 2 players from the secondary.

There were some exceptions to simply using last year’s AV. If a team had a starter who was injured or suspended for the majority of last year (e.g. Adrian Peterson), I used the player’s 2013 AV. And, if ESPN listed a rookie as a starter, I took the AV of the backup, with the reasoning that, if the rookie ends up starting, he should perform at least as well as the person who is backing him up. So, I used the per-game AV of Josh McCown as a substitute for Jameis Winston in my model, since predicting rookie performance is another battle of its own. This will inflate the odds for teams that plan to stick with a struggling rookie through thick and thin, and hurt teams that find a phenom rookie.

To make sure this was a sound method, I tested it out on last year’s data and ran a regression to see if AV was predictive of the end-of-regular season Elo ratings as reported by FiveThirtyEight. Aggregated AV was indeed significant with a T-stat of 8.57. It was also a strong predictor of Elo, as the regression returned a .72 R-Squared value.
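As a rough illustration of the check described above, here is a minimal sketch of a one-predictor least-squares fit that reports the slope, R-squared, and the slope's t-statistic. The function name `simple_ols` and the toy numbers are assumptions for illustration only; they are not the real AV or Elo values, and this is not necessarily how the original regression was run.

```python
import math

def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give R-squared.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "t_stat": slope / se_slope,
    }

# Made-up toy inputs (NOT real team data), just to exercise the function.
toy_av = [1.0, 2.0, 3.0, 4.0]
toy_elo = [2.0, 4.0, 6.0, 9.0]
fit = simple_ols(toy_av, toy_elo)
print(fit)
```

With a single predictor, the t-statistic and R-squared are tied together by t² = R²(n − 2)/(1 − R²). Assuming all 32 teams were in the regression, an R-squared of .72 corresponds to a t-statistic of about 8.8, in line with the figure reported above once rounding is allowed for.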

This model does not account for aging, but I make the assumption that in aggregating these AV totals, the positive and negative effects of aging on an individual will, for a team, net out to around zero. So this model favors aging teams and may hinder up-and-coming teams.

I then converted the aggregated AV for each team into an Elo rating so that I could later use that value to calculate the win probability of each team in each game this season.

With the mean Elo rating set at 1500, I set the possible range of Elo values between 1320 and 1680, two standard deviations on either side of the mean, since the standard deviation of Elo ratings has traditionally been 90 points. So, the Raiders, who had the lowest AV aggregate (76.34), were set to 1320, while the Seahawks (166.19) were set to 1680. The rest of the teams were set on the scale based on the following formula: 1320 + (360/(166.19-76.34))*(AV-76.34). Some familiar teams fall to the bottom, while the Super Bowl favorites Packers and Seahawks floated their way to the top.
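The rescaling can be sketched as a linear map that sends the lowest AV aggregate to 1320 and the highest to 1680, which means subtracting the minimum AV before scaling. The function name `av_to_elo` and its parameter names are assumptions for illustration; the endpoint values come from the paragraph above.

```python
def av_to_elo(av, av_min=76.34, av_max=166.19, elo_min=1320.0, elo_max=1680.0):
    """Linearly map an aggregated AV onto the chosen Elo range."""
    # Fraction of the way from the weakest to the strongest team (0.0 to 1.0).
    fraction = (av - av_min) / (av_max - av_min)
    return elo_min + fraction * (elo_max - elo_min)

print(av_to_elo(76.34))   # lowest aggregate (Raiders) maps to the bottom of the range
print(av_to_elo(166.19))  # highest aggregate (Seahawks) maps to the top of the range
```

A team exactly halfway between the two extremes lands on the league mean of 1500 under this mapping.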

But Elo ratings don’t paint the whole picture, as teams who finish with worse records the previous year tend to benefit from easier schedules. I therefore ran a Monte Carlo simulation of each team’s season, calculating win probabilities based on the Elo ratings using the following formula: 1/(10^((Opponent Elo - Elo)/400) + 1). Using Benjamin Morris’ conversion table from wins to playoff odds, I then calculated the odds that a team would make the playoffs for the upcoming year. I then normalized the odds so that an average of 6 teams would make the playoffs from each conference every year.
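A minimal sketch of that simulation step might look like the following. The function names, the number of simulations, and the example schedule are assumptions for illustration. The sketch ignores home-field advantage and ties, and it takes the wins-to-playoff-odds table as an input because the actual table is not reproduced here.

```python
import random

def win_probability(elo, opponent_elo):
    """Elo win probability: 1 / (10^((opponent - own) / 400) + 1)."""
    return 1.0 / (10 ** ((opponent_elo - elo) / 400.0) + 1.0)

def simulate_win_totals(team_elo, opponent_elos, n_sims=10000, seed=0):
    """Return a list where entry w is the simulated probability of exactly w wins."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    probs = [win_probability(team_elo, opp) for opp in opponent_elos]
    counts = [0] * (len(probs) + 1)
    for _ in range(n_sims):
        # Each game is an independent coin flip weighted by the Elo win probability.
        wins = sum(1 for p in probs if rng.random() < p)
        counts[wins] += 1
    return [c / n_sims for c in counts]

def playoff_odds(win_distribution, odds_by_wins):
    """Weight a wins-to-playoff-odds table (supplied by the caller) by the win distribution."""
    return sum(p * odds_by_wins[w] for w, p in enumerate(win_distribution))

# Hypothetical schedule: sixteen games against league-average (1500) opponents.
distribution = simulate_win_totals(1680.0, [1500.0] * 16)
expected_wins = sum(w * p for w, p in enumerate(distribution))
print(round(expected_wins, 2))
```

Passing the conversion table in as a parameter keeps the sketch honest about what it does not know: any list of 17 probabilities, indexed by win total from 0 to 16, can be supplied to `playoff_odds`.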

To very little surprise, the Seahawks have the best odds of making the playoffs after having made the Super Bowl the past two years. The Dolphins may finally dethrone the Patriots in the AFC East, boasting the third-highest probability of playing into January, fifteen percentage points above defending-champion New England.

The much-touted Bills revamping, on the other hand, may not have the desired impact. The team only has a 40% chance of making the playoffs – the lowest in the AFC East.

In a similar vein, a potential 49ers collapse this year may not be the sole product of Harbaugh leaving, as the outflux of talent brings the 49ers to a one-in-seven shot of making the playoffs. The Raiders, Titans, and Jaguars seem destined for mediocrity again.

We’ll see who beats the odds this autumn. I’d just prefer to know sooner rather than later.

Could you explain where you get such math formulas? I don't quite follow the math. I find it surprising that you put Falcons at a 51% chance to go to the playoffs and the Ravens a 9% chance. Also isn't it strange how the AFC East teams are ranked really high but the NFC East teams are ranked pretty low. In fact every team from AFC East has a higher chance of making the playoffs than every team from NFC East. If 1 team from each division qualifies for playoffs then clearly something is a bit off.

I do also agree that most of those numbers fall straight out of the sky. But, as usual, the field will prove the maths wrong. This study sees the Dolphins battling with the Chiefs for the AFC Super Bowl spot, while it has been historically proven that in football, the player who recently got the biggest paycheck before the beginning of the season tends to tank his stats. Basing the Elo on the previous year definitely does no justice to a few teams (my Chargers dealt with the Pats, Seahawks and the Broncos twice this past year, for example), while I would think that basing it on even five years would be more accurate.

The Ravens were extremely close to beating the SB champion Pats in the playoffs last season, and the team has only gotten younger (more athletic) and better. The pass defense should be MUCH improved because of the return of Jimmy Smith and addition of Kendrick Lewis at safety.

The problem with using math at this point in the (off)season, is that we have a whole draft full of rookies that can’t be adjusted for in the statistics. They are statistical wild cards. This means that teams that typically draft well and scout well by adding valuable veterans who may have been in bad situations are discounted far more than others.

Math is very powerful but the unpredictable human variable in sports (bad calls, injuries, poor years, bad plays, choking at the wrong time, etc) makes it almost impossible for these types of analyses to hold much weight.

I think the big factor that’s being neglected is schedule. Miami has an easy schedule where the tougher teams are backloaded. It’s realistic to think that Miami might end up going 8-2 going into Week 12, then back their way into the playoffs. Remember, this is based on odds of making the playoffs, not doing well in them.

Interesting stuff, and I agree with the thread so far that the Ravens are way too low. I would also bet that the RAMS have better odds with their defense and now Nick Foles and Todd Gurley in the mix. Also the Vikings – I think you are not calculating a key factor in that they are now in year 2 of an entirely different offensive and defensive scheme, they also sustained a lot of injuries last year and have improved their roster on both offense and defense. Too many factors not included – mainly coaches, coaching staff, schemes and systems that the players will have to adapt to or be more familiar with in 2015.

“teams who finish with worse records the previous year tend to benefit from easier schedules”

Not true. Only two games each year have anything to do with a team’s previous season record – and those two games are against teams that finished in the same position in another division. Under the current system, opponents for 14 of the 16 games in a season can be predicted years in advance.

What you said was once true, but it hasn’t been true since 2002, when the current scheduling system was adopted.

That’s not entirely true either. The previous season’s record still affects which teams in each division you play at home and which on the road.

I acknowledge the fact that this is pure statistics, but the beauty of football is that statistics are not quite as accurate as they seem. You say you used PFF's method. Well, as a matter of fact, ESPN released an article on how many players each team needed to reach the Super Bowl ACCORDING to PFF. The Ravens needed the second-FEWEST players, with only two. You have no way of proving how rookies will play. So of course you did something that seems very mediocre for an article like this one (about stats), which is that you put veterans ahead of the rookies. In the case of my Ravens, do you expect Marlon Brown to perform as well as our first-round pick Perriman? The Ravens have been, besides the Patriots, the most consistent team in the past 7 seasons (since the Joe Flacco era began). Out of seven seasons they made the playoffs 6 times; that is roughly 86% of the time. As a loyal fan of my team, I find you saying that we have a 9% chance of making the playoffs, with inaccurate stats, very disrespectful. So if you know nothing about the rookies, which virtually you can't, then you can't write an article like this. That's the beauty of football, the MYSTERY.

Looks like this method will always over value those teams with a crop of elite players but with a poor supporting cast, and undervalue teams that may not have so many superstars but have a quality cast of backups and defensive players across the field outside of those accounted for.

That’s why Miami has gone in ahead of the Pats: their top 12 players are great, but the rest of the roster (not really accounted for here) is a little way behind, whereas I’d put the Pats as having a better 53-man roster, all things considered. Maybe Miami’s top 12 players are better than NE’s top 12 players, but that doesn’t win you all the games.

Really interesting work, and it doesn’t look too far off so I think you are really onto something. But I think this needs some refinement.

My two cents: You need to account for every position on the field. You can’t just ignore things like over half the defense and 60% of the offensive line. On many teams, the #3 WR is much more important than the TE. And an outlier at K will drastically change the fortunes for a team.

The key is to give the proper weighting to each player on the roster, and that probably should include “coach”. For example, a couple of the places where your odds diverge from the betting markets are that you have Miami and KC with higher odds in the AFC, and Denver and NE with slightly lower odds. I suspect this is because your formula isn’t valuing the QB position enough. In the modern NFL, the QB position is probably as important as all the other offensive positions combined. Also, NE consistently outperforms its “talent on paper” because Belichick is such a good coach.

While I enjoy this as a quantitative exercise, it’s a problematic analysis on both football and statistics levels.

First, the study equates (not associates) team success with a pool of player talent. We can intuit that “dream team” organizations would rate very highly in this rubric, but that’s questionable in the real world. For example, the deeply flawed Dallas teams of the last 10 years featured some top-end talent, but no depth. While this obviously has a very profound influence between the lines towards the end of the season, it doesn’t appear to be accounted for in the math. As mentioned by other posters, variables like cohesion, coaching, and organizational culture are ignored. These are highly relevant–just ask Dan Snyder. To be bluntly logical, the structure of the study represents the fallacy of composition–the quality of the parts determines the quality of the whole. Composition is always involved in a generalized analysis, but we have to keep it in the foreground.

As a researcher, I think in terms of questions, and there seems to be some ambiguity here. The question the model is *intended* to address is “what is each team’s likelihood of reaching the playoffs?”, but as I read it, the question *actually* being answered is different: “which team has the best collection of core talent according to PFR’s AV metric?” This is important, because the independent variable here is not talent–it is someone else’s assessment of talent. Without validation, we’re really testing the AV metric, not the teams.

I was pleased to see that Kurt validated the model against last season, but I submit that a good-looking regression can be deceptive. What does that high R^2 actually mean? It means that teams with a better talent pool were more likely to reach the playoffs. That’s true, but trivially so–it circles back to the independence problem mentioned above. Shouldn’t the question be “to what degree does the AV pool associate with success?” Don’t we have to answer that question before we can validly progress to the predictive question?

I don’t want to knock the project too much, because I think it is an interesting start. The AV variable looks like a good way to develop a talent index, which could then be used in a multiple regression or PCA/factor analysis that would provide a more robust prediction.

Anyway, interesting read–keep statsing!

🙂

I’ve been a loyal and faithful Dolphins fan since 1970. If I get cut, I bleed aquamarine. Statistical analysis of a given team’s players’ abilities is a nice starting point. However, I want to address the comment(s) made by Gus Abbott. COACHING is a huge part of a team’s win/loss record. It plays an even BIGGER part when a team makes the playoffs. Joe Philbin is a LOSER. I gave him PLENTY of chances to impress me with his coaching abilities, and MORE IMPORTANT, his Coaching DECISIONS. Here are Ryan Tannehill’s SITUATIONAL Stats: http://www.nfl.com/player/ryantannehill/2532956/situationalstats There is a SUBSTANTIAL difference in Tannehill’s QB rating between his opponent’s 49-yard line and their 20-yard line AND elsewhere on the field: 77.8 Rating versus 81.4, 94.8, & 101.8. He’s had 13 sacks & 4 interceptions between the opponent’s 49- and 20-yard lines. This is all due to Philbin’s insistence that Tannehill remain in the pocket. ALL THE TIME. Well….this will be Tannehill’s 4th season at QB. He came out of college with enough mobility that one could call him a semi-scrambler. Now I DON’T advocate Tannehill become a running QB. But a GOOD/GREAT Coach will utilize a given player’s natural abilities. The times when the Dolphins have to overcome LONG YARDAGE for ball control are the times when Tannehill gets sacked or gets picked off. I’ve SEEN Tannehill on set designed runs/scrambles. That’s a great way to slow up the pass rush, give confidence to a young QB, and keep the defense honest AND guessing. He’s EXCELLED in this role. The very few times I’ve seen this, I’ve also seen Tannehill SHRED the pass defense LATER on other drives. Such is the EFFECT of his mobility. I don’t understand why Philbin REFUSES to incorporate such plays on a regular basis. His STUBBORNNESS makes him an INEPT HACK. I’m excited about the Dolphins’ acquisition of Ndamukong Suh and the other DTs they drafted.
I’m a firm believer that the Dolphins’ Defense should and will be better what with their line and secondary. But in important games, the defense will get tired out way too quickly. Most of that will be because Tannehill will be forced into high risk situations, get repeatedly sacked or intercepted. That will bring the ‘Fins defense back out on the field. Meanwhile, Philbin will be scratching his head, CLUELESS as how to rectify Tannehill’s errors.

The Cardinals must have been doing it with smoke and mirrors, because I think this year they have the best roster they’ve had since moving to Arizona, yet they only show a 29% chance. I guess we’ll see how predictive this formula is. Expect Mathieu, Peterson, Cooper, and Weatherspoon to be much improved after recovering from injuries. The O-line and running backs will be much improved, which will help the passing game and play action, especially with a healthy Palmer. I expect Brown will build on his rookie season and Floyd and Arrington to bounce back. The defense lost Cromartie, but should be improved overall.

Really appreciate this article.

First of all, when an article’s title is “A Way Too Early Prediction of the NFL Season” you need to take it all with a grain of salt. I think it is assumed that someone of the author’s intelligence understands that this game isn’t simply a mathematical model. Don’t get pissed if his model doesn’t support what you personally believe.

This is fantastic. The people replying negatively to this thread I think are just looking at the charts and not looking at the formula behind it. Please continue to post stuff like this. I’ve grown tired of the NFL reporters blind speculation.

The guy generated flawed odds by normalizing them. He was lazy and not only didn’t account for conferences, but didn’t account for divisions, and wildcards. The numbers are completely worthless. Probabilities that ignore structural restrictions aren’t even really probabilities at all. At least not probabilities with any substance to them.

This model fails to account for the fact that only 6 teams make the playoffs from each conference. If each 100% adds up to 1 playoff team, you have 6.3 playoff teams from the AFC, and only 5.7 from the NFC.

The final table does nothing to take Divisions or Conferences into account. The table suggests that based on probability the AFC will somehow have 8 of the 12 playoff teams

Not quite, but they didn’t factor conference into account. If you factor in all the probabilities, you’d have 6.4 AFC teams and 5.6 NFC teams.

This is flawed as it fails to take into account coaching/staff changes and scheme changes. You need to factor in scheme…sometimes even at the pro level effective schemes and adaptability of coaching staff to adjust gameplans plays a huge factor in outcomes

How is this called a "model" and "simulation"…? All it's doing is crunching team schedules with relative strength variables (that the author had no part in developing). It doesn't simulate actual gameplay, it's just a fancy ranking system. The sophistication is the equivalent of adding up all the bullets one army has and all the bullets another army has and saying the one with the most wins. It assumes that the bullets are the best way to determine the chances of winning, and all that is required to win is to have more bullets, regardless of how they're used. So a top defensive line against an otherwise stronger offense but with the OL as its weakest point can easily turn the tide of a game, which this in no way will account for. Even if it's statistically significant (of course it is, the people that developed AV and Elo wouldn't have published it if it weren't), it is still only giving you a correct ranking 72% of the time… so 9 of 32 places are likely to be incorrectly predicted.

I think you’re being unnecessarily harsh. This is an undergraduate project, and a fairly sophisticated one for anyone outside of a rigorous mathematics department. His two-step methodology is very clear; both ‘model’ and ‘simulation’ are appropriately used. I agree that there are limitations (as I pointed out in my earlier post), but any model is an abstraction. I’m confident that you don’t invent your own inferential tests, but use tools developed by others (T-tests, ANOVA, chi-square, etc.). So it’s senseless to call him on the carpet for using existing analytical tools like Elo or AV.

In the end, analysts analyze. That’s what the author is doing, and he deserves credit for creating an analysis and putting it out for public consumption/comment.

Well Said, Eric.

It’s really sad that people can’t just appreciate this for what it is: just a tool for analyzing distinct data sets.

Author,

It would be very interesting to apply the models to data available at the end of 2013 to yield its prediction of 2014 odds. Can you please do this?

I was going to request the same thing to validate how close the model is…. use 2012 data and project 2013 and so on…

Three rounds should do the trick…

The math all looks pretty sound. I think the big problem with this model is the assumption about aging players and progressing rookies netting to zero for each team. Consistently good teams in the NFL (the Patriots, Green Bay, Pittsburgh, Baltimore, Seattle, NY Giants, just to name a few) are very good at assessing the degradation of aging players, deciding who to sign and who not to, avoiding salary cap trouble, and proactively replacing starters through the draft and free agency a year or two early so the replacements can progress. By following the assumptions made in this model, we assume that all teams are equal at these things, and that is simply not true; it is really the basis for which teams have success in the NFL. The model also neglects the coach’s influence on team success, and anyone who knows football knows that the coach is more important than any single player or position. In addition, seeing some teams with no franchise quarterback high on the list leads me to believe that the QB position was very undervalued when creating the formula.

The current limitations of your model are that it underrates players who played through injuries last year, fails to account for the quality of coaching staffs and difficulty of schedule, and completely ignores the impact of rookies.

This is a fascinating exercise. Attempting to calculate the likelihood of making the playoffs is certainly complex. Each year, roughly half of the playoff teams turn over, with injuries, free agency changes and coaching changes.

Who will rise? Who will fall? Teams that have the most roster depth can overcome injury or free agency the easiest.

This would be fun to revisit at each quarter mark of the season, don’t you think?

Why did you choose Profootballref’s player evaluation over Pro Football Focus’s? Pro Football Focus watches the film as well as looks at the stats. It seems like a more accurate player evaluation and it includes offensive line and all defensive players. They also have player scores based on formations so if a player signs via free agency into a new defensive scheme, you can check their grade in that scheme.

This is brilliant analysis that requires a minor correction to account for how NFL teams qualify for the post-season playoffs.

As a baseline, each of the eight NFL divisions has four teams. Every team in the NFL technically has a 25% probability of winning their division championship to advance into the playoffs.

Ultimately, the odds of making the playoffs comes down to the probability of having the best record in their respective division OR having the first or second best record of non-division champions in their respective conferences.

Overall, great work!

There is a calculation missing: each team’s play within its own division. You cannot have a probability calculation in which the most likely winner of a division has only a 42% chance of making the playoffs, as in the case of the NFC East. What you are implying here is that, more than likely, one or more teams in the conference will have a better record in another division but will not qualify for the playoffs. This may be true, but the guaranteed playoff spot of each division’s winner is not factored into the calculations.

Enjoyed reading the article, Mr. Bullard, but if I may make a suggestion:

The “core players” that comprise your aggregate should be weighted more than the other players on the field, but the other starting players should factor into the equation, too.

An example of my reasoning is that a defense with three outstanding players in the secondary is not going to give up big passing plays as often as a team with two outstanding secondary players and one mediocre player. Offenses look for holes in defenses and exploit them. Holes are not necessarily “core” players, but “core” players are all your aggregate considers.

Analysis is way off. In lots of places. I care too much about my own time to spend an hour picking this article apart, so I will just pick apart 1 aspect: a team’s defensive secondary is only as strong as its weakest link. The “top 2” are irrelevant. Liebig’s barrel theory. Good OCs/QBs will expose poor DBs.

Can you use the same analysis and do an assessment of the accuracy of this approach? Shouldn’t be too tough and would either back up or shut up any nay sayers.

If all teams were equal, each team would have a 25% chance of winning its division and a 12.5% chance of making the playoffs as a wild-card team (2 wild-card spots divided by 16 teams in the conference). That’s a total of 37.5% chance of making the playoffs for each team. 37.5% times 16 teams per conference is 600%.

Of course the teams aren’t equal to each other, so some teams will have a higher likelihood of making the playoffs than others. But as long as there are six playoff spots per conference, the combined percentages for all teams in the conference must be 600%.

The combined odds for the AFC teams in this list is 640%.

The combined odds for the NFC teams in this list is 562%.

Rounding up or down to the nearest integer can account for some of this discrepancy, but not all.
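The constraint described in this comment can be sketched as a per-conference rescaling, so that each conference's odds sum to six playoff spots. The function name `normalize_by_conference` and the team labels are hypothetical. Proportional scaling is only one possible fix, and it can push a strong team's odds above 100%, so a fuller treatment would simulate division winners and wild cards directly.

```python
def normalize_by_conference(odds, conference, spots_per_conference=6.0):
    """Scale playoff odds so each conference's total equals its number of playoff spots."""
    totals = {}
    for team, p in odds.items():
        conf = conference[team]
        totals[conf] = totals.get(conf, 0.0) + p
    return {
        team: p * spots_per_conference / totals[conference[team]]
        for team, p in odds.items()
    }

# Hypothetical four-team league with one playoff spot per conference.
toy_odds = {"A": 0.8, "B": 0.4, "C": 0.5, "D": 0.3}
toy_conf = {"A": "AFC", "B": "AFC", "C": "NFC", "D": "NFC"}
print(normalize_by_conference(toy_odds, toy_conf, spots_per_conference=1.0))
```

Applied to the totals quoted above, this would multiply every AFC team's odds by 600/640 (about 0.94) and every NFC team's odds by 600/562 (about 1.07).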

That’s what I said – the numbers don’t add up. He needs to rerun the numbers to account for the conferences (if not the divisions as well).

I was slightly confused by your description of how you tested the regression. Did you use the data from the 2013 season to predict the 2014 season, or did you use the 2014 data set to predict results for 2014? The obvious issue with the latter is that you would have unpacked the AV statistic and simply predicted what would happen if present rosters played at the same competency but with a newly weighted schedule. Otherwise, impressive work!

I read this article and after observing that the odds your model have calculated are wildly different to the betting market, I was provoked into writing about it at:

http://winnerswinonsports.com/

