By Austin Tymins
This past week, four of the top five teams in the Associated Press College Football Poll hailed from the SEC West Division. This has led many, including Nebraska coach Bo Pelini, to question whether ESPN’s ownership of the brand-new SEC Network, which launched this year, may be responsible for such a strange occurrence.
“I don’t think that kind of relationship is good for college football. That’s just my opinion,” Pelini said at his news conference. “Anytime you have a relationship with somebody, you have a partnership, you are supposed to be neutral. It’s pretty hard to stay neutral in that situation.”
As a PAC 12 fan, I often wonder about East Coast bias in college football, especially around Heisman voting. For example, was Andre Williams of Boston College really more deserving of the award than Ka’Deem Carey of Arizona? I, and most of the West Coast, would beg to differ. This bias, I argue, extends to the college football rankings themselves, which are slanted against certain conferences. As an Arizona State fan, it’s particularly frustrating to see my team consistently ranked worse in the AP poll than any advanced metric would suggest.
To see if these suspicions are valid, I compared the AP college football rankings dating back to 2005 with Football Outsiders’ F/+ stat for each season’s top 25 teams. F/+ is a combination of the Fremeau Efficiency Index and the S&P+ Ratings. Combined, these stats account for just about everything in college football on a play-by-play level, including play success rate, EqPts per play, drive efficiency, and opponent adjustments. It is probably fair to say F/+ is the best quantitative measure of team skill that exists at this moment.
Below, I have fit a second-degree polynomial curve with heteroskedasticity-robust standard errors to the data to approximate the average AP ranking for each F/+ rank. Which team has the best F/+ rank since 2005? The 2011 Alabama Crimson Tide that massacred LSU in the National Championship game had a rating of 53.9%.
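For readers who want to reproduce this kind of fit, here is a minimal sketch in Python using only NumPy. The data below is made up to stand in for the 2005–2013 top-25 seasons; the variable names and numbers are illustrative, not the article’s actual dataset.

```python
import numpy as np

# Toy stand-in for 9 seasons x 25 ranked teams: AP rank roughly tracks
# F/+ rank, plus noise. (Illustrative data, not the article's dataset.)
rng = np.random.default_rng(42)
fplus_rank = np.tile(np.arange(1, 26), 9).astype(float)
ap_rank = fplus_rank + rng.normal(0, 2, fplus_rank.size)

# Design matrix for a second-degree polynomial: [1, x, x^2]
X = np.column_stack([np.ones_like(fplus_rank), fplus_rank, fplus_rank**2])
beta, *_ = np.linalg.lstsq(X, ap_rank, rcond=None)

# Heteroskedasticity-robust (HC0 "sandwich") standard errors:
# Var(beta) = (X'X)^-1 X' diag(e_i^2) X (X'X)^-1
resid = ap_rank - X @ beta
bread = np.linalg.inv(X.T @ X)
meat = X.T @ (resid[:, None] ** 2 * X)
robust_se = np.sqrt(np.diag(bread @ meat @ bread))
```

The sandwich estimator here is the standard HC0 robust covariance; libraries like statsmodels compute the same quantity via `fit(cov_type="HC0")`.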
With this graph in place, I decided to test whether conference bias exists in the AP rankings. To do this, I made dummy variables for each conference (including the Big East, may it RIP). All teams that weren’t in a power conference were placed in a Not Power Conference category. Also, since many teams have shifted conferences over that timespan, I simply placed each team in the conference in which it played that particular season. So, for example, Missouri was a Big 12 team until it joined the SEC in 2012, TCU didn’t join the Big 12 until 2012, and so on.
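As a sketch of that setup, the dummy coding can be done with pandas. The rows below are illustrative, with conference reflecting membership in that specific season, per the Missouri/TCU examples; the real dataset would have one row per ranked team-season from 2005 onward.

```python
import pandas as pd

# Toy season-level rows (illustrative; conference is the one the team
# actually played in that season).
teams = pd.DataFrame({
    "team":   ["Missouri", "Missouri", "TCU",    "Boise St"],
    "season": [2011,       2012,       2012,     2010],
    "conf":   ["Big 12",   "SEC",      "Big 12", "Not Power"],
})

# One 0/1 column per conference; "Not Power" is dropped as the baseline
# category to avoid perfect collinearity with the intercept.
dummies = pd.get_dummies(teams["conf"], prefix="conf").astype(int)
model_df = pd.concat([teams, dummies], axis=1).drop(columns=["conf_Not Power"])
```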
AP Ranking Bias
I dropped the non-power-conference dummy to avoid perfect collinearity, leaving us with the above table, which shows the effect of conference on AP ranking. It is very clear in this case that other conferences are discriminated against when compared to the SEC. Every conference dummy’s coefficient is more positive than the SEC’s, which suggests that being in any conference besides the SEC leads to a worse AP Poll ranking when controlling for F/+.
These results show that a team from the PAC 12 is on average ranked approximately 1.15 spots worse than an equivalent team in a non-power conference, and 1.50 spots worse than the same team from the SEC. The former Big East conference also enjoyed a favorable bias compared to non-power-conference teams (East Coast bias, anyone?), although not to the same extent as the SEC. The largest bias appears against teams from the ACC, which are ranked about 2.63 spots worse than expected; this coefficient is statistically significant at the 1% level.
In addition to conference bias, I have found support for a “program prestige bias” whereby the historically good programs Oklahoma, USC, Ohio St, Notre Dame, and Alabama are ranked better in the AP than the advanced metrics would recommend. The last line in the table shows how these teams on average are ranked 1.44 spots better in the AP Rankings than F/+ would predict. Also interesting to note, these results are very nearly statistically significant at the 10% level.
While these results are rather interesting, they aren’t thoroughly scientific because of the lackluster p-values. To tighten the confidence intervals, I would need more data to work with. Unfortunately, Football Outsiders’ F/+ data only goes back to 2005, and there doesn’t appear to be week-by-week F/+ data. Since these are the final AP rankings, which account for the inter-conference play of bowl games, I would expect the midseason AP rankings to be even more biased.
What teams are hurt most by conference bias? Based on F/+ ratings and a polynomial of best fit, we can predict where a team should be ranked according to F/+ and look at the difference between that predicted figure and the actual AP Ranking of teams with multiple seasons in the Top 25.
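That calculation can be sketched as follows. The season records here are made up for illustration, so the bias numbers do not correspond to the article’s table; the point is the mechanics of fitting the curve, predicting an expected AP rank, and averaging the difference by team.

```python
import numpy as np

# Illustrative per-season records: (team, F/+ rank, AP rank).
seasons = [
    ("Arizona St", 8, 12), ("Arizona St", 15, 19),
    ("Alabama",    3,  1), ("Alabama",    1,  1),
    ("Utah",      10,  7), ("Utah",       6,  5),
]
fplus = np.array([s[1] for s in seasons], dtype=float)
ap = np.array([s[2] for s in seasons], dtype=float)

# Quadratic polynomial of best fit mapping F/+ rank to expected AP rank.
coeffs = np.polyfit(fplus, ap, deg=2)
expected = np.polyval(coeffs, fplus)

# Positive bias = ranked worse (a higher number) than F/+ predicts.
bias = ap - expected
per_team = {}
for (team, _, _), b in zip(seasons, bias):
    per_team.setdefault(team, []).append(float(b))
bias_per_season = {t: float(np.mean(v)) for t, v in per_team.items()}
```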
As I expected, my Sun Devils do experience ranking bias, although I surely didn’t expect them to be the most biased against in college football!
[Table: Bias Per Season by School]
I know the sample size is small, but are you able to check whether these conference effects are dependent on how good the ranking is? In other words, are top-10 teams more affected than those between 15 and 25?
Austin, this is a really cool post, but I want to take issue with your stance on the Ka’Deem Carey vs. Andre Williams debate. As evidenced when they played, Arizona was way better than BC. BC was not expected to get to a bowl, and Arizona lost to 4 ranked teams and at USC. Arizona gained a total of 6,000 (or so) yards in 2013, and BC gained about 4,500. Andre Williams was the sole option on offense, and still he dominated. In the three-game stretch in which Williams averaged 300 yards and 2 TDs per game, BC had 400 total passing yards combined across all three.
More importantly, you implied that BC is subject to some sort of east coast bias that I can tell you as a BC fan is 100% untrue. BC football is irrelevant to the vast majority of Bostonians, let alone the national media. I would argue your ACC (and formerly Big East) bias lurks more in the southeast (you know, near the Southeastern Conference) with Miami/FSU/VaTech/Clemson.
“Combined, these stats account for just about everything in college football on a play-by-play level including: play success rate, EqPts per play, drive efficiency, and opponent adjustments. It is probably a fair assumption to say F/+ is the best quantitative measure of team skill that exists at this moment.”
Ranking teams based on those statistics has never passed the sniff test. That’s a fact. So you are coming to conclusions on top of a set of flawed metrics. Good science on top of bad math is just more bad math. Sorry. This is just drivel.
I enjoyed reading this piece. When I think about “SEC Bias”, intuitively I feel like this occurs for middle-tier teams in the conference, rather than for the conference as a whole. Generally I’d say it would be the 4th through 7th best teams in the conference each year, or teams that should be ranked roughly 10th through 25th, this year being (arguably) Alabama, Georgia, LSU, and Missouri. I think at this point saying Alabama should be lower than #3, Georgia should be lower than #9, or LSU should be lower than #16 is a much stronger argument than saying Mississippi State or Auburn does not deserve to be in the top 4.
In my opinion, in recent years the top 2 or 3 teams in the conference have been generally better than the other power 5 conferences’ top teams, but the middle and lower tier teams aren’t any better than the middle and lower tier teams in the other conferences. Yet the middle-tier SEC teams are often ranked higher than the middle-tier, and even upper-tier, teams in the other conferences. Supporting this “differential SEC bias effect” idea is the fact that 6 of the 10 SEC teams listed in your team-by-team bias calculation table that have been in the conference since 2005 had positive bias values, indicating they were generally ranked lower than they should have been, not higher. Ole Miss and Tennessee (two middle/lower tier teams recently) were the only two teams with a large negative ranking bias per season. Maybe you could account for this differential effect in your model with some sort of interaction between the SEC indicator and conference standing or F/+ value?
Your model also assumes that the distance between AP rankings is homogeneous. In other words, it assumes the difference in the voters’ opinions of teams 1 and 2 is the same as the difference between 2 and 3, is the same as between 3 and 4, etc. This is not a valid assumption. In the current 10/26 poll, the difference in voter points between Mississippi State (1) and Florida State (2) is 33 points, while the difference between Florida State (2) and Alabama (3) is much larger, at 163 points. Therefore, AP voter points may be a better measure of voter opinion than rank as your dependent variable. You could even extend the model to include the “other teams receiving votes”, which would give you more data points.
Very nice work. I think you are onto something here.
Hey Harvard if you are so smart do you think you could at least get the tags right for your article? It is ARIZONA STATE not ARIZONA SUN DEVILS especially when the guy is talking about them in the damn article????
Who are the Arizona Sun Devils?
You switch the signs of your metric between the conference and team data without any explanation. You should clarify whether positive or negative means “discriminated against” to avoid confusion.
Also, p values can’t be “almost significant.” That’s not how NHST works.
If your coefficients aren’t even statistically significant at a .05 level, then I doubt the coefficients for your conferences are going to be statistically different from each other.
And LOL throwing in 30+ dummy variables when you already have sample size issues. You say OSU is “Prestigious” but then they are 2nd most biased against team. Do you even know how to interpret statistics?
PS. That SEC bias coefficient is not significantly different from the intercept, a.k.a. non-power-conference teams. So you’ve basically shown with your data that there is no statistically significant case of bias towards the SEC.
What’s the SE?
Thanks for exposing the ranking bias, which I believe is based on financial considerations.
I’m confused, in one chart ‘negative’ means positive bias, and in the other it means negative bias?
I must be misunderstanding something here. In the first table, a negative number means a bias to be ranked higher than a team should be. If that is the case in the second table as well, why would the prestigious teams in the first chart have what appears to be a ranking bias to be lower than they should be (Notre Dame: 2, USC: .56, OU: .046)? Am I missing something here?
Very interesting analysis! It’s exciting that someone is trying to quantify the (potential) existence of SEC bias.
Could you report the p-values associated with each conference dummy variable? Also, how was the model’s goodness-of-fit?
Well done and very interesting. However, to be more compelling (and relevant), I think you also need to include a W-L record metric somehow. Bottom line: it doesn’t matter who is the better team; what matters is who wins on the field. The “better” team is more likely to win but won’t always win, because a given matchup has so many random events–we refer to this as “an upset”. Only the winning team raises the trophy, and it doesn’t matter if the winner is not the “best” team. For the same reason, an undefeated team has earned a better ranking than a 1-loss team that, according to certain metrics (in this case F/+), has not. Just as a team that is not the “best” can win the NCG and raise the trophy (because they won), winning ought to influence ranking more than F/+ reflects.
This is also a philosophical reason why the CFP committee is certain to be disastrous. Their stated purpose is to “select the four best teams.” First, there’s no such thing, and second (and more important), while being the “best” team (if there is such a thing) makes winning a game, a conference title, or an NC more likely, we’ve never, ever suggested that being the best is necessary or sufficient. Rather, what matters is winning on the field. Can you imagine a situation in which a huge underdog won the NCG, but the #1 ranking and the trophy went to the loser because, well, they are the better team according to consensus or some set of metrics? This is exactly what the committee is doing by selecting the “4 best teams” instead of selecting the teams that have earned it on the field, whether or not they are the “best”.
I don’t think your model is fitting well, as Oklahoma is perennially overrated. Also, Utah being the team that benefits most from the ranking system is laughable. The problem is the ranking system itself, which determines who plays in the most high-profile bowls and the fraudulent national championship. A real playoff system is the only way to determine which team is best at the end of the season, rather than on paper.
Interesting analysis. Would you be willing to share your data? I am in a sports analytics class currently and would like to try and incorporate into a predictive analysis for the upcoming playoffs.