The following is the abstract from the paper “Using the NFL Gambling Market as an Alternative Asset Class” by Kevin Meers, Sam Waters, and Zack Wortman. Please find the full paper here. We will publish our projected probabilities each week before NFL games begin, starting with Week 2.
A particular focus of modern portfolio theory involves diversifying risk through identifying and investing in assets that are uncorrelated with one another. In this pursuit, many investors have explored so-called “alternative asset classes” that are not traded on major stock markets. In this paper, we identify the market for gambling on games in the National Football League (NFL) as one such alternative asset. By modeling various betting outcomes, we develop multiple betting algorithms that provide returns from 7% to 20%, depending on what level of risk the investor would like to assume. These returns are, in theory, totally uncorrelated with market risk, and therefore valuable to any portfolio manager; further, they appear to dominate multiple market indices.
Interesting paper. Did you look at the returns if you adjust it for the bookmaker’s “juice”? Most spread and over/under bets are -110, so it’s not as simple as winning $100 or losing $100. It’s win $100 or lose $110.
I agree with the general premise that there are inefficiencies in this market, but this article really doesn’t do much in the way of showing how they can be exploited. I have a bunch of questions/comments:
1. As Joe (first comment) alluded to, vig is almost always 10%. I think Pinnacle’s 5% was cherry-picked. What do the numbers look like when you adjust for the higher vig? The whole article is based on Vegas lines; Vegas vig should be used IMO.
2. Home teams aren’t always favored. You may clarify this somewhere that I missed, but why not just call them favorites and underdogs?
3. Do you have the source data in Excel?
4. I assume based on the summary that you are using DVOA after Week 3 to make your Week 4 wagers and DVOA after Week 14 to make your Week 15 wagers and so on. Is that correct or are you using the entire year’s DVOA for every game of that season?
1) We chose 5% because, as Zachary noted below, it does exist. We haven’t run the numbers on a 10% vig. Definitely room to expand the analysis there.
2) We used the home/away terminology because that is the way the line data was structured. I.e., if Detroit was at Minnesota and Detroit was favored by 3, Minnesota was listed as +3 instead of Detroit -3. It was simply easier to analyze this way.
3 and 4) Yes, we have the original, but we cannot share it because, as Nate noted below, the week-by-week DVOA measures are privately held behind FO’s paywall.
Although there is some variation, the typical Vegas payout for ‘straight’ bets like the spread is ‘-110’ so the vig is 9.1%.
Football Outsiders charges for historical DVOA data.
http://www.footballoutsiders.com/store/premium-access
Much like ‘used cooking oil’ there’s not really that much to be had in Vegas sports lines – bookies can refuse bets or set lines that are closer to true in response to large scale investor action.
Ooops… I did the math wrong. If you play both sides, you would pay 220 to get 210, which is a vigorish of 10/220 = 4.54…%.
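The 9.1% and 4.54…% figures in this exchange use different denominators: the first divides the extra $10 by the $110 risked on one bet, the second divides the book's $10 keep by the $220 staked across both sides. A minimal sketch of that arithmetic, assuming a standard two-sided market priced at -110 on each side (the function names are illustrative and do not come from the paper):

```python
def stake_to_win(american_odds: int, target_win: float = 100.0) -> float:
    """Amount risked to win `target_win` at negative American odds such as -110."""
    if american_odds >= 0:
        raise ValueError("this sketch only handles negative American odds")
    return target_win * (-american_odds) / 100.0


def vig_on_single_bet(american_odds: int, target_win: float = 100.0) -> float:
    """Extra amount risked beyond the potential win, as a share of the stake (10/110 at -110)."""
    stake = stake_to_win(american_odds, target_win)
    return (stake - target_win) / stake


def hold_on_balanced_book(american_odds: int, target_win: float = 100.0) -> float:
    """Share of total money staked that the book keeps when both sides are bet equally (10/220 at -110)."""
    stake = stake_to_win(american_odds, target_win)
    total_staked = 2 * stake          # one bettor on each side
    paid_out = stake + target_win     # winner gets the stake back plus winnings
    return (total_staked - paid_out) / total_staked


if __name__ == "__main__":
    print(f"{vig_on_single_bet(-110):.4f}")      # 0.0909
    print(f"{hold_on_balanced_book(-110):.4f}")  # 0.0455
```

Which of the two numbers is the right one to call "the vig" is a matter of convention; the sketch only shows where each comes from.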
2 things. Firstly, I’d like to point out that while Pinnacle’s vig was probably cherry-picked, it exists, which means you can make your bets through Pinnacle’s service. Secondly, I would just like to express some doubt about their conclusion. I don’t think more info would help their info and if anything would hurt it, as it would convince bettors to bet against the model due to misplaced confidence in their own abilities. I would like to see a model where they don’t bet every game, but instead bet only a few games a week which their model is most confident about.
Pinnacle doesn’t accept American customers.
This is totally garbage! Sure, use past data, regress it, and of course you can predict past data better than people who were making markets in real time. Do the same in stocks and sure you can find a good regression model to beat the market too. Pure garbage. Shame on Harvard.
If you’re interested in out of sample testing, which it sounds like you are, we’ll be publishing projected probabilities for this season.
That’s great. I’m willing to bet you cannot beat 52.4%. Any takers?
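For context, 52.4% is the break-even win rate when every bet risks $110 to win $100: winning p of the time pays 100p and losing costs 110(1 - p), which balance at p = 110/210. A small sketch of that calculation, assuming flat -110 pricing on every bet (function names are illustrative):

```python
def break_even_win_rate(risk: float, win: float) -> float:
    """Win probability at which expected profit is zero for a bet risking `risk` to win `win`."""
    return risk / (risk + win)


def expected_profit_per_bet(p_win: float, risk: float, win: float) -> float:
    """Expected profit of one bet given a win probability `p_win`."""
    return p_win * win - (1.0 - p_win) * risk


if __name__ == "__main__":
    p = break_even_win_rate(110, 100)
    print(f"{p:.4f}")  # 0.5238
    # At a 55% hit rate, each $110 risked earns 0.55*100 - 0.45*110 in expectation.
    print(f"{expected_profit_per_bet(0.55, 110, 100):.2f}")  # 5.50
```

Any model that picks against the spread has to clear this threshold out of sample before it shows a profit at standard pricing.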
Curious if you still intend to publish probabilities in advance of each week. I share Richard’s skepticism regarding your results, but this is your opportunity to prove us wrong.
I haven’t seen any of the game predictions as promised. What’s going on with that?
We have decided to publish them at the end of the season. We don’t want our content to be flooded with who to gamble on in a given week, but we will publish the results after the season because, as you have said, out of sample testing is important to the academic process.
Just a suggestion: have an ironclad method of documenting your results as they go along during the season. I’m sure if you come up with good results, lots of people will say it was not on the up and up. I am not sure how to go about proving you weren’t making it up (or will not be making it up in the future), but maybe share it with a few people beforehand, and those people would essentially be putting their names on the line if people accuse them of lying.
If these guys were charging for their predictions, I would say we should worry about their documentation. But this is good work being provided for no charge. Any of us are free to do our own analysis if we think we can improve on it.
In regard to the idea above that it’s easy and/or meaningless to slap a regression analysis together — I really don’t understand this criticism. DVOA is a backward looking statistic, in the sense that it is constantly being revised throughout the season based on new data. But they account for this by controlling for the time of the bet throughout the season.
And with as large a dataset as they used, and as many simulations and controls, the logistic regression can reliably do what it is meant to do, which is to predict the probability of an outcome and the marginal effect of each independent variable. Unless there is some reason to believe that player performance will somehow be wildly different this season than in the past, the model should hold under these conditions.
So I’m not taking any bets, but I do think performance will meet or exceed past performance this year. And if not…they had the fun of experimenting with a new method.
Hey guys, great paper; very interesting stuff. I have a couple of questions as I took Stats as an undergrad but that was a while ago.
As it relates to covering the spread:
Am I understanding correctly that picking the home team is always the correct move when the signs match (e.g. even the situation where: the Home team is favored, the home D is weak, and the away O is strong)?
Additionally, if the signs do not match, does this mean the away team should be picked to cover (i.e., the home team picked to not cover)?
Much appreciated!
You state 2 methods of betting: over/under and sides. There is also the money line; did you look at that?
Also, did you look at any linear regressions? Only logistic regressions?
Did you look at any hold out samples, or compare regressions from different time periods?
The ProFootballReference data is Vegas lines, with an assumed 10% vig. I think that you should use the more or less standard 11/10 for Vegas lines.
On Tables 2 and 3, did you use predictors with p-values as shown, or only the ones below .05?