By Bill Lotter

If time, money, and brain damage weren’t an issue, how would you decide if Nick Foles or Sam Bradford is better? You’d take a generic team, plug in Nick Foles, and play a million games. You’d do the same with Sam Bradford and then see who had the most wins. On any given day, one QB might win or lose, but in the long run, you’d expect the best QB to prevail. Obviously we can’t do this in real life, but we can on a computer. We just need to be able to *simulate* football games. That sounds great, but how do we do it?

The problem/awesomeness of football is that it’s so complex. There are so many possible game situations. In baseball, there are at least nine times a game that the away team will be at bat with zero outs and nobody on. How many times do you think it’s going to be 3rd and 3 on the 33 yard line? On top of having so many possible *states*, the NFL has far fewer games. What this means is that it’s much harder to simulate games and there’s much less data available to inform these simulations. But if we could, we could pretty much solve football…only exaggerating a little bit.

Ok, where do we begin? First we need to say what we mean by a game state. We want to make it as general as possible, while at the same time only considering the variables we think are most important. With the already daunting feat in front of us, I think we should restrict ourselves to “normal” game situations. This is when the goal is simply to score more points than the other team and time isn’t a major influence, i.e. the 1st quarter, the 2nd quarter except the end of the half, and, for the most part, the 3rd quarter. Then, since drives are the building blocks of a game, we should simulate drives. To simulate drives, we need to simulate plays. What is the state that puts a play in context? It’s the down, distance, and field position.

So we’ll be simulating drives, which consist of plays and the state of the drive is defined by three numbers: down, distance, and field position. Now, to do our simulations, we need to know all the things that can happen at any particular state and their corresponding probabilities. For any play, the offense will either call a run or pass, which will either result in a turnover or some amount of yards gained (or lost). We need to estimate the chance of all of these outcomes. The key issue is that these chances depend on the situation. Teams will have a much different strategy, resulting in much different probabilities of outcomes, depending on the down, distance, and field position. Thus, for every possible combination, we need to estimate the chance the offense will throw an interception, fumble, gain 1 yard, gain 2 yards, 3 yards,….
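To make this concrete, here is a minimal Python sketch of the idea: a state is just a (down, distance, field position) triple, and each state maps to a probability distribution over play outcomes. The specific probabilities below are invented purely to show the shape of the data, not estimates from real plays.

```python
from collections import namedtuple

# A drive state: down (1-4), distance (yards to go for a first down),
# and field position (yards from the offense's own goal line).
State = namedtuple("State", ["down", "distance", "field_pos"])

# For each state we want a distribution over everything that can happen
# on a play: a turnover, or some number of yards gained (or lost).
# These numbers are hypothetical, purely for illustration.
outcomes = {
    State(down=3, distance=3, field_pos=33): {
        "interception": 0.03,
        "fumble": 0.01,
        -2: 0.05, 0: 0.30, 3: 0.25, 5: 0.20, 10: 0.16,
    },
}

dist = outcomes[State(3, 3, 33)]
assert abs(sum(dist.values()) - 1.0) < 1e-9  # probabilities sum to 1
```

The real model needs such a distribution for every reachable state, which is exactly why estimating them well is the hard part.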

How would you go about estimating this probability *distribution*? You would first look at historical data, particularly data that is relevant to today’s NFL. Below is the overall percentage of plays that were passes by year in the NFL, which I calculated from data from pro-football-reference.com.

Not surprisingly, passing has increased over recent years. Let’s use data from 2010-2014 in our calculations. Now, if we’re interested in 3rd and 3 on the 33, we’d look at every time that happened and the corresponding distribution of outcomes. The problem is, as mentioned before, the number of times that this occurs is small, so if we just use the empirical estimates, they will be very noisy and not a good representation of the true distribution. How do we get around this? Well, even if we don’t have tons of observations of 3rd and 3 on the 33, the observations on 3rd and 3 on the 34 probably tell us a good bit about what happens on the 33. In fact, every observation of 3rd and 3 probably tells us a little bit about 3rd and 3 at the 33, with observations closest to the 33 telling us the most. Mathematically, this concept is called a *prior*, and the way in which we form our beliefs about what happens on 3rd and 3 at the 33 by combining observations with our prior beliefs/data is *Bayes Rule*.
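The smoothing idea can be sketched in a few lines: sparse counts at the exact state get combined with “pseudocounts” borrowed from similar states, which is exactly what a Beta prior on a pass/run probability does. Every number below is made up for illustration.

```python
# Observed at exactly 3rd-and-3 on the 33 (hypothetical): 8 passes
# out of 12 plays -- far too few plays to trust on their own.
passes, plays = 8, 12

# Pseudocounts borrowed from similar states (e.g. 3rd-and-3 at nearby
# yard lines), discounted by similarity. Together they act as a
# Beta(alpha, beta) prior: here a ~60% prior pass rate worth 50 plays.
alpha, beta = 30.0, 20.0

p_raw = passes / plays                                # noisy: 0.667
p_smooth = (passes + alpha) / (plays + alpha + beta)  # posterior mean

print(round(p_raw, 3), round(p_smooth, 3))
```

The smoothed estimate is pulled from the noisy 66.7% back toward the 60% prior, with the pull getting weaker as the real play count grows.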

We have the general framework for our Monte Carlo simulations, now we just need to dig into the details. For a motivating example, let’s go ahead and make the comparison of Bradford and Foles. What we want to know is, if we kept everything else constant, who would lead to more points scored? QBs will have their main impact in the passing game, so for our simulations, we first need to know how often a team will pass. To estimate this, I again scraped Pro-Football-Reference and implemented Bayes Rule. I’ll spare some of the nitty-gritty details, but for those familiar with such methods, feel free to read the math interlude; otherwise, skip to the graphs.

**Math Interlude (colorful graphs after this)**

Pass probability is a Bernoulli variable, and with a conjugate Beta prior, we just need to determine the hyperparameters, i.e. the pseudocounts. The problem boils down to saying how much an observation at one (down, distance, field position) “counts” as an observation at another, which I took as the product of a delta function and two Gaussians:

w\big((d_{1},x_{1},p_{1}),(d_{2},x_{2},p_{2})\big) = \delta_{d_{1}d_{2}}\, e^{-(x_{1}-x_{2})^{2}/2\sigma_{x}^{2}}\, e^{-(p_{1}-p_{2})^{2}/2\sigma_{p}^{2}}

Where (d_{1},x_{1},p_{1}) is the down, distance, and field position of the first observation. I made the widths of the Gaussians non-stationary and picked values that made sense football-wise. Taking the MAP estimates gives the plots below.
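The similarity weight described above can be sketched directly. The widths `sigma_x` and `sigma_p` below are placeholder values, not the (non-stationary) ones actually used in the article.

```python
import math

# Weight of an observation at (d2, x2, p2) toward the state (d1, x1, p1):
# a delta function on down times Gaussians on distance and field position.
# sigma_x and sigma_p are illustrative guesses.
def weight(d1, x1, p1, d2, x2, p2, sigma_x=2.0, sigma_p=8.0):
    if d1 != d2:                      # delta function: downs must match
        return 0.0
    return (math.exp(-(x1 - x2) ** 2 / (2 * sigma_x ** 2))
            * math.exp(-(p1 - p2) ** 2 / (2 * sigma_p ** 2)))

w_same = weight(3, 3, 33, 3, 3, 33)   # identical state counts fully
w_near = weight(3, 3, 33, 3, 3, 34)   # one yard line away: almost fully
w_off  = weight(2, 3, 33, 3, 3, 33)   # different down: not at all
```

Summing these weights over historical pass and run plays gives the pseudocounts for the Beta prior at each state.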

What do these graphs show us? For first down, teams are actually more likely to run, especially when they’re backed up against their own goal line, unless they are right at the goal line, in which case the pass probability jumps up. This makes sense, as this strategy would seem to lower the chance of a safety. On 2nd down, the chance that a team will pass is highly dependent on the yards to go for a first down, which is represented on the horizontal axis. Each point on the graph represents a particular state: 2nd down on a particular yard line with a particular yards to go. The color represents the chance the team will pass, which can be mapped to a number based on the scale on the right. On 3rd down, teams are very likely to pass, unless it’s 3rd and 1 or 2.

The first step is done: we have solid estimates of when a team will pass. Next, we need to estimate how many yards a team will gain if they do pass. We can use the same approach, but we want to estimate a different distribution for each QB. We do so by using each QB’s own past performance to estimate the distribution. Below, I plot these estimates for one situation in particular: 1st down and 10 on the 20 yard line. Again, we’ll have a different curve like this for each down, distance, and field position. As you can imagine, this involves quite a bit of calculation and attention to detail, but it is essential to simulate a football game as best as we can.

In the top graph, the blue curve estimates Nick Foles’ yards-gained distribution by pass for 1st and 10 at the 20. It incorporates every pass he has ever thrown, even if it wasn’t at 1st down on the 20, with passes “closer” to 1st and 10 at the 20 having a bigger influence. The red line shows the distribution for the league as a whole from 2010-2014. The green line is if we also use the league as a prior for Nick Foles, i.e. how does watching the league in general inform us what is likely to happen when Nick Foles throws on 1st and 10 on the 20. As a sanity check, the probability of 0 yards gained (most likely an incomplete pass) is around 35%, which is reasonable.
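Using the league as a prior for an individual QB amounts to blending two distributions, with the player’s effective (kernel-weighted) sample size deciding how much weight his own data gets. The distributions and sample sizes below are invented for illustration.

```python
# P(yards gained) for a pass at one state -- hypothetical numbers.
league = {0: 0.36, 5: 0.34, 10: 0.20, 20: 0.10}   # league-wide, 2010-2014
player = {0: 0.30, 5: 0.30, 10: 0.25, 20: 0.15}   # one QB's smoothed data

n_player = 400.0   # effective kernel-weighted attempts by this QB
n_prior  = 600.0   # how many "pseudo-attempts" the league prior is worth
w = n_player / (n_player + n_prior)

# The blended estimate sits between the player's curve and the league's,
# closer to the league when the player's data is thin.
blended = {y: w * player[y] + (1 - w) * league[y] for y in league}
```

A QB with thousands of relevant attempts would get `w` near 1 and a curve that looks like his own; a rookie would be pulled almost all the way to the league average.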

We can do the same to estimate the distribution of yards gained by run, for which I will just use a league aggregate estimate because we want to isolate the effect of QB play. We’re almost there. The final ingredients we need relate to turnovers, the kicking game, and the “cost” of different field positions. I’ll spare some of the details, but just trust that I went through all of it carefully. In particular, I used many calculations by Brian Burke from Advanced Football Analytics, including the probability a team will punt/kick a field goal, the expected net punt distance, the field goal percentage, and the Expected Points, all as a function of field position.

Now it’s time to make Foles and Bradford earn their money and play millions of plays. For each play, we first decide if it’s going to be a run or a pass based on flipping a coin that’s weighted by the pass probability at the current (down, distance, field position). We then roll a weighted die to see if there will be a turnover or a certain number of yards gained. Here are simulation results for 1 million drives starting at the 20 yard line for each QB as well as a generic, league average QB.
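The coin-flip-then-die loop described above can be stripped down to a short sketch. The pass probabilities and outcome tables below are placeholders (the real model looks them up per state from the Bayesian estimates, and handles punts, field goals, and safeties properly); the point is the structure of the simulation.

```python
import random

random.seed(0)

def pass_prob(down, distance, field_pos):
    # Placeholder: the real model has a full (down, distance, field
    # position) lookup table estimated from 2010-2014 data.
    return 0.75 if down == 3 and distance > 2 else 0.55

def sample_play(is_pass):
    # Weighted die over outcomes: a turnover, or some yards gained/lost.
    # Weights are invented for illustration.
    outcomes = [("turnover", None)] + [("yards", y) for y in (-2, 0, 3, 6, 12)]
    if is_pass:
        weights = [0.03, 0.07, 0.30, 0.25, 0.20, 0.15]
    else:
        weights = [0.02, 0.10, 0.20, 0.35, 0.25, 0.08]
    return random.choices(outcomes, weights=weights)[0]

def simulate_drive(field_pos=20):
    down, distance = 1, 10
    while True:
        # Flip the weighted coin: run or pass?
        is_pass = random.random() < pass_prob(down, distance, field_pos)
        kind, yards = sample_play(is_pass)
        if kind == "turnover":
            return "turnover"
        field_pos += yards
        if field_pos >= 100:
            return "touchdown"
        distance -= yards
        if distance <= 0:
            down, distance = 1, 10     # moved the chains
        else:
            down += 1
            if down > 4:
                return "punt/FG"       # simplified: 4th down ends the drive

tallies = {}
for _ in range(10_000):
    result = simulate_drive()
    tallies[result] = tallies.get(result, 0) + 1
```

Swapping in a different QB means swapping in his pass outcome distribution and nothing else, which is exactly the “everything else held constant” comparison we wanted.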

By this method, the league average QB scores 1.39 points per drive. When we include the Expected Points “given” to the other team when the other team gets the ball at a particular field position, we can say that the average QB has a net point gain of 0.39. Nick Foles is right around this average, whereas Sam Bradford is about 0.3 points lower. This would translate to about 4-5 points per game, which is the magnitude we would expect for QB play given Brian Burke’s calculations of Expected Points Added. As shown above, the decrease in points mainly results from an increased probability of punt and a decreased touchdown percentage.

What can we take away from all this? First off, I wouldn’t take the comparison too seriously. I included all of Bradford’s and Foles’ data equally. Realistically, we’d want to weight their more recent play higher. Also, the QBs’ historical stats are dependent on the abilities of the other players on the field, and no adjustment was made for that. All of this, however, can be accounted for. The biggest takeaway should be the process involved. We can use this approach to address almost any question. For instance, would you rather have a home-run back or a consistent power runner? If we increase our pass percentage by 5% overall, will we score more points? What personnel should we use for first and 10 versus second and five? All of these questions boil down to plugging in different distributions in our simulations and then clicking “run” on our computer. Pretty powerful stuff.
