By Sam Waters

Playing time projections are a key component of forecasting MLB player performance. Two factors influence playing time: skill and opportunity. A player’s “playing time skill” is how often he is available for placement in the lineup. His injury-proneness, need for rest days, and proclivity for suspension determine this level of skill. A player’s “playing time opportunity” is how often his team puts him in the lineup when he is available. His rate of production, his position, and the quality of his teammates at that position all influence this level of opportunity. Projection systems that give a single playing time estimate in terms of plate appearances do not distinguish between these two components.

This entanglement of skill and opportunity gives us a projection that conveys less useful information. A forecast of 200 plate appearances for Willie Bloomquist tells us what to expect if Bloomquist’s “playing time skill” and “playing time opportunity” develop as they normally do for players like him. But what happens if Bloomquist’s manager suddenly promotes him from the bench to a starting position? We need to adjust Bloomquist’s playing time accordingly for this news, but the appropriate magnitude of that adjustment is unclear. If Bloomquist’s team has 600 PA available at his position and the original forecast was grounded in the expectation that Bloomquist would get all of the opportunities (600 PA) and be available one third of the time (200 PA), the manager’s announcement should not affect our projection. If the forecast was grounded in the expectation that Bloomquist would get one third of the opportunities (200 PA) and be available all of the time (600 PA), the manager’s announcement should lead to a dramatic increase (from 200 PA to 600 PA) in Bloomquist’s projected playing time. Unless we know how each factor contributed to the initial forecast, it is difficult to fully incorporate new information into our evolving expectations.
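The arithmetic behind the Bloomquist example can be sketched in a few lines. This is only an illustration, assuming that projected PA is the simple product of the PA available at the position, the share of opportunities offered, and the fraction of time the player is available; the function and helper names are hypothetical.

```python
def projected_pa(team_pa, opportunity_share, availability):
    """PA at the position x share of chances offered x fraction of time available."""
    return team_pa * opportunity_share * availability


def promote_to_starter(scenario):
    """Hypothetical helper: a promotion raises opportunity to 100%; availability is unchanged."""
    return {**scenario, "opportunity_share": 1.0}


TEAM_PA = 600  # PA available at the position, as in the example above

# Two different stories that both yield the same single-number forecast.
scenarios = {
    "all chances, available 1/3 of the time": {"opportunity_share": 1.0, "availability": 1 / 3},
    "1/3 of chances, always available": {"opportunity_share": 1 / 3, "availability": 1.0},
}

results = {}
for label, scenario in scenarios.items():
    before = projected_pa(TEAM_PA, **scenario)
    after = projected_pa(TEAM_PA, **promote_to_starter(scenario))
    results[label] = (before, after)
    print(f"{label}: {before:.0f} PA before promotion, {after:.0f} PA after")
```

Both scenarios start from the same forecast, but only the second one moves when the manager's announcement arrives, which is why the decomposition matters.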

Single-value playing time estimates are especially limiting from the perspective of a team making a roster decision. The team already controls the opportunity side of playing time; it is concerned only with skill, because it needs to know how often a player will be available when given those opportunities. An estimate that mixes the information the team needs with information it doesn’t need is a less useful estimate. Teams need a way to strip the effect of opportunity from the projection so that they can look at playing time expectations purely in terms of availability.

One measurement of “playing time skill” that we could use is the percentage of days on the team’s major league roster that a player is available to play. Pro Sports Transactions has an extensive record of each player’s games missed due to injury, personal reasons, and disciplinary actions. While the need for rest days does vary among players and should be considered part of “playing time skill”, Pro Sports Transactions does not include rest days because they are very difficult to define and measure. We might attempt to operationalize the definition of a rest day as an uninjured player not in the lineup who previously played at least x% of the time against pitchers of that handedness in the team’s last n games. We could also scrape news reports and press releases online for mentions of rest days, since managers often announce these ahead of game time. Even if we are unable to define and truly capture the effect of rest days, the information that we already have should account for the vast majority of league-wide unavailability. This availability rate enables us to isolate the skill component of playing time to a certain extent, but we still would not be able to directly compare the “playing time skill” of players with different levels of opportunity.
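The rest-day definition proposed above could be prototyped roughly as follows. This is a sketch under stated assumptions, not a tested classifier: the `GameLog` record and its fields are hypothetical, x is expressed as a fraction rather than a percentage, the default values of x and n are arbitrary, and the opposing starter's handedness stands in for "pitchers of that handedness".

```python
from dataclasses import dataclass


@dataclass
class GameLog:
    """Hypothetical per-game record for one player."""
    opp_starter_hand: str  # "L" or "R": handedness of the opposing starting pitcher
    in_lineup: bool        # did the player start this game?
    injured: bool          # was he unavailable (injury, suspension, personal reasons)?


def is_rest_day(today, previous_games, x=0.8, n=20):
    """Flag `today` as a rest day: the player is uninjured, not in the lineup,
    and started at least a fraction `x` of the team's last `n` games that were
    played against a starter of the same handedness as today's."""
    if today.injured or today.in_lineup:
        return False
    same_hand = [g for g in previous_games[-n:]
                 if g.opp_starter_hand == today.opp_starter_hand]
    if not same_hand:
        return False  # no basis for judging his usual role against this handedness
    start_rate = sum(g.in_lineup for g in same_hand) / len(same_hand)
    return start_rate >= x


# Usage: an everyday player sits against a right-hander after ten straight starts.
everyday = [GameLog("R", True, False)] * 10
print(is_rest_day(GameLog("R", False, False), everyday, x=0.8, n=10))
```

Conditioning on handedness keeps a platoon player's routine days off against same-handed pitching from being counted as rest.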

A player who receives all of his team’s opportunities and has a 90% availability rate is very different from a player who receives half of his team’s opportunities and has a 90% availability rate. It should be easier to maintain health in a less demanding part-time role than it is in a full-time role. In order to compare the two players in terms of “playing time skill”, we would have to find the causal effect on availability rate of moving between the two roles. If we take the average causal effect of shifting from part-time duty to full-time duty and add it to the part-time player’s availability rate, we get an estimate of how often the part-time player would be available if given a full-time job. With this estimate in hand, it becomes appropriate to directly compare the availability rates of our two players in order to determine which one has more “playing time skill”.
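As a rough illustration of that adjustment, the snippet below adds an assumed role effect to the part-time player's rate before comparing the two players. The effect size of -0.05 is purely hypothetical, chosen only to show the mechanics; estimating the real causal effect is the hard part and is not attempted here.

```python
def full_time_equivalent(part_time_rate, role_effect):
    """Estimate availability in a full-time role by adding the average effect
    of moving from part-time to full-time duty, clamped to the range [0, 1]."""
    adjusted = part_time_rate + role_effect
    return max(0.0, min(1.0, adjusted))


# HYPOTHETICAL value: assumed average change in availability rate when a
# player moves from part-time to full-time duty. Not an estimate from data.
ROLE_EFFECT = -0.05

full_timer_rate = 0.90                                     # observed in a full-time role
part_timer_rate = full_time_equivalent(0.90, ROLE_EFFECT)  # observed 0.90 in a part-time role

more_skill = "full-time player" if full_timer_rate > part_timer_rate else "part-time player"
print(f"Full-time-equivalent availability of the part-timer: {part_timer_rate:.2f}")
print(f"More playing time skill under this assumption: {more_skill}")
```

With any negative role effect, the two players who looked identical at 90% separate once they are placed on the same full-time footing.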

Since a team controls decisions about opportunity, the method above should complete the informational picture the team needs to make informed roster decisions based on future playing time projections. Third parties like fantasy baseball players and independent analysts don’t control those decisions regarding opportunity, but they can still use predictive analysis to estimate rates of opportunity and combine them with estimates of availability to produce a comprehensive statement about a player’s projected playing time.

My hope is that, ultimately, forecasters will move away from forecasts that implicitly make statements like, “We project Mike Carp to get 250 at-bats given that he continues to display similar health, receive similar opportunities, and develop just as similar players did before him.” Instead, they could be telling us the likely availability of a player in each possible role, along with the likelihood that the player actually finds himself in each of those roles. We could express this type of projection in a table like this:

I broke “playing time opportunity” up into three arbitrary categories here for the sake of brevity. We could represent opportunity and role on a continuous scale, but doing so might convey too much information to be useful. A summary such as the one in this table gives a more complete picture of a player’s projected playing time than the single-value estimates currently in vogue, without sacrificing too much simplicity. Some might think that even this playing time estimate is still too multidimensional; in that case, I would suggest representing playing time skill as a player’s projected availability given 600 PA of opportunity. The more meaningful structure of this estimate should improve the ability of analysts and teams to capture the stable, skill-based aspects of playing time in their projections.
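To make the shape of such a projection concrete, here is a minimal sketch that stores a role-by-role summary and collapses it into a single expected PA figure. Every number, the three role names, and the PA assigned to each role are hypothetical placeholders, not values from the article's table or from any real projection.

```python
import math

# HYPOTHETICAL projection for one player. Each row is:
# (role, probability of landing in the role, PA offered in the role,
#  projected availability in the role)
projection = [
    ("starter", 0.20, 600, 0.85),
    ("platoon", 0.50, 300, 0.90),
    ("bench",   0.30, 150, 0.95),
]

# The role probabilities describe mutually exclusive outcomes, so they must sum to 1.
assert math.isclose(sum(p for _, p, _, _ in projection), 1.0)

print(f"{'Role':<10}{'P(role)':>10}{'PA offered':>12}{'Availability':>14}{'Expected PA':>13}")
for role, prob, pa_offered, availability in projection:
    pa_in_role = pa_offered * availability  # PA the player would take if he lands here
    print(f"{role:<10}{prob:>10.0%}{pa_offered:>12}{availability:>14.0%}{pa_in_role:>13.1f}")

# Collapsing to a single number: weight each role's PA by the role's probability.
expected_pa = sum(prob * pa_offered * availability
                  for _, prob, pa_offered, availability in projection)
print(f"Probability-weighted PA: {expected_pa:.2f}")
```

The single weighted figure is what a conventional projection reports; the rows above it are the extra information a role-based projection would preserve, and a team could simply read off the row for the role it intends to assign.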

What is the point of this article? To assert that playing time projections are both important and difficult to do? I’m curious where one would find someone to take the other side of this argument.

Indeed, playing time projection is important. That’s clearly stated.

This might be a better approach than using just one model for playing time. Could be useful with a zero-inflated nested model: some model for “playing time skill” nested inside a logit model for “playing time opportunity.” Do you guys want to collaborate on something like that?