Pitching and Defense in the Ivy League

By Mikeal Joseph Parsons

Since Henry Chadwick devised the box score in the mid-nineteenth century, baseball, perhaps more than any other sport, has been the subject of statistical analysis aimed at improving performance. With stats kept for batting average, on-base percentage, earned run average, and slugging percentage, just to name a few, baseball has been a statistician’s dream. With the rise of Sabermetrics in the 1990s, statistics in Major League Baseball took on a new role: predicting the future performance of teams and players, influencing in-game decisions, and shaping the evaluation and recruiting of players. But these metrics have not yet been systematically applied to Ivy League baseball. So I asked: what do the statistics of past Ivy League play tell us about the upcoming season, and in particular, how can my school, Harvard, improve on its sub-.500 record in league play and make a run at the playoffs?

Last season Harvard was 7-13 in conference but had a positive run differential of 21 runs, outscoring its opponents 119 to 98. The three other Ivy League teams with positive run differentials, Penn, Dartmouth, and Columbia, all went 16-4 and were competing in the playoffs at the end of the year. So why the disparity between Harvard and the others? The answer lies in run prevention: each of those three teams allowed 64 runs or fewer, while Harvard allowed 98.

The popularity and success of baseball analytics, as recounted in the popular book and movie Moneyball, lifted it from obscurity to national prominence. But these statistics were gathered by and for, and applied to, baseball at its highest level, the Major Leagues. It was not clear whether the principles of Moneyball (Sabermetrics) could be applied to college baseball, and specifically to the Ivy League. Initial forays into this discipline have proven fascinating, and the importance of pitching is clearly apparent. A common metric for predicting baseball wins is the number of runs better than the league average a team needs to be to gain one win. A perfectly average team would have a winning percentage of .500 and would score and allow exactly the same number of runs in a given season. A team can be better than the league average in runs scored (offense), runs allowed (defense), or both. In the Major Leagues, one additional win has been worth about 10 runs above the average on offense or 10 runs below the average on defense, with only minimal annual fluctuation over the past 5 years. This means that for every 10 additional runs scored by the offense or prevented by the defense, a team will finish about one win further above .500.

In 2015, the average Ivy League team scored 101 runs in league play. However, in the Ivy League (unlike MLB), runs scored and runs allowed did not carry equal weight in producing wins. In 2015, it took 14.5 runs scored above the league average to produce one additional team win (and 29 runs above the league average to produce two additional team wins, etc.). An Ivy League team, however, needed to prevent only 11.5 runs below the league average to produce an additional win.

In the Ivy League, it takes more runs scored than runs prevented to produce a win. Put another way, each run prevented has a greater impact on a team’s win-loss record than each run scored. In 2015 Ivy League play, a win cost roughly 20% fewer runs on defense than on offense (11.5 versus 14.5). Take Ivy League champs and NCAA post-season representative Columbia as an example. Columbia finished 16-4 in league play. Columbia scored 147 runs in 2015 Ivy League play, 46 runs above the average, and allowed 64 runs, 37 runs below the league average. According to the metrics above, those 46 runs were worth approximately 3 wins (46 divided by 14.5 = 3.17 wins). The 37 runs prevented, even though they represent a differential about 20% smaller than the runs scored above average (46), were also worth about three team wins (37 divided by 11.5 = 3.2 wins). This baseball metric models a team that would finish with six team wins above .500, and, in fact, Columbia finished with a 16-4 record (or six wins above a 10-10 team with an equal number of runs scored for and against).
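The Columbia arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's actual worksheet: the function and constant names are hypothetical, and the only inputs are the 2015 figures quoted in the article (a league average of 101 runs, 14.5 runs scored per win, 11.5 runs prevented per win, and a 20-game league schedule).

```python
# Sketch of the runs-per-win arithmetic, using the 2015 figures from the article.
# All names here are illustrative assumptions, not the author's own code.
LEAGUE_AVG_RUNS = 101         # average runs per team in 2015 league play
RUNS_PER_WIN_OFFENSE = 14.5   # runs scored above average per extra win
RUNS_PER_WIN_DEFENSE = 11.5   # runs prevented below average per extra win
LEAGUE_GAMES = 20             # league games per team (e.g. 16-4, 7-13)


def wins_above_500(runs_scored, runs_allowed):
    """Return (offensive wins, defensive wins) relative to a .500 team."""
    offense = (runs_scored - LEAGUE_AVG_RUNS) / RUNS_PER_WIN_OFFENSE
    defense = (LEAGUE_AVG_RUNS - runs_allowed) / RUNS_PER_WIN_DEFENSE
    return offense, defense


def projected_wins(runs_scored, runs_allowed):
    """Projected league wins: a 10-10 baseline plus wins above .500."""
    offense, defense = wins_above_500(runs_scored, runs_allowed)
    return LEAGUE_GAMES / 2 + offense + defense


columbia_offense, columbia_defense = wins_above_500(147, 64)
harvard_wins = projected_wins(119, 98)

print(f"Columbia offense: {columbia_offense:.2f} wins above .500")
print(f"Columbia defense: {columbia_defense:.2f} wins above .500")
print(f"Harvard projected wins: {harvard_wins:.1f} of {LEAGUE_GAMES}")
```

Run on Columbia's 147 runs scored and 64 allowed, the sketch gives roughly three wins from each side of the ball; run on Harvard's 119 and 98, it gives a projection between 11 and 12 wins.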

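| Team | Year | Wins | Losses | Runs Scored | Runs Allowed | Score Ratio | Predicted Win % | Actual Win % | Abs. Error |
|---|---|---|---|---|---|---|---|---|---|
| Harvard | 2015 | 7 | 13 | 119 | 98 | 1.214 | 0.579 | 0.350 | 0.229 |
| Dartmouth | 2015 | 16 | 4 | 103 | 63 | 1.635 | 0.692 | 0.800 | 0.108 |
| Yale | 2015 | 6 | 14 | 73 | 153 | 0.477 | 0.228 | 0.300 | 0.072 |
| Cornell | 2015 | 9 | 11 | 103 | 119 | 0.866 | 0.441 | 0.450 | 0.009 |
| Princeton | 2015 | 4 | 16 | 49 | 110 | 0.445 | 0.208 | 0.200 | 0.008 |
| Columbia | 2015 | 16 | 4 | 147 | 64 | 2.297 | 0.798 | 0.800 | 0.002 |
| Brown | 2015 | 6 | 14 | 89 | 148 | 0.601 | 0.302 | 0.300 | 0.002 |
| Penn | 2015 | 16 | 4 | 125 | 54 | 2.315 | 0.800 | 0.800 | 0.000 |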
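The article does not say how the Predicted Win % column was computed. One reading that reproduces the published values is a Pythagorean-style expectation, ratio^k / (1 + ratio^k), where ratio is runs scored divided by runs allowed. Both the choice of formula and the exponent k = 1.65 are assumptions inferred from the table's numbers, not something the author states, so the sketch below should be read as a plausible reconstruction only.

```python
# Assumption: Predicted Win % = ratio**k / (1 + ratio**k), ratio = RS / RA.
# The exponent 1.65 is inferred from the published table, not given by the author.
ASSUMED_EXPONENT = 1.65

# (team, wins, losses, runs scored, runs allowed, predicted win % as published)
TEAMS_2015 = [
    ("Harvard", 7, 13, 119, 98, 0.579),
    ("Dartmouth", 16, 4, 103, 63, 0.692),
    ("Yale", 6, 14, 73, 153, 0.228),
    ("Cornell", 9, 11, 103, 119, 0.441),
    ("Princeton", 4, 16, 49, 110, 0.208),
    ("Columbia", 16, 4, 147, 64, 0.798),
    ("Brown", 6, 14, 89, 148, 0.302),
    ("Penn", 16, 4, 125, 54, 0.800),
]


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=ASSUMED_EXPONENT):
    """Expected winning percentage from the score ratio."""
    ratio = runs_scored / runs_allowed
    return ratio ** exponent / (1 + ratio ** exponent)


for team, wins, losses, scored, allowed, published in TEAMS_2015:
    predicted = pythagorean_win_pct(scored, allowed)
    actual = wins / (wins + losses)
    print(
        f"{team:<10} predicted {predicted:.3f}  published {published:.3f}  "
        f"actual {actual:.3f}  gap {abs(predicted - actual):.3f}"
    )
```

Under these assumptions the last column of the table is simply the absolute gap between predicted and actual winning percentage, which is why Harvard's 0.229 stands out.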

An even starker contrast to Harvard can be seen in Harvard’s division rival, Dartmouth. Dartmouth scored 103 runs, about the league average, but allowed only 63, or 38 below the average and the second-fewest in the league. By the 2015 metric, those figures are worth about 3.4 wins above .500 (2 divided by 14.5, plus 38 divided by 11.5), nearly all of them from run prevention. At 16-4, Dartmouth finished even better than that projection, and did so with an average offense.

As noted above, Harvard went 7-13 in conference despite outscoring its opponents 119 to 98, a positive run differential of 21 runs. That differential should have produced a record of 11-9 or perhaps 12-8. So even by the Ivy League metric, Harvard’s record is an anomaly, since no team should be below .500 with a positive 21-run differential. It should be noted, however, that the predicted winning percentage came within one percentage point of the actual winning percentage for five of the other seven Ivy League teams; only Dartmouth (.108) and Yale (.072) missed by more, and neither came close to Harvard’s gap of .229. No team in the last 5 seasons of Ivy League play has had the same disparity between predicted and actual win-loss record as Harvard. This anomaly therefore highlights the importance of pitching, and it partially explains why Harvard’s record was so different from that of the league champs, Columbia.

To show that 2015 as a whole was not an outlier year, note that in each of the last 5 years, the three teams that allowed the fewest runs finished in the top three in the Ivy League. Applying the same metrics to each of the past 5 seasons shows that, each year, pitching and defense were a more effective way of winning games than hitting and scoring runs. The 5-year average was 12 runs scored above the league average for one win and 10 runs prevented below the average for one win. Of course, a team has to score runs to win games, but over the past five years of Ivy League competition, preventing runs did more for team wins than scoring them.
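One simple way to compare the 2015 figures with the 5-year average is to ask how many fewer runs a win costs on defense than on offense. The sketch below assumes that comparison, which is an illustrative choice and not necessarily the author's, and uses only the runs-per-win figures quoted in the article; the function name is hypothetical.

```python
# Compare the cost of a win on defense versus offense.
# Inputs are the runs-per-win figures quoted in the article; the helper
# name and the choice of comparison are illustrative assumptions.
def fewer_runs_per_win(offense_rate, defense_rate):
    """Fraction fewer runs a win costs on defense than on offense."""
    return 1 - defense_rate / offense_rate


season_2015 = fewer_runs_per_win(14.5, 11.5)   # 2015 Ivy League play
five_year = fewer_runs_per_win(12, 10)         # 5-year Ivy League average

print(f"2015:   a win cost {season_2015:.1%} fewer runs on defense")
print(f"5-year: a win cost {five_year:.1%} fewer runs on defense")
```

By this measure the defensive edge in 2015 was somewhat larger than the 5-year norm, but it points the same way in both cases.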

Harvard, like any other team with Ivy League title aspirations, will need a strong pitching staff able to temper the success of Ivy League hitters. That is a difficult task given the four games in two days customary in Ivy League conference competition.

About the author

harvardsports


2 Comments

  • Fair amount of this just has to do with small sample size, too. Even over 162 in MLB, there’s generally a big “luck” variable that doesn’t even out; over just 20 in Ivy League, that’s even more pronounced.

    Harvard scored 56 of its 119 runs (47%) in just 4 games last year, all of them wins. Absent those blowout wins, their Pyth stats are much more pedestrian. Yes, pitching/defense were important (as they always are), but the feast-or-famine offense was at least as much to blame.

    • Absolutely, but that speaks more to how the variables are applied. Currently, applying the algorithms has the 2015 Harvard team as the least consistent with this defense theory of any Ivy League team in the last 5 years. If you were able to account for these anomalies, Harvard would fall much more in line with the theory that preventing runs is more effective than scoring them, at least in the Ivy League.
