Difficulties Associated with Preseason Projections
Now that we have shared our preseason predictions in a series of previews, Patriot comes along with a bucket of cold water for our hot stove: a piece that discusses some of the inherent limitations of such forecasts.
The annual ritual of preseason prognostication of the final standings has been a source of enjoyment for baseball fans for many years. The rise of sabermetrics has infused greater rigor into the exercise, but even sabermetrically inclined fans often misinterpret preseason team forecasts. Here are a few thoughts to keep in mind as prediction season kicks into high gear:
- Many events are essentially random from the preseason perspective—There are many factors that go into deciding a pennant race that are next to impossible to predict. The most obvious is key injuries. While some players can be assumed to have an elevated risk for injury, based on either personal history (we’re looking at you, Jose Reyes) or the nature of their role (pitchers in general), and thus should not be penciled in for a full season of playing time when developing team forecasts, it’s unreasonable to expect to predict the bulk of injuries before they occur. Less obvious events that can wreak havoc on expectations include PED suspensions, shifts in management priorities, and the sudden emergence of prospects. Analysts and fans trying to project Cleveland’s win total last year likely did not pay much attention to Danny Salazar, and for good reason, as he had pitched just 88 innings in the minors in 2012. But his performance down the stretch was crucial to the Indians earning a wildcard spot.
- If individual performance and playing time could be predicted perfectly, team win forecasts would be subject to a significant amount of random variation—If you take one thing away from this piece, let it be this: Even if one could perfectly estimate individual performance and playing time (and which teams would actually benefit from that performance), the standard error for preseason forecasts of team wins would be in the neighborhood of four games. This can be demonstrated by reviewing the accuracy of winning percentage estimators (such as the Pythagorean Theorem). When these models are given the exact number of runs scored and allowed by each team as inputs to estimate win-loss record, the standard error of these estimates is about four wins. For example, a team that scores 700 runs and allows 650 runs “should” win about 87 games, but 32% of the time we would expect fewer than 83 wins or more than 91.
Given the narrow margins by which playoff races are often determined (six of the ten playoff spots were decided by four games or fewer in 2013), random variation of this type would be likely to result in misidentification of playoff teams, even assuming the existence of impossibly accurate player forecasts.
Keep in mind that this thought exercise has artificially reduced the prediction error by assuming that with perfect forecasts for individuals we would have precise totals of team runs scored and allowed. However, just as wins do not always arise as expected from runs and runs allowed, neither do runs from their components, meaning that “perfect” player projections would result in slightly higher errors than assumed here.
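The arithmetic behind the example above can be sketched in a few lines. This uses the classic Pythagorean estimator with an exponent of 2 (refinements such as Pythagenpat exist, but the simple form suffices here), and treats the roughly four-win standard error from the text as a given:

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2):
    """Estimate wins from runs scored/allowed via the Pythagorean estimator."""
    wpct = runs_scored**exponent / (runs_scored**exponent + runs_allowed**exponent)
    return games * wpct

expected = pythagorean_wins(700, 650)
# With a standard error of ~4 wins, roughly 32% of outcomes fall
# outside a one-standard-error band around the estimate.
low, high = expected - 4, expected + 4
print(round(expected), round(low), round(high))  # 87 83 91
```

So even with run totals known exactly, the 700-for/650-against team lands outside the 83–91 win band about a third of the time.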
- The format in which predictions are presented can disguise uncertainty, even if the person making the prediction recognizes the limitation of their estimates—A common format for predictions is picking the order of divisional standings. Picking a team first does not necessarily mean that one believes they are likely to win, though. In a very competitive division, it is not at all difficult to imagine a scenario in which the team most likely to win has an estimated probability of finishing first of well under 50%. Suppose we were able to determine that the best estimate probabilities for winning the AL West this year were Texas 35%, Oakland 30%, Los Angeles 20%, Seattle 13%, and Houston 2%. An ordinal standing prediction would predict the Rangers to win the division, even though we believe there is a 65% chance that Texas does not win the division.
This is not an issue as long as it is understood that we are predicting that Texas is the team most likely to win the division, but that nuance is sometimes lost on consumers of predictions. The problem gets worse when dealing with predictions for larger pools of teams, such as picking the pennant or World Series winner, in which case no team ever has a true best-estimate probability of victory exceeding 50%.
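The gap between an ordinal pick and the underlying probabilities can be made concrete with the hypothetical AL West numbers from the example above:

```python
# Hypothetical division-win probabilities from the example in the text.
probs = {"Texas": 0.35, "Oakland": 0.30, "Los Angeles": 0.20,
         "Seattle": 0.13, "Houston": 0.02}

# The ordinal prediction is simply the team with the highest probability...
favorite = max(probs, key=probs.get)

# ...even though that team is more likely than not to miss.
p_favorite_misses = 1 - probs[favorite]

print(favorite, p_favorite_misses)
```

The printed pick is Texas, despite a 65% chance that Texas does not win the division, which is exactly the nuance an ordinal standings prediction hides.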
- Minimizing error requires narrower ranges of team records than will actually be observed—Many fans’ initial reaction to a set of forecasted standings is to point out that the spread in forecasted wins is unrealistically small. If the forecasts are constructed to minimize the standard error when predicting actual wins, these critics will be correct in one sense: the standard deviation of forecasted wins will be less than that of actual major league seasons. This is a necessary reflection of the uncertainty that surrounds such predictions.
Teams that win or lose a very large number of games tend to be teams that were projected to be very good (or bad) but that also benefited from positive (or negative) random variation/chance/luck/whichever term is least offensive to your sensibilities. A 100-win season usually reflects a team with a somewhat lower true win expectation that also caught some breaks; only if many teams genuinely had a true win expectation of 100 would we expect such totals, and more, to show up routinely.
We also know that some teams will either exceed or underperform their projections. As discussed earlier, even with perfect information regarding player performance there would still be a non-negligible level of error in forecasts. But we don’t know who those teams will be. One could artificially increase the win projections of the best projected teams and reduce those of the worst projected teams to create a set of expected standings that more closely resembles the typical actual spread at the end of the season—but doing so would increase the standard error of team projections.
That approach, even if not applied consciously or systematically, seems to be typical of mainstream writers. Many picked Washington to win 100 games in 2013, coming off of their major league-leading 98 wins in 2012. While it would be foolish to set a hard and fast rule, a good guideline is to never predict that a given team will win 100 games.
Suppose a team is coming off of a season in which they set the league record for regular season wins and proceeded to go 11-2 en route to winning the World Series for the second time in three years. Said team also had a string of five consecutive playoff appearances and had acquired the defending back-to-back Cy Young Award winner in the offseason. Surely this team would be a safe bet to win 100 games?
One would think so, but they didn’t. The 1999 Yankees were a great team and won another World Series, but they won only 98 regular season games (also, I cheated a bit by counting 1994 as a playoff appearance, but they had the best record in the AL at the time of the strike with a 6.5-game divisional lead). Anecdotes do not prove the rule, of course, but perhaps they can provide caution where cold clinical terms like “regression to the mean” and “random variation” fail to do so.
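The spread argument above can be illustrated with a small simulation. This is a minimal sketch under assumed parameters: a hypothetical 30-team league whose true win expectations are spread with a standard deviation of about eight wins, with each game treated as an independent coin flip at the team’s true talent level. The point is only the qualitative comparison, not the specific numbers:

```python
import random
import statistics

random.seed(42)
GAMES, SEASONS = 162, 200

# Hypothetical true talent: win expectations centered on 81 wins
# with a standard deviation of ~8 wins (an assumption for illustration).
talents = [random.gauss(81, 8) / GAMES for _ in range(30)]

# Simulate many seasons; each game is a Bernoulli trial at the team's talent.
observed_sds = []
for _ in range(SEASONS):
    wins = [sum(random.random() < t for _ in range(GAMES)) for t in talents]
    observed_sds.append(statistics.pstdev(wins))

talent_sd = statistics.pstdev([t * GAMES for t in talents])
print(f"SD of true win expectations: {talent_sd:.1f}")
print(f"Average SD of observed wins: {statistics.mean(observed_sds):.1f}")
```

The observed spread of wins comes out consistently wider than the spread of true expectations, because binomial noise stacks on top of talent differences. A forecast that minimizes error should look like the narrow talent distribution, even though the final standings will look like the wide observed one.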