Wednesday, March 25, 2009

Statistical Modeling For Fantasy Baseball

The approach I'll take initially with Josh and the other participants will be to share the daily projections generated from my statistical model, and let them know in general terms what factors the model is taking into account. I'll probably wait to 'open the kimono' and share the actual spreadsheet where the statistical model is implemented until we've worked together for a few weeks, since I'll be sacrificing some of my own potential profit playing fantasy sports each time I share it with someone. I want to know that the people I share it with have the motivation and ability to help me optimize the model.

Since I think it's tough to make speculative decisions using a model you don't have confidence in, one of the first projects we'll work on is likely to be quantifying the average error of my existing model's predictions each day, relative to other means of projecting daily performance. With hundreds of players in action every day, it should only take a week or two of data to get some pretty meaningful results. We can also supplement those results with backtesting using data from last year.
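To make "average error" concrete, here is a minimal sketch in Python of one way to compare two sets of projections against what actually happened. It uses mean absolute error, which is my choice for illustration and not necessarily the measure we'll settle on. The function name and all the numbers are made up for the example.

```python
from statistics import mean

def mean_absolute_error(projected, actual):
    """Average of |projection - actual| over matching player-days."""
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two non-empty lists of equal length")
    return mean(abs(p - a) for p, a in zip(projected, actual))

# Hypothetical fantasy points for four player-days (made-up numbers).
actual = [4.0, 0.0, 7.0, 2.0]
model_projection = [3.0, 1.0, 5.0, 2.0]
# Stand-in for "other means of projecting", e.g. a flat average.
baseline_projection = [2.0, 2.0, 2.0, 2.0]

model_mae = mean_absolute_error(model_projection, actual)
baseline_mae = mean_absolute_error(baseline_projection, actual)

print(f"model MAE:    {model_mae:.2f}")
print(f"baseline MAE: {baseline_mae:.2f}")
```

The same function would work unchanged on last year's data for backtesting, as long as each projection is paired with the actual result for the same player and day.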

Once the Draftbug Millionaires are comfortable that the model is providing useful projections, they'll be able to work on improving specific aspects of the model. There are MANY areas where it can be improved. Despite its being (apparently) much better than what anyone else was using for Rotohog last year, it really is painfully incomplete.

One example of how the participants could help improve the projections: right now, when I evaluate 'platoon advantage' (the benefit that hitters receive if they hit with the opposite handedness of the pitcher they're facing), I'm just making the same adjustment regardless of who the hitter or pitcher is. While that's accurate enough for hitters (since they all have similar platoon advantage/disadvantage over time), it's very inaccurate for pitchers, who do have long-term, sustainable differences in the degree of their platoon advantage. Maybe Josh will be the one to figure out how to incorporate this into the model. I can think of at least 20 or 30 other, similar improvements that can be made. How do we project a pitcher's innings pitched for a specific game? Can we track bullpen fatigue and reliever availability? Should we incorporate weather?
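As a rough illustration of the platoon idea, here is a sketch in Python of a pitcher-specific adjustment that falls back to a single uniform adjustment when nothing is known about the pitcher. This is not how the spreadsheet works. The multiplicative form, the 5% default, the pitcher names and their values are all assumptions invented for the example, and switch hitters are ignored.

```python
# Hypothetical uniform adjustment applied when the pitcher is unknown.
DEFAULT_PLATOON_BOOST = 0.05

# Hypothetical per-pitcher adjustments (made-up names and values).
PITCHER_PLATOON_BOOST = {"Pitcher A": 0.10, "Pitcher B": 0.02}

def platoon_adjusted(projection, batter_hand, pitcher_hand, pitcher=None):
    """Scale a hitter's projection up for an opposite-handed matchup
    and down for a same-handed one, by a pitcher-specific amount."""
    boost = PITCHER_PLATOON_BOOST.get(pitcher, DEFAULT_PLATOON_BOOST)
    same_hand = batter_hand == pitcher_hand
    factor = 1 - boost if same_hand else 1 + boost
    return projection * factor

# Left-handed hitter vs. an unknown righty: uniform adjustment.
uniform = platoon_adjusted(10.0, "L", "R")
# Same hitter vs. a righty with a large platoon split.
big_split = platoon_adjusted(10.0, "L", "R", pitcher="Pitcher A")
# Right-handed hitter vs. a righty with a small platoon split.
small_split = platoon_adjusted(10.0, "R", "R", pitcher="Pitcher B")
```

The hard part, of course, is estimating each pitcher's number from limited data, which this sketch does not attempt.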

1 comment:

  1. All good questions. My first thought when I considered where to start for a system was platoon splits. Can't wait to get to work on this.
