All,
I have a table that looks like this:
Date     Pitcher        WHIP
-------- -------------- -----
7/4/12   JACKSON, E     1.129
7/4/12   YOUNG, C       1.400
7/4/12   CORREIA, K     1.301
7/4/12   WOLF, R        1.594
...
6/28/12  JACKSON, E     1.137
6/27/12  YOUNG, C       1.750
...
6/19/12  JACKSON, E     1.215
6/17/12  YOUNG, C       1.851
I've set up a SQLFiddle here: http://sqlfiddle.com/#!2/addfe/1
In other words, the table lists the starting pitcher for every game of the MLB season, along with that pitcher's current WHIP (WHIP is a measure of the pitcher's performance).
What I'd like to obtain from my query is this: how much has that pitcher's WHIP changed in the last 30 days?
Or, more precisely, how much has that pitcher's WHIP changed since his most recent start that was at least 30 days ago?
So, for example, if E. Jackson's WHIP on 7/4/12 was 1.129, and his WHIP on 6/3/12 was 1.500, then I'd like to know that his WHIP changed by -0.371.
This is easy to figure out for any individual, but I want to calculate that for all pitchers, on all dates.
One of the things that makes this tricky is that there isn't data for every date. For example, if E. Jackson pitched on 7/4/12, the most recent start that's at least 30 days ago might be on 5/28/2012.
However, for K. Correia, who also pitched on 7/4/12, the most recent start that's at least 30 days ago might be 5/26/2012.
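To make the lookup I'm describing concrete, here is a rough procedural sketch in Python of the result I'm after (not SQL, and not how my data is actually stored). The function name `whip_changes`, the tuple layout, and the sample rows are made up for illustration; the only values taken from this post are the E. Jackson figures from the example above. It treats "at least 30 days ago" as "on or before the date 30 days earlier".

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date, timedelta


def whip_changes(rows, days=30):
    """For each (day, pitcher, whip) start, find that pitcher's most recent
    start at least `days` days earlier and report the change in WHIP.

    Returns a list of (day, pitcher, whip, earlier_whip, change) tuples;
    earlier_whip and change are None when no such earlier start exists.
    """
    # Group each pitcher's starts, ordered by date.
    by_pitcher = defaultdict(list)
    for day, pitcher, whip in sorted(rows):
        by_pitcher[pitcher].append((day, whip))

    results = []
    for pitcher, starts in by_pitcher.items():
        start_days = [d for d, _ in starts]
        for day, whip in starts:
            cutoff = day - timedelta(days=days)
            # Number of this pitcher's starts on or before the cutoff date.
            i = bisect_right(start_days, cutoff)
            earlier = starts[i - 1][1] if i else None
            change = None if earlier is None else round(whip - earlier, 3)
            results.append((day, pitcher, whip, earlier, change))
    return results


# Hypothetical sample rows (illustrative only).
sample_rows = [
    (date(2012, 7, 4), "JACKSON, E", 1.129),
    (date(2012, 6, 28), "JACKSON, E", 1.137),
    (date(2012, 6, 3), "JACKSON, E", 1.500),
    (date(2012, 7, 4), "YOUNG, C", 1.400),
    (date(2012, 6, 27), "YOUNG, C", 1.750),
]

for row in whip_changes(sample_rows):
    print(row)
```

With these rows, E. Jackson's 7/4/12 start is matched to his 6/3/12 start, while starts with nothing at least 30 days earlier come back with no earlier value. What I need is the SQL equivalent of this.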
I'm assuming that I need to join the table to itself, but I'm not sure how to do it.
Here's my first stab:
select
    t1.home_pitcher,
    t1.date,
    t1.All_starts_whip,
    t2.All_starts_whip
from
    mlb_data t1
join
    mlb_data t2
on
    t1.home_pitcher = t2.home_pitcher
    and t2.date = (
        select max(date)
        from mlb_data t3
        where t3.home_pitcher = t1.home_pitcher
          and t3.date < date_sub(t1.date, interval 1 month)
    )
This seems to work (and hopefully illustrates what I'm trying to capture), but it takes HORRENDOUSLY long: my table goes back a few seasons and has about 6,250 rows, and this query took 7,289 seconds (yes, that's correct, more than 2 hours). I'm sure this is a classic case of the absolute worst way to write a query.
[UPDATE] Some clarification...
The query should produce a value for EACH pitcher for EACH start.
In other words, if E. Jackson pitched in 10 games, he'd be listed in the result set 10 times.
Date     Pitcher        WHIP  WHIP_30d_ago
-------- -------------- ----- ------------
7/4/12   JACKSON, E     1.129 1.111
...
5/18/12  JACKSON, E     1.111 2.222
...
4/14/12  JACKSON, E     2.222 3.333
Put differently, for each start I'm looking for the pitcher's WHIP as of his most recent start at least 30 days earlier.
Many thanks in advance!