I need to sum variables in two data sets and join the results. I would like to do this in one SQL statement, but the join is one-to-many. I am interested to learn whether a summary variable can be created, for lack of a better description, inside a SELECT statement.
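The code below calculates the summary variable for HOURS incorrectly: INTERVAL has only one record per name/date, but DETAIL has several, so each HOURS value is counted once for every matching DETAIL record. With the sample data, John comes out with 32 hours instead of 16.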
I could certainly write multiple steps to accomplish this, but I wanted to see whether it can be done in one SQL step. Thanks.
Sample Code:
data Detail;
Length Name CallType $25;
input date mmddyy10. name $ calltype $ count;
Format date mmddyy10.;
datalines;
05/01/2014 John Order 5
05/01/2014 John Complaint 6
05/01/2014 Mary Order 7
05/01/2014 Mary Complaint 8
05/01/2014 Joe Order 4
05/01/2014 Joe Complaint 2
05/01/2014 Joe Internal 2
05/02/2014 John Order 6
05/02/2014 John Complaint 4
05/02/2014 Mary Order 9
05/02/2014 Mary Complaint 7
05/02/2014 Joe Order 3
05/02/2014 Joe Complaint 1
05/02/2014 Joe Internal 3
;
data Interval;
Length Name $25;
input date mmddyy10. name $ hours;
Format date mmddyy10.;
datalines;
05/01/2014 John 8
05/01/2014 Mary 6
05/01/2014 Joe 4
05/02/2014 John 8
05/02/2014 Mary 6
05/02/2014 Joe 4
;
PROC SQL noprint feedback;
  CREATE TABLE SUMMARY AS
  SELECT
      D.Name
    , Sum(D.Count) as Count
    , Sum(I.Hours) as Hours
  FROM Detail D, Interval I
  WHERE D.Name=I.Name and D.Date=I.Date
  GROUP BY D.Name
  ORDER BY D.Name;
QUIT;
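
One possible way to keep this in a single PROC SQL step is to sum DETAIL inside an inline view (a subquery in the FROM clause), so that each name/date is already down to one record before it is joined to INTERVAL. The sketch below assumes every name/date in INTERVAL also appears in DETAIL, as it does in the sample data; DayCount is just an illustrative name for the intermediate sum and is not part of the original code.

```sas
proc sql;
  create table Summary as
  select
      I.Name
    , sum(D.DayCount) as Count   /* total of the per-day sums from Detail */
    , sum(I.Hours)    as Hours   /* each Interval record now counted once */
  from Interval I
       inner join
       ( /* inline view: collapse Detail to one record per name/date */
         select Name, Date, sum(Count) as DayCount   /* DayCount: illustrative name */
         from Detail
         group by Name, Date
       ) D
       on D.Name = I.Name and D.Date = I.Date
  group by I.Name
  order by I.Name;
quit;
```

Because the join is one-to-one after the inline view, HOURS is no longer multiplied by the number of call types. For the sample data this should give Joe 15 / 8, John 21 / 16 and Mary 31 / 12 (Count / Hours). If some name/date pairs could be missing from DETAIL, a left join from INTERVAL would probably be the safer choice.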