
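ESN is an id column. Each esn has multiple observations, so esn values repeat. For a given esn, I want to find the earliest service start date (and call it first), and I want to find the correct end date (and call it last). The if/then logic for choosing "last" is correct, but running the code below produces the following error: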

340        first = min(of start(*));
               ---
               71
ERROR 71-185: The MIN function call does not have enough arguments.

Here is the code I used:

data three_1; /*first and last date created ?? used to ignore ? in data*/
set three;
format first  MMDDYY10. last  MMDDYY10.;
by esn;
array start(*)  service_start_date;
array stop(*) service_end_date entry_date_est ;
do i=1 to dim(start);
  first = min(of start(*));
end;
do i=1 to dim(stop);
  if esn_status = 'Cancelled' then last = min(input(service_end_date, MMDDYY10.), input(entry_date_est, MMDDYY10.));
  else last = max(input(service_end_date, MMDDYY10.), input(entry_date_est, MMDDYY10.));
end;
run;

esn  service_start_date  service_end_date  entry_date_est  esn_status
1    10/12/2010          01/01/2100        10/12/2012      Cancelled
1    05/02/2009          02/12/2010        10/09/2012      Cancelled
1    04/05/2011          03/04/2100        10/02/2012      Cancelled

The result should be first=05/02/2009 and last=10/12/2012.


2 Answers


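Arrays and functions such as min() and max() operate horizontally, across the variables of a single row of a data set, not vertically across multiple records.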
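This also most likely explains the error in the question: the array start there has only one element (service_start_date), so min(of start(*)) passes a single argument, and MIN requires at least two. A minimal sketch of the row-wise behaviour, using made-up values for one observation (the variable earliest and the literal dates are illustrative only):

```sas
data _null_;
  /* hypothetical values for a single observation */
  service_end_date = '12FEB2010'd;
  entry_date_est   = '09OCT2012'd;

  /* two elements, so MIN receives two arguments */
  array stop(*) service_end_date entry_date_est;

  /* compares the two variables within this one row only */
  earliest = min(of stop(*));
  put earliest= mmddyy10.;
run;
```

The comparison never looks at other rows, which is why a vertical minimum per esn needs BY-group processing instead.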

Assuming esn_status is constant for a given esn, you need to sort the input by esn and service_start_date. You can then use a data step to collect the required values.

data three; /*thanks Joe for the data step to create the example data*/
length esn_status $10;
format service_start_date service_end_date entry_date_est MMDDYY10.;
input esn (service_start_date service_end_date entry_date_est) (:mmddyy10.) esn_status $;
datalines;
1 10/12/2010 01/01/2100 10/12/2012 cancelled
1 05/02/2009 02/12/2010 10/09/2012 cancelled
1 04/05/2011 03/04/2100 10/02/2012 cancelled
;;;;
run;

proc sort data=three;
by esn service_start_date;
run;

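data three_1(keep=esn esn_status start last);
set three;
format start last date9.;
by esn;
retain start last; /* carry values across the rows of each esn */
if first.esn then do;
    /* input is sorted by service_start_date, so the first row holds the earliest start */
    start = service_start_date;
    last = service_end_date;
end;

/* cancelled: keep the earliest end/entry date seen so far; otherwise keep the latest */
if esn_status = "cancelled" then
    last = min(last,service_end_date,entry_date_est);
else
    last = max(last,service_end_date,entry_date_est);

/* write one row per esn */
if last.esn then
    output;
run;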
answered 2013-09-07T22:33:19.200

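A DoW loop will get you where you want to go, or you can do this in SQL. As far as I can tell, your desired result does not match what your logic actually produces, so you may need to make some adjustments. For the non-cancelled records you would need a second WANT data set; I don't think there is an easy way to handle both in one data step.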

data have;
length esn_status $10;
format service_start_date service_end_date entry_date_est MMDDYY10.;
input esn (service_start_date service_end_date entry_date_est) (:mmddyy10.) esn_status $;
datalines;
1 10/12/2010 01/01/2100 10/12/2012 cancelled
1 05/02/2009 02/12/2010 10/09/2012 cancelled
1 04/05/2011 03/04/2100 10/02/2012 cancelled
;;;;
run;
data want_cancelled;
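first = 99999; /* sentinel later than any date in the sample data */
last = 99999;
do _n_ = 1 by 1 until (last.esn); /* DoW loop: reads all rows of one esn per data step iteration */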
 set have(where=(esn_status='cancelled'));
 by esn;
 first = min(first,service_start_date);
 last = min(last,service_end_date,entry_date_est);
end;
output;
keep first last esn;
format first last mmddyy10.;
run;
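
For the SQL route mentioned above, a rough sketch might look like the following. It assumes esn_status is constant within each esn (so it can go in the GROUP BY) and that any status other than 'cancelled' should take the latest date; the table name want_sql is made up for illustration. In PROC SQL, min() with one argument aggregates over the group, while min() with two arguments compares values within a row.

```sas
proc sql;
  create table want_sql as
  select esn,
         /* earliest start date across all rows of the esn */
         min(service_start_date) as first format=mmddyy10.,
         case
           /* cancelled: earliest of all end/entry dates in the group */
           when esn_status = 'cancelled'
             then min(min(service_end_date, entry_date_est))
           /* assumption: any other status takes the latest date */
           else max(max(service_end_date, entry_date_est))
         end as last format=mmddyy10.
  from have
  group by esn, esn_status;
quit;
```

On the sample data this rule gives last=02/12/2010 rather than the 10/12/2012 the question expects, so the selection rule itself may still need adjusting.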
answered 2013-09-07T15:11:13.537