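I have a dataset in which patients can have multiple values (an unknown number of them) for some variables, so it ends up looking like this: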

    ID   Var1   Var2   Var3   Var4
    1    Blue   Female 17     908
    1    Blue   Female 17     909
    1    Red    Female 17     910
    1    Red    Female 17     911
    ...
    99   Blue   Female 14     908
    100  Red    Male   28     911

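I would like to collapse this data so that there is only one entry per ID, with an indicator showing whether a given value was present among that ID's original entries. For example, something like this: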

    ID   YesBlue   Var2      Var3   Yes911
    1    1         Female    17     1
    99   1         Female    14     0
    100  0         Male      28     1

Is there a straightforward way to do this in SAS? Or, failing that, in Access (where the data comes from), which I don't know how to use?

4 Answers

If your dataset is called PATIENTS1, perhaps something like this:

proc sql noprint;
  create table patients2 as
  select *
        ,case(var1)
           when "Blue" then 1
           else 0
         end as ablue
        ,case(var4)
           when 911 then 1
           else 0
         end as a911
        ,max(calculated ablue) as yesblue
        ,max(calculated a911) as yes911
  from patients1
  group by id
  order by id;
quit;

proc sort data=patients2 out=patients3(drop=var1 var4 ablue a911) nodupkey;
  by id;
run;

answered 2013-01-09T09:32:17.450

Here is a data step solution. I'm assuming that the values of Var2 and Var3 are always the same for a given ID.

data have;
input ID Var1 $ Var2 $ Var3 Var4;
cards;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

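data want (drop=Var1 Var4 _:);  /* _: drops every variable whose name starts with an underscore */
set have;
by ID;
/* Reset the counters at the start of each ID group. */
if first.ID then do;
    _blue=0;
    _911=0;
end;
/* Sum statements keep their values across rows; each comparison yields 1 or 0. */
_blue+(Var1='Blue');
_911+(Var4=911);
/* On the last row of each ID, turn the counts into 0/1 flags and write one row. */
if last.ID then do;
    YesBlue=(_blue>0);
    Yes911=(_911>0);
    output;
end;
run;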
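
One point worth spelling out: the BY statement in the data step above expects the input to be in ID order, and SAS stops with an error if the BY variable is not properly sorted. The sample data already are in ID order. If the real data might not be, a minimal sketch of a sort step to run before the data step, assuming the dataset is named `have` as in this answer:

```sas
/* Sort by ID so that first.ID and last.ID behave as expected in the data step. */
proc sort data=have;
  by ID;
run;
```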

answered 2013-01-09T09:39:55.060

Edit: Looks like this is the same as what Keith posted, just written differently.

This should do it:

data test;
input id Var1 $ Var2 $ Var3 Var4;
datalines;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

data flatten(drop=Var1 Var4);
set test;
retain YesBlue;
retain Yes911;
by id;

if first.id then do;
  YesBlue = 0;
  Yes911 = 0;
end;

if Var1 eq "Blue" then YesBlue = 1;
if Var4 eq 911 then Yes911 = 1;

if last.id then output;
run;

answered 2013-01-09T09:44:50.427

PROC SQL is well suited to this sort of thing. This is similar to DavB's answer, but eliminates the extra sort:

data have;
input ID Var1 $ Var2 $ Var3 Var4;
cards;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

proc sql;
  create table want as
  select ID
       , max(case(var1)
               when 'Blue'
               then 1
               else 0 end) as YesBlue
       , max(var2)         as Var2
       , max(var3)         as Var3
       , max(case(var4)
               when 911
               then 1
               else 0 end) as Yes911
  from have
  group by id
  order by id;
quit;

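It also safely reduces your original data to one row per ID, but it can give misleading results if the source is not exactly as you describe. For example, if Var2 or Var3 varies within an ID, MAX silently keeps only the largest value.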
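
One way to check that assumption before collapsing is to list any IDs whose Var2 or Var3 take more than one distinct value. This is only a sketch, assuming the dataset is named `have` as above; the column names `n_var2` and `n_var3` are made up for illustration. Note that COUNT(DISTINCT ...) ignores missing values, so an ID with one value plus missing entries would not be flagged.

```sas
/* List IDs where Var2 or Var3 is not constant within the ID. */
proc sql;
  select ID
       , count(distinct Var2) as n_var2
       , count(distinct Var3) as n_var3
  from have
  group by ID
  having count(distinct Var2) > 1
      or count(distinct Var3) > 1;
quit;
```

If the query returns no rows, taking MAX of Var2 and Var3 within each ID simply returns the single value that ID has.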

answered 2013-01-09T14:14:58.167