
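I want to rewrite my data step code as a proc report.

1) Data set One has the fields (name_id test_var).

2) test_var can be positive, negative, or zero.

3) As a result, I want a table like this:

name_chr test_var_pos test_var_neg test_var_zero

/* data here, grouped by name_chr */

TOTAL: SUM_POS SUM_NEG SUM_ZERO <- above all, I want this TOTAL row

4) Additionally: I have a dictionary name_id => name_chr.

5) PS: name_id contains dots!!! (I want them in the result table.)

6)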

data result(keep=name_chr test_var_pos test_var_neg test_var_zero);
  retain name_chr "";
  retain test_var_pos  0;
  retain test_var_neg  0;
  retain test_var_zero 0;
  set One;
  by name_id;  /*already sorted by name_id*/
  if(FIRST.name_id) then do;
    name_chr="";
    test_var_pos = 0;
    test_var_neg = 0;
    test_var_zero = 0;
  end;
  else do;
    if(test_var>0) then test_var_pos=test_var_pos + test_var;
    if(test_var<0) then test_var_neg=test_var_neg + abs(test_var);
    if(test_var=0) then test_var_zero=test_var_zero + 1; /*as example, test_var_zero is    a count*/

  end;
 if (LAST.name_id) then do; 
      %mFIND(name_id,name_chr);
      output;
 end;
 run;

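I want to rewrite this code in proc report and add a TOTAL row to this data set.

So, here is what I do:

1) name_id must end up as the group value.

2) I have to compute all the positive, negative, and zero values of test_var.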

 proc report data = One out = Result_table(keep= name_chr test_var_pos test_var_neg test_var_zero); 
 column name_id name_chr test_var test_var_pos test_var_neg test_var_zero;
 /*from One - table*/
 define name_id /group;
 define test_var / computed;
 /*to Result_table*/
 define name_chr / computed;
 define test_var_pos / computed;
 define test_var_neg / computed;
 define test_var_zero / computed;

 compute before name_id;  /*is this eq to FIRST?*/
    name_chr="";
    test_var_pos = 0;
    test_var_neg = 0;
    test_var_zero = 0;
 endcomp;
 compute test_var_pos;
    if(test_var>0) then test_var_pos = test_var_pos + test_var;
 endcomp;
 compute test_var_neg;
    if(test_var<0) then test_var_neg = test_var_neg + abs(test_var);
 endcomp;
 compute test_var_zero;
    if(test_var=0) then test_var_zero = test_var_zero + 1;
 endcomp;
 compute after name_id; /*is it eq to LAST?*/
    /*magic to get name_chr*/
    %mFIND(name_id,name_chr);
    /*output*/
 endcomp;
 rbreak after / summarize;
 compute after rbreak; 
   name_chr = "TOTAL: ";
 endcomp;
 run;

2 Answers


Proc report does not process a data set; it presents it. The output can be HTML, PDF, RTF, etc. The typical workflow is to prepare your data with a data step or a procedure, then present it with a reporting procedure (like Proc Report).

answered 2012-05-22T11:52:56.933

If you only want the total line for presentation (CarolinaJay is right here), then you can massage the data so that proc report will create your totals for you.

Without any example dataset, I'm going to assume that a name_id can appear in your dataset multiple times with different test_var values.

Knowing that you're going to use proc report to do the summaries, I'd probably do something like the following:

data prepped;
 set one;
 if test_var > 0 then test_var_pos = 1;
 else if test_var = 0 then test_var_zero = 1;
 else if test_var < 0 then test_var_neg = 1;
run;

You'd want to order the if / else if statements from the most frequently occurring case to the least, to possibly squeeze out some efficiency if you're dealing with a very large dataset.
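
The flags above count rows. If the goal is the totals described in the question (sum of the positive values, sum of the absolute negative values, count of the zeros), a variant along these lines might fit. It is only a sketch: `one_demo` and `prepped_amounts` are made-up names, and test_var is assumed to be numeric. SAS orders a missing numeric value below every number, so `test_var < 0` is true for a missing test_var; the sketch checks for that case explicitly.

```sas
/* Hypothetical sample input standing in for data set One. */
data one_demo;
  input name_id test_var;
  datalines;
1 5
1 -2
1 0
2 3
2 .
;
run;

/* Split test_var into three analysis columns, one per sign. */
data prepped_amounts;
  set one_demo;
  test_var_pos  = 0;
  test_var_neg  = 0;
  test_var_zero = 0;
  if test_var > 0 then test_var_pos = test_var;          /* amount */
  else if test_var = 0 then test_var_zero = 1;           /* count  */
  else if not missing(test_var) then
    test_var_neg = abs(test_var);                        /* amount */
  /* A missing test_var leaves all three columns at 0. */
run;
```

Summing these three columns by name_id in a later step should then reproduce what the original data step was meant to compute.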

From there, it's a matter of proc freq, proc summary, proc report, or whichever proc will do what you want, which is to condense the counts.
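
As one illustration of condensing the counts, proc summary could be used roughly as follows. The data set names `prepped_demo` and `counts_by_id` are invented for the example, and the `missing` option is included on the assumption that the "dots" in name_id are missing values that should keep their own row.

```sas
/* Hypothetical flagged input, shaped like the prepped data set. */
data prepped_demo;
  input name_id test_var_pos test_var_neg test_var_zero;
  datalines;
1 1 . .
1 . 1 .
1 . . 1
2 1 . .
. . 1 .
;
run;

/* One output row per name_id; MISSING keeps the missing name_id as a group. */
proc summary data=prepped_demo nway missing;
  class name_id;
  var test_var_pos test_var_neg test_var_zero;
  /* SUM= with no names reuses the input variable names. */
  output out=counts_by_id(drop=_type_ _freq_) sum=;
run;
```

Without `nway`, the output would also contain an overall row (`_type_ = 0`), which is another possible source for a grand total.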

For a summary by name_id and then an overall summary, you could use proc report. Again, it may not be the best option depending on your dataset and your computing power.

Same logic as my previous answer using proc report:

proc report data = prepped out=summary nowd;
 col name_id test_var_pos test_var_zero test_var_neg;
 define name_id / group;
 rbreak after / summarize;
run;

Again, any of the other procs could get you the summaries by group, and then you could feed that data set into proc report to get that summary line on the bottom. Just exclude the /group on the name_id.
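
A minimal sketch of that last step, assuming the data has already been summarized and carries a character name_chr column (the data set `counts_demo` and its values are invented here). It keeps `group` on name_chr, which should be harmless when each name occurs once, and uses a plain `compute after;` block to put a label on the summary line.

```sas
/* Hypothetical pre-summarized input: one row per name. */
data counts_demo;
  length name_chr $20;   /* must be long enough to hold 'TOTAL:' */
  input name_chr $ test_var_pos test_var_neg test_var_zero;
  datalines;
Alpha 5 2 1
Beta 3 0 0
;
run;

proc report data=counts_demo nowd;
  column name_chr test_var_pos test_var_neg test_var_zero;
  define name_chr      / group 'Name';
  define test_var_pos  / analysis sum 'Positive';
  define test_var_neg  / analysis sum 'Negative';
  define test_var_zero / analysis sum 'Zero';
  /* Report-level summary line at the bottom. */
  rbreak after / summarize;
  /* Runs for the final summary line; fills its first column. */
  compute after;
    name_chr = 'TOTAL:';
  endcomp;
run;
```

The label is written in `compute after;` with no target name, which is the block tied to the end of the report.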

Getting comfortable with how the underlying data needs to be structured for each of the SAS procedures will go a LONG LONG LONG way in harnessing the power of the built-in procs.

answered 2012-05-22T23:38:05.517