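I have a SAS dataset with 5000 rows and 150 variables, from a survey of 5000 respondents. I need to delete the entire row (respondent) whenever any one of the 150 variables has a missing value. In other words, I only want to keep the respondents who answered all 150 variables.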

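I am using proc sql or base SAS, but I cannot come up with a simple way to do this. I tried a conditional query, but some columns are numeric and some are character, and I still need to run analyses on the numeric columns, so transposing does not seem to be an option. Any help would be greatly appreciated.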

Thanks

3 Answers

With a data step it is just:

data want;
  set have;
  if cmiss(of _all_) = 0;
run;

This handles both character and numeric variables.
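To see the filter in action, here is a minimal sketch on toy data. The dataset contents and the variable names `id`, `q1`, and `q2` are made up for illustration and do not come from the question.

```sas
/* Hypothetical toy data: one numeric (q1) and one character (q2) answer */
data have;
  length id 8 q1 8 q2 $8;
  id = 1; q1 = 5; q2 = 'yes'; output;
  id = 2; q1 = .; q2 = 'no';  output;  /* numeric value missing   */
  id = 3; q1 = 4; q2 = ' ';   output;  /* character value missing */
  id = 4; q1 = 2; q2 = 'no';  output;
run;

data want;
  set have;
  /* CMISS counts missing arguments of either type; keep fully answered rows */
  if cmiss(of _all_) = 0;
run;
```

In this sketch only the rows with `id` 1 and 4 remain in `want`, since each of the other two rows has one missing value.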

Answered 2014-09-17T23:31:35.523

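SAS procs tend to handle missing values by dropping the whole row from the data being analyzed, so this may be less of a problem than you think. For example, if you run a forward selection logistic regression with a bunch of candidate variables, only the rows with no missing values in those columns will be processed.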

If you want to create a new dataset in which no column has a missing value, you can do the following:

proc sql;
    create table t_nomissing as
        select t.* 
        from t
        where col1 is not null and col2 is not null and col3 is not null and . . .
              col150 is not null;
quit;

If you have a list of the column names, I suggest building the where clause in a tool such as Excel, where you can write a formula and copy it down.

Answered 2014-09-17T13:46:14.337

Taking Gordon Linoff's Excel idea one step further, using only SAS...

/* Capture the query output as a dataset. The NUMBER option adds a Row column. */
ods output SQL_Results=appliance;
proc sql number;
select * from sashelp.applianc;
quit;


data appliance_2;
  set appliance;
  if cmiss(of _all_) = 0;
run;


proc sql;
  create table que as
    select * from dictionary.columns
    where libname = "WORK" and memname = "APPLIANCE";
quit;


proc sql ; 
select name, "IS NOT NULL AND"
from dictionary.columns where libname = "WORK" and memname = "APPLIANCE"; 
quit;

*copy / paste / clean-up ;
proc sql; 
create table appliance_3 as
select * from appliance 
where
Row IS NOT NULL AND 
units_1 IS NOT NULL AND 
units_2 IS NOT NULL AND 
units_3 IS NOT NULL AND 
units_4 IS NOT NULL AND 
units_5 IS NOT NULL AND 
units_6 IS NOT NULL AND 
units_7 IS NOT NULL AND 
units_8 IS NOT NULL AND 
units_9 IS NOT NULL AND 
units_10 IS NOT NULL AND 
units_11 IS NOT NULL AND 
units_12 IS NOT NULL AND 
units_13 IS NOT NULL AND 
units_14 IS NOT NULL AND 
units_15 IS NOT NULL AND 
units_16 IS NOT NULL AND 
units_17 IS NOT NULL AND 
units_18 IS NOT NULL AND 
units_19 IS NOT NULL AND 
units_20 IS NOT NULL AND 
units_21 IS NOT NULL AND 
units_22 IS NOT NULL AND 
units_23 IS NOT NULL AND 
units_24 IS NOT NULL AND 
cycle IS NOT NULL 
;quit;
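
The copy/paste step could probably be avoided by having PROC SQL build the condition into a macro variable. The following is a sketch rather than a tested drop-in: the macro variable name `where_clause` and the output table name `appliance_4` are made up for illustration, and it assumes every variable name is a plain SAS name that needs no name-literal quoting.

```sas
/* Build "var1 IS NOT NULL AND var2 IS NOT NULL AND ..." from the column list. */
/* where_clause is a hypothetical macro variable name.                        */
proc sql noprint;
  select catx(' ', name, 'IS NOT NULL')
    into :where_clause separated by ' AND '
    from dictionary.columns
    where libname = "WORK" and memname = "APPLIANCE";
quit;

/* Reuse the generated condition; appliance_4 is a hypothetical table name. */
proc sql;
  create table appliance_4 as
    select * from appliance
    where &where_clause;
quit;
```

Under those assumptions this should keep the same rows as the hand-written where clause, and the condition is rebuilt from `dictionary.columns` each time, so it follows the table if columns are added or removed.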
Answered 2015-12-21T20:32:59.803