我一直在寻找相关的问题,但到目前为止我还没有运气。我正在寻找转换一长串自变量以进行回归分析。一个虚拟数据集如下所示:
DATA TEST (DROP = i);
DO i = 1 to 4000;
VAR = i + 100000;
output;
end;
run;
PROC TRANSPOSE
DATA = TEST
OUT = TEST_T
(DROP = _NAME_)
PREFIX = X_;
ID VAR;
VAR VAR;
RUN;
DATA TEST_ARRAY;
SET TEST_T;
ARRAY X[*] X_:;
DO J = 1 TO 40;
DO I = 1 TO DIM(X);
X[I] = RANUNI(0)*I;
OUTPUT;
END;
END;
RUN;
在这种情况下,变量名称 X_i 单调增加,实际上,我的变量实际上是 X_number ,其中数字是六位数的唯一标识符。我一直在尝试记录所有这些变量的变换和平方,以便我有一个新的 X 矩阵,其中包含以下列
X_133456 X_SQ_133456 LOG_X_133456
我尝试通过这样的所有变量循环列表
PROC CONTENTS
DATA = TEST_ARRAY
OUT = CONTENTS;
RUN;
PROC SQL NOPRINT;
SELECT NAME INTO: REG_FACTORS
SEPARATED BY " "
FROM CONTENTS;
QUIT;
DATA WANT;
SET TEST_ARRAY;
%LET index = 1;
%DO %UNTIL (%SCAN(®_factors.,&index.," ")=);
%LET factors = %SCAN(®_factors.,&index.," ");
LOG_X_&FACTORS. = LOG(X_&FACTORS.);
X_SQ_&FACTORS. = (X_&FACTORS.) ** 2;
%LET index = %EVAL(&Index + 1);
%END;
RUN;
但这会炸毁我的服务器,我需要找到一种更有效的方法,在此先感谢
编辑:对于贡献者 - 我设法在 13:04 解决
%LET input_factors = X_:;
PROC SQL;
SELECT
NAME
, TRANWRD(NAME,%SCAN(&input_factors.,1,'_'),'SQ')
, TRANWRD(NAME,%SCAN(&input_factors.,1,'_'),'LOG')
INTO
:factor_list separated by " "
, :sq_factor_list separated by " "
, :log_factor_list separated by " "
FROM
contents
WHERE
VARNUM < 5
WHERE
NAME LIKE "%SCAN(&input_factors.,1,'_')_"
ORDER BY
INPUT(SCAN(NAME,-1,'_'),8.)
;
QUIT;
%PUT &factor_list.;
%PUT &sq_factor_list.;
%PUT &log_factor_list.;