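So here is the problem.

I have a dataset, and for each record I want to load a different hash depending on a condition. I do not know the exact structure of each hash that I will load at run time, so I want to be able to execute the definedata statement conditionally. Because I do not know the hash structure, I tried passing the argument to definedata through a variable, but that does not work. How can I accomplish this? Here is what I have so far: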

/* Hashes have the same key field */
data hash1;
  key = '1';  a = 10; b = 20; output;
  key = '2';  a = 30; b = 40; output;
run;

/* Hash objects can have different data members and types */
data hash2;
  key = '1';  x = 'AAA'; y = 'BBB'; output;
  key = '2';  x = 'CCC'; y = 'DDD'; output;
run;

/* This is the dataset I want to process */
/* hid specifies which hash I should lookup */
/* key contains the key value to use for the lookup */
/* def is the hash data definition piece of the hash. 
   In practice I will use another hash to retrieve this definition
   But for simplicity we can assume that is part of the have dataset itself */

data have;
  hid = '1'; key = '2'; def = "'a', 'b'"; output;
  hid = '2'; key = '1'; def = "'x', 'y'"; output;
run;

/* This is what I want */

data want;
  set have;

  /* Though I don't know the structure of each hash, I can get a list of all hashes at the onset via some macro processing. So this statement is doable */
  if _N_ = 0 then set hash1 hash2;

  /* This part is OK. The hash declaration is able to accept a variable for the dataset name */

  hashname = "hash" || hid;
  declare hash hh(dataset: hashname);  /* use the name built on the previous line */
  hh.definekey('key');

  /* The following line is the problematic piece */
  hh.definedata(def);

  hh.definedone();

  rc = hh.find();
  /* Do something with the values */

  /* Finally delete the object so that it can be redefined again on the next record */
  hh.delete();

run;

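The error I get is: ERROR: Undeclared data symbol 'a', 'b' for hash object. I think the problem is that the definedata method parses its arguments one at a time, so it ends up treating the entire string 'a', 'b' as a single variable name.

If I define the hash as a superset of all possible variables, it complains when I load a dataset that contains only a subset of those variables. I also cannot define the hashes to contain a superset of all variables (i.e. I cannot create every hash to contain a, b, x and y and leave the extraneous elements missing).

So my question is: how can I accomplish what I am trying to do here? Is it possible to do something like a macro %do iteration using only data step constructs, supplying each variable one at a time? Or is there another way to do this?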

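Constraints

  1. I cannot rely on macro processing, because I only know at run time which hash to use.
  2. For memory reasons, I cannot load all the definitions ahead of time.

Any help would be appreciated.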

2 Answers

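You can store hash references in a separate hash. This is known as a hash of hashes. Load the hash of hashes with references to the individual hashes, each of which is loaded only once at the start of the step.

Example: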

data hash1;length key $1;input
key a b; datalines;
1 10 20
2 30 40
3 50 60
4 70 80  
run;

data hash2;length key $1;input
key x $3. y: $3.; datalines;
1 AAA BBB
2 CCC DDD
3 EEE FFF
4 GGG HHH
run;

data hashdataspec; length hid $1;input
hid datavars&: $15.;datalines;
1   a,b
2   x,y
run;

data have;
  do rowid = 1 to 100;
    p = floor (100*ranuni(123));
    q = 100 + ceil(100*ranuni(123));

    length r s $15;
    r = scan ("One of these will become the R value", ceil(8*ranuni(123)));
    s = scan ("How much wood would a woodchuck chuck if ...", ceil(9*ranuni(123)));

    length hid key $1;
    hid = substr('12',   ceil(2*ranuni(123)));
    key = substr('1234', ceil(4*ranuni(123)));

    output;
  end; 
run;

data want;
  sentinel0 = ' ';
  if 0 then set hash1-hash2 hashdataspec; * prep pdv for hash host variables;
  sentinel1 = ' ';

  * prep hashes, one time only;
  if _n_ = 1 then do;
    * load hash data specifiers;
    declare hash hds(dataset:'hashdataspec');
    hds.defineKey('hid');
    hds.defineData('hid', 'datavars');
    hds.defineDone();

    * prep hash of hashes;
    declare hash h;      /* dynamic hash that will be added to hoh */
    declare hash hoh();  /* hash of hashes */
    hoh.defineKey ('hid');
    hoh.defineData ('h');
    hoh.defineDone();

    * loop over hashdataspec, loading dynamically created hashes;
    declare hiter hi('hds');
    do while(hi.next() = 0);
      h = _new_ hash(dataset:cats('hash',hid));    * create dynamic hash;
      h.defineKey('key');
      do _n_ = 1 to countw(datavars);
        h.defineData(scan(datavars,_n_,','));      * define data vars, one at a time;
      end;
      h.defineDone();
      hoh.add();  * add the dynamic hash to the hash of hashes;
    end;
  end;

  * clear hash host variables;
  call missing (of sentinel0--sentinel1);

  set have;

  * lookup which hash (hid) to use
  * this will select the appropriate dynamic hash from hoh and update hash variable h;
  hoh.find();

  * lookup data for key in the hids hash;
  h.find();

  drop datavars;
run;
answered 2019-07-25T11:38:29.860

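Your program can work, but I think performance will be poor, since it declares, loads and deletes a hash object for every record.

Note that I changed the value of DEF to make it easier to SCAN.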

data have;
   hid = '1'; key = '2'; def = "a b"; output;
   hid = '2'; key = '1'; def = "x y"; output;
   run;

/* This is what I want */

data want;
   if _N_ = 0 then set hash1 hash2;
   call missing(of _all_);
   set have;
   hashname = "hash" || hid;
   declare hash hh(dataset: hashname);
   hh.definekey('key');
   /* The following line is the problematic piece */
   length v $32;
   do i = 1 by 1;
      v = scan(def,i,' ');
      putlog v= i=;
      if missing(v) then leave;
      *hh.definedata(def);
      hh.definedata(v);
      end;
   hh.definedone();
   *hh.output(dataset: cats('X',hashname));

   rc = hh.find();
   /* Do something with the values */

   /* Finally delete the object so that it can be redefined again on the next record */
   hh.delete();
   run;
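
If DEF has to keep its original quoted, comma-separated form ('a', 'b'), the same one-name-at-a-time idea can still be applied. The following is only a minimal sketch of the tokenizing step, not part of the program above: it assumes the names are separated by commas and each is wrapped in matching single quotes, and it reuses the variable names def, v and i purely for illustration.

```sas
/* Sketch: split a quoted, comma-separated DEF value into bare variable names. */
/* Assumption: every token looks like 'name', separated by commas.             */
data _null_;
   length def $40 v $32;
   def = "'a', 'b'";
   do i = 1 to countw(def, ',');
      /* scan picks the i-th comma-delimited token, strip trims blanks, */
      /* dequote removes the surrounding matching quotes                 */
      v = dequote(strip(scan(def, i, ',')));
      putlog v= i=;   /* each V could be passed to hh.definedata(v) */
      end;
   run;
```

Inside the DATA step above, the body of this loop could take the place of the SCAN call, so the HAVE dataset would not need to be changed.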
answered 2019-07-24T22:51:42.417