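This might work for you. I have 9.2, and 9.2 has some significant hash improvements, but I think I behaved myself and used only what is available in 9.1. You might try cross-posting this to SAS-L [the SAS listserv], as I believe Paul Dorfman (aka The Hash Guru) still reads it.
I assumed you wanted the "leftover" payments output as well. If that part does not behave the way you want, you may need to rework it. This is not well tested, but it works for your example data set. I set the commodity to missing for payments 24 and 25 because they are not applied to any charge.
I'm fairly sure there is a cleaner way to do the iteration than what I did, but since I use 9.2+ and have multidata available, I always use that instead of a hash iterator, so I don't know the cleaner way.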
    data have;
      input ID_client ID_commodity Charge;
      datalines;
    1 111111111 100
    1 222222222 200
    2 333333333 300
    2 444444444 400
    2 555555555 50
    ;;;;
    run;
    data for_hash;
      input ID_client_hash ID_ofpayment paymentValue;
      datalines;
    1 11 50
    1 12 50
    1 13 100
    1 14 50
    1 15 50
    2 21 500
    2 22 200
    2 23 100
    2 24 200
    2 25 200
    ;;;;
    run;
    data want;
      *Create hash and hash iterator - must use iterator since 9.1 does not allow multidata option;
      if _n_ = 1 then do;
        format id_client_hash paymentValue id_ofpayment BEST12.;
        declare hash h(dataset:'for_hash' , ordered: 'a');
        h.defineKey('ID_client_hash','id_ofpayment'); *note I put id_client_hash, renaming the id - want to be able to compare them;
        h.defineData('id_client_hash','id_ofpayment','paymentValue');
        call missing(id_ofpayment,paymentValue, id_client_hash);
        h.defineDone();
        declare hiter hi('h');
      end;
      do _t = 1 by 1 until (last.id_client);
        set have;
        by id_client;
        *Iterate through the hash and find the first record with the same ID_client;
        do rc = hi.first() by 0 while (rc eq 0 and ID_client ne ID_client_hash);
          rc = hi.next();
        end;
        *For the current charge record, iterate through the payment (hash) until all paid up.;
        do while (charge gt 0 and rc eq 0 and ID_client=ID_client_hash);
          if charge ge paymentValue then do; *If charge >= paymentvalue, use up the payment value;
            value = paymentValue; *so whole paymentValue is value;
            charge = charge - paymentValue; *charge is decremented by paymentValue;
            output; *output row;
            _id=ID_client_hash;
            _pay=id_ofpayment;
            rc = hi.next();
            h.remove(key:_id,key:_pay); *remove payment row from hash now that it has been used up;
          end;
          else do; *this is if (remaining) charge is less than payment - we will not use all of the payment;
            value = charge; *value is the remainder of the charge, ie, how much of payment was actually used;
            paymentValue = paymentValue - charge; *paymentValue is the remainder of paymentValue;
            charge= 0; *charge is zero now;
            output; *output a row;
            h.replace(); *replace paymentValue in the hash with the new value of paymentValue, minus charge;
          end;
        end; *end of iteration through hash - at this point, either charge = 0 or we have run out of payments with that ID;
        if charge gt 0 then do;
          value=-1*charge;
          call missing(id_ofpayment);
          output; *output a row for the charge, which is not paid;
        end;
        if last.id_client then do; *this is cleanup, checking to see if we have any leftover payments;
          do while (rc=0); *iterate through the remaining hash;
            do rc = hi.first() by 0 while (rc eq 0 and ID_client ne ID_client_hash);
              rc = hi.next();
            end;
            if rc=0 then do;
              call missing(id_commodity); *to make it clear this is a leftover payment;
              value=paymentValue; *update the value;
              output; *output the payment;
              _id=ID_client_hash;
              _pay=id_ofpayment;
              rc = hi.next();
              if rc= 0 then h.remove(key:_id,key:_pay); *remove the payment just output;
            end;
          end;
        end;
      end;
      keep id_client id_ofpayment id_commodity value;
    run;
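Beyond that, this is not particularly fast: I do a lot of iterating that may be wasted. It will be relatively faster if every payment ID_client also appears in the charge records. Any payment clients that do not are never removed from the hash, so each search has to step over them, and with many of those it could end up being very slow.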
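For anyone on 9.2 or later, here is a rough sketch of how the multidata option could avoid rescanning from the top of the hash on every charge. It is not tested beyond the example data, and it rests on a few assumptions: payments for one client come back from `find()`/`find_next()` in the order they were loaded from `for_hash`, and `replacedup()` leaves the current position in place. The names `want_multi`, `payments_left`, `rc` and `rc_upd` are made up for illustration. Instead of removing used-up payments, it sets them to zero and skips them.

```sas
/* Sketch only: requires SAS 9.2+ (multidata). Reads HAVE and FOR_HASH from above. */
data want_multi;
  if _n_ = 1 then do;
    if 0 then set for_hash;  /* defines the host variables without reading a row */
    declare hash h(dataset:'for_hash', multidata:'y');
    h.defineKey('ID_client_hash');  /* one key per client, many payments per key */
    h.defineData('ID_client_hash','ID_ofpayment','paymentValue');
    h.defineDone();
  end;
  set have end=eof;
  ID_client_hash = ID_client;
  rc = h.find();  /* first payment for this client, if any */
  do while (rc = 0 and charge > 0);
    if paymentValue > 0 then do;  /* skip payments already used up */
      value = min(charge, paymentValue);  /* amount of this payment applied */
      charge = charge - value;
      paymentValue = paymentValue - value;
      rc_upd = h.replacedup();  /* store the reduced payment in the current item */
      output;
    end;
    if charge > 0 then rc = h.find_next();  /* move on to the client's next payment */
  end;
  if charge > 0 then do;  /* charge not fully covered */
    value = -1*charge;
    call missing(ID_ofpayment);
    output;
  end;
  /* leftovers: rows in PAYMENTS_LEFT with paymentValue > 0 were not fully used */
  if eof then rc_upd = h.output(dataset:'payments_left');
  keep ID_client ID_commodity ID_ofpayment value;
run;
```

Unlike the iterator version, this writes leftover payments to a separate data set rather than appending them to the main output, so adjust that part if you need them in one table.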
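I'm not convinced that hash is the better solution, at least before 9.2; a keyed UPDATE might be better. UPDATE is pretty much made for transactional database structures, which this seems close to.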