I am trying to use cuDF's apply_rows to compute a new column based on other rows.
For a single row, the following script works:
filteredhlcdf.loc[(filteredhlcdf.ddate == %%Ddate%%) & (filteredhlcdf.sstart == %%sTime%%) & (filteredhlcdf.ttime <= %%etime%%) & (filteredhlcdf.H > %%ustep%%) , "ttime" ].min()
where the variables between %% are replaced with constants.
However, when I try to use apply_rows as follows, it raises an error:
import cudf
import numpy as np

crossappliedcdf = cudf.from_pandas(crossapplieddf)
filteredhlcdf = cudf.from_pandas(filteredhldf)

def rowcal(ddate: int, stime: int, etime: int, usteps: float, dsteps: float, ctime: int, creturn: float):
#def rowcal(ddate: float, stime: float, etime: float, usteps: float, dsteps: float, ctime: int, creturn: float, kwarg1: int):
    for i, (tddate, tstime, tetime, tusteps, tdsteps) in enumerate(zip(ddate, stime, etime, usteps, dsteps)):
        # For each row, look up the earliest matching ttime in filteredhlcdf (this is the line that fails)
        ctime[i] = filteredhlcdf.loc[(filteredhlcdf.ddate == tddate) & (filteredhlcdf.sstart == tstime) & (filteredhlcdf.ttime <= tetime) & (filteredhlcdf.H > tusteps) , "ttime" ].min()
        creturn[i] = tddate + tstime + tetime + tdsteps

crossappliedcdf.apply_rows(rowcal, incols=['ddate', 'stime', 'etime','usteps','dsteps'], outcols=dict(ctime=np.int32, creturn=float), kwargs={})
I have determined that the error occurs on the following line:
ctime[i] = filteredhlcdf.loc[(filteredhlcdf.ddate == tddate) & (filteredhlcdf.sstart == tstime) & (filteredhlcdf.ttime <= tetime) & (filteredhlcdf.H > tusteps) , "ttime" ].min()
because the error disappears once that line is replaced with ctime[i] = tddate + tstime + tetime + tdsteps
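
To make the intended result concrete, here is a rough plain-Python sketch of what the failing lookup should produce for each row. It is not cuDF code and is not meant as a fix. The helper name earliest_ttime, the lists of dicts standing in for filteredhlcdf and crossappliedcdf, and all sample values are made up for illustration.

```python
# Plain-Python reference for the intended per-row result (not cuDF code).
# All names and sample values below are hypothetical, for illustration only.

def earliest_ttime(lookup_rows, ddate, stime, etime, usteps):
    """Return the smallest ttime among lookup rows passing the filter, or None if none match."""
    matches = [
        r["ttime"]
        for r in lookup_rows
        if r["ddate"] == ddate
        and r["sstart"] == stime
        and r["ttime"] <= etime
        and r["H"] > usteps
    ]
    return min(matches) if matches else None

# Hypothetical stand-in for filteredhlcdf
lookup_rows = [
    {"ddate": 20200101, "sstart": 900, "ttime": 905, "H": 1.5},
    {"ddate": 20200101, "sstart": 900, "ttime": 910, "H": 2.5},
    {"ddate": 20200101, "sstart": 900, "ttime": 920, "H": 3.0},
    {"ddate": 20200102, "sstart": 900, "ttime": 901, "H": 9.0},
]

# Hypothetical stand-in for the input columns of crossappliedcdf
query_rows = [
    {"ddate": 20200101, "stime": 900, "etime": 930, "usteps": 2.0},
    {"ddate": 20200101, "stime": 900, "etime": 930, "usteps": 5.0},
]

# One result per query row, mirroring what ctime[i] is supposed to hold
ctime = [
    earliest_ttime(lookup_rows, q["ddate"], q["stime"], q["etime"], q["usteps"])
    for q in query_rows
]
print(ctime)
```

In this sketch a row with no match yields None. How the real ctime column (declared as np.int32) should represent that case is left open here.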