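I haven't found anything similar yet, and I'm new to Julia.
I'm trying to see whether this can be done in one pass, whether it should be split into separate steps, or whether there is some other approach I haven't thought of. Given the df below, how do I add a conditional-logic column anchored on the year column? (Apologies for the Int64; in the real data the column is a Date.)
Specifically, what is the best way to add a trailing 2-year column to the sample, next to the column showing the overall running growth (ProValue in the example df)? Something like:
"ProValue2YrTrailing = cumprod(:Growth .+ 1) when the number of years in each group is 2"
I can't quite figure out how to use @linq with DataFrames to create a conditional column via transform here.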
using DataFramesMeta
df = DataFrame(region  = ["US","US","US","US","US","EU","EU","EU","EU","EU"],
               product = ["apple","apple","apple","banana","banana","apple","apple","banana","banana","banana"],
               year    = [2009,2010,2011,2010,2011,2010,2011,2009,2010,2011],
               Growth  = [0.13,0.23,0.05,0.22,0.28,0.24,0.23,0.03,0.17,0.18])
# Running (cumulative) growth within each region/product group
df = @linq df |>
    groupby([:region, :product]) |>
    transform(ProValue = cumprod(:Growth .+ 1))
Thanks!
Edit: one approach I can think of is the one below, but it doesn't seem very elegant, especially once the window grows from 2 periods to 30:
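using RollingFunctions  # assumed source of rolling(prod, data, window)
df = @linq df |>
    groupby([:region, :product]) |>
    # pad with the `missing` value (not the string "missing"); a window of 2 needs one pad
    transform(ProValueTrailing2 = [missing; rolling(prod, :Growth .+ 1, 2)])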
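To make the window length a parameter rather than hard-coding the padding, here is a rough sketch of what I have in mind. It uses plain DataFrames `transform` with `source => function => target` pairs instead of @linq, and `trailing_growth` is a hypothetical helper written for this post, not a function from any package. It assumes the rows are already in chronological order within each region/product group, as they are in the sample df.

```julia
using DataFrames

# Hypothetical helper: product of (1 + g) over the trailing `n` observations.
# Positions with fewer than `n` observations available stay `missing`.
function trailing_growth(g::AbstractVector{<:Real}, n::Integer)
    out = Vector{Union{Missing,Float64}}(missing, length(g))
    for i in n:length(g)
        out[i] = prod(1 .+ g[i-n+1:i])
    end
    return out
end

df = DataFrame(region  = ["US","US","US","US","US","EU","EU","EU","EU","EU"],
               product = ["apple","apple","apple","banana","banana","apple","apple","banana","banana","banana"],
               year    = [2009,2010,2011,2010,2011,2010,2011,2009,2010,2011],
               Growth  = [0.13,0.23,0.05,0.22,0.28,0.24,0.23,0.03,0.17,0.18])

n = 2  # window length in periods; could just as well be 30

# One grouped transform adds both the running and the trailing product.
result = transform(groupby(df, [:region, :product]),
                   :Growth => (g -> cumprod(1 .+ g)) => :ProValue,
                   :Growth => (g -> trailing_growth(g, n)) => :ProValueTrailing2)
```

With `n = 2`, the first row of each group is `missing` and every later row holds the product of the current and previous year's `1 + Growth`. If the real data is not sorted by date within each group, it would need to be sorted before the transform.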