r - Combine list elements based on common data frame values

Question

This is a follow-up to this question. Even though the example is specific, it seems like a general use case, so I think it deserves a separate thread:

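The general question is: how do I take the elements of a list that correspond to values in an original data frame and combine them according to those values, especially when the list elements have different lengths?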

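In this example, I have a data frame with two groups, each sorted by date. What I ultimately want is a data frame organized by date that contains only the relevant metric for each segment. If a segment has no data for a particular date, it gets a 0.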

Here is some actual data (dput output; the code below refers to this data frame as test):

structure(list(date = structure(c(15706, 15707, 15708, 15709, 
15710, 15706, 15707, 15708), class = "Date"), segment = structure(c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("abc", "xyz"), class = "factor"), 
    a = c(76L, 92L, 96L, 76L, 80L, 91L, 54L, 62L), x = c(964L, 
    505L, 968L, 564L, 725L, 929L, 748L, 932L), k = c(27L, 47L, 
    36L, 40L, 33L, 46L, 30L, 36L), value = c(6872L, 5993L, 5498L, 
    5287L, 6835L, 6622L, 5736L, 7218L)), .Names = c("date", "segment", 
"a", "x", "k", "value"), row.names = c(NA, -8L), class = "data.frame")

So for segment "abc", I only care about (value/a) relative to its benchmark of 75. For segment "xyz", I only care about (k/x) relative to its benchmark of 0.04.
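To make "relative to its benchmark" concrete, here is a small worked sketch for the first row of each segment. It assumes the performance formula used later in the question, metric / benchmark - 1; the variable names are only illustrative.

```r
# First "abc" row: value = 6872, a = 76
# First "xyz" row: k = 46, x = 929
abc_metric <- 6872 / 76   # value / a
xyz_metric <- 46 / 929    # k / x

# Performance relative to the benchmark: metric / benchmark - 1
abc_perf <- abc_metric / 75 - 1
xyz_perf <- xyz_metric / 0.04 - 1

round(c(abc = abc_perf, xyz = xyz_perf), 2)
#  abc  xyz
# 0.21 0.24
```

These two numbers are the 2013-01-01 row of the desired output.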

Ultimately I want a data frame that looks like this:

        date   abc   xyz
1 2013-01-01  0.21  0.24
2 2013-01-02 -0.13  0.00
3 2013-01-03 -0.24 -0.03
4 2013-01-04 -0.07  0.00
5 2013-01-05  0.14  0.00

Here, because "xyz" only has information for 2013-01-01 through 2013-01-03, everything after that is 0.

Here is how I went about it:

Define the arguments to pass to mapply:

splits <- split(test, test$segment)
metrics <- c("ametric","xmetric")
benchmarks <- c(75,0.04)

and a function that computes performance against the benchmark:

performance <- function(splits,metrics,benchmarks){
    (splits[,metrics]/benchmarks)-1
}
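For reference, this is what a single call to the function looks like when mapply hands it one split, one metric name, and one benchmark. This is only a sketch: the two-row data frame is a stand-in for the "abc" split, and ametric is assumed to be a precomputed value/a column, since the posted data does not contain it.

```r
performance <- function(splits, metrics, benchmarks) {
  (splits[, metrics] / benchmarks) - 1
}

# Stand-in for the "abc" split; ametric is assumed to hold value / a
abc <- data.frame(
  date    = as.Date(c("2013-01-01", "2013-01-02")),
  ametric = c(6872 / 76, 5993 / 92)
)

# One data frame, one column name, one benchmark
perf <- performance(abc, "ametric", 75)
round(perf, 2)
# [1]  0.21 -0.13
```

Note that the result is a bare numeric vector, so the dates are dropped at this step.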

Pass these to mapply:

temp <- mapply(performance, splits, metrics, benchmarks)

The problem now is that, because the splits have different lengths, the output looks like this:

summary(temp)

    Length Class  Mode   
abc 5      -none- numeric
xyz 3      -none- numeric
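The shape of this result follows from how mapply simplifies: when every call returns a vector of the same length it builds a matrix, otherwise it falls back to a list. A minimal sketch with made-up inputs (not the data above):

```r
f <- function(x, b) x / b - 1

# Equal lengths: mapply simplifies to a matrix (one column per input)
equal <- mapply(f, list(a = 1:3, b = 4:6), c(2, 4))

# Unequal lengths: simplification is not possible, so a list comes back
unequal <- mapply(f, list(a = 1:5, b = 1:3), c(2, 4))

is.matrix(equal)    # TRUE
is.list(unequal)    # TRUE
lengths(unequal)
# a b
# 5 3
```

Passing SIMPLIFY = FALSE makes mapply return a list in both cases, which keeps downstream code predictable.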

Is there a way to bring in the dates from the original data frame for each split and combine based on those dates (using 0 where there is no data)?

score 0 · Accepted Answer

You just need to pass SIMPLIFY=FALSE to mapply; then you can use do.call with rbind to put everything back into one data frame:

> temp <- mapply(performance, splits, metrics, benchmarks)
> do.call('rbind',mapply(cbind, splits, performance=temp, SIMPLIFY=FALSE))
            date segment  a   x  k value  performance
abc.1 2013-01-01     abc 76 964 27  6872 1.333333e-02
abc.2 2013-01-02     abc 92 505 47  5993 2.266667e-01
abc.3 2013-01-03     abc 96 968 36  5498 2.800000e-01
abc.4 2013-01-04     abc 76 564 40  5287 1.333333e-02
abc.5 2013-01-05     abc 80 725 33  6835 6.666667e-02
xyz.6 2013-01-01     xyz 91 929 46  6622 2.322400e+04
xyz.7 2013-01-02     xyz 54 748 30  5736 1.869900e+04
xyz.8 2013-01-03     xyz 62 932 36  7218 2.329900e+04
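The output above is in long format, and its performance values appear to come from the raw a and x columns (for example 76/75 - 1 = 0.0133). To get the date-by-segment layout the question asks for, one possible sketch is below. It assumes that ametric and xmetric are derived as value/a and k/x, and it swaps in a full outer merge on date; the names pieces, seg_cols, and wide are illustrative only.

```r
test <- data.frame(
  date    = as.Date("2013-01-01") + c(0:4, 0:2),
  segment = factor(rep(c("abc", "xyz"), times = c(5, 3))),
  a       = c(76, 92, 96, 76, 80, 91, 54, 62),
  x       = c(964, 505, 968, 564, 725, 929, 748, 932),
  k       = c(27, 47, 36, 40, 33, 46, 30, 36),
  value   = c(6872, 5993, 5498, 5287, 6835, 6622, 5736, 7218)
)

# Assumed derived metric columns (not present in the posted data)
test$ametric <- test$value / test$a
test$xmetric <- test$k / test$x

splits     <- split(test, test$segment)
metrics    <- c("ametric", "xmetric")
benchmarks <- c(75, 0.04)

# Keep the dates next to each performance vector; name the column after the segment
pieces <- mapply(function(df, metric, benchmark, name) {
  out <- data.frame(date = df$date, perf = df[[metric]] / benchmark - 1)
  names(out)[2] <- name
  out
}, splits, metrics, benchmarks, names(splits), SIMPLIFY = FALSE)

# Full outer join on date; dates missing for a segment become NA
wide <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE), pieces)

# Replace NA with 0 and round, touching only the segment columns
seg_cols <- names(splits)
wide[seg_cols] <- lapply(wide[seg_cols], function(v) {
  v[is.na(v)] <- 0
  round(v, 2)
})
wide
```

Because the merge is keyed on date, this should also work when segments cover different or non-contiguous date ranges. Keep in mind that a filled-in 0 is indistinguishable from a segment that exactly matched its benchmark.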
