r - Reshaping multiple values at once

Question

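I have a long data set that I would like to make wide, and I'm curious whether there is a way to do this in one step using the reshape2 or tidyr packages in R.

The data frame df looks like this: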

id  type    transactions    amount
20  income       20          100
20  expense      25          95
30  income       50          300
30  expense      45          250

I would like to get to this:

id  income_transactions  expense_transactions  income_amount  expense_amount
20  20                   25                    100            95
30  50                   45                    300            250

I know I can get part of the way there with reshape2, for example:

dcast(df, id ~  type, value.var="transactions")

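But is there a way to reshape the whole df in one shot, handling both the "transactions" and "amount" variables? Ideally with new, more appropriate column names?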

score 30 · Accepted Answer

In "reshape2", you can use recast (though in my experience, this isn't a widely known function).

library(reshape2)
recast(mydf, id ~ variable + type, id.var = c("id", "type"))
#   id transactions_expense transactions_income amount_expense amount_income
# 1 20                   25                  20             95           100
# 2 30                   45                  50            250           300

You can also use reshape from base R:

reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
#   id transactions.income amount.income transactions.expense amount.expense
# 1 20                  20           100                   25             95
# 3 30                  50           300                   45            250
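
If you also want the `income_transactions` style of names from the question, one option is to rename after reshaping. The following is only a sketch: the data frame is rebuilt here from the table in the question, and the `sub()` pattern assumes each reshaped column name contains exactly one dot.

```r
# Rebuild the example data from the question (named mydf as in this answer)
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Same base R call as above
wide <- reshape(mydf, direction = "wide", idvar = "id", timevar = "type")

# Swap the two halves of each name: "transactions.income" -> "income_transactions".
# "id" contains no dot, so it is left unchanged.
names(wide) <- sub("^(.*)\\.(.*)$", "\\2_\\1", names(wide))
wide
```

Passing `sep = "_"` to `reshape` would change the separator, but the variable name would still come first, hence the explicit rename.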

Or you can melt and dcast, like this (here using "data.table"):

library(data.table)
library(reshape2)
dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")), 
                 id ~ variable + type, value.var = "value")
#    id transactions_expense transactions_income amount_expense amount_income
# 1: 20                   25                  20             95           100
# 2: 30                   45                  50            250           300

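In later versions of "data.table" (1.9.8), you will be able to do this directly with dcast.data.table. If I understand correctly, what @Arun is trying to implement is reshaping without first having to melt the data, which is what currently happens with recast: it is essentially a wrapper around a melt + dcast sequence of operations.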
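
To make the "wrapper" point concrete, here is a rough sketch of the two steps that recast bundles together, written with plain reshape2 calls. The data frame is rebuilt from the question's table, and the intermediate names `long` and `wide` are just illustrative.

```r
library(reshape2)

# Example data from the question
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Step 1: melt stacks transactions and amount into variable/value pairs
long <- melt(mydf, id.vars = c("id", "type"))

# Step 2: dcast spreads each variable/type combination into its own column
wide <- dcast(long, id ~ variable + type, value.var = "value")

# recast does both steps in one call; compare the two results
all.equal(wide, recast(mydf, id ~ variable + type, id.var = c("id", "type")))
```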

And, for thoroughness, here's the tidyr approach:

library(dplyr)
library(tidyr)
mydf %>% 
  gather(var, val, transactions:amount) %>% 
  unite(var2, type, var) %>% 
  spread(var2, val)
#   id expense_amount expense_transactions income_amount income_transactions
# 1 20             95                   25           100                  20
# 2 30            250                   45           300                  50
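
In newer tidyr releases (1.0.0 and later), `gather` and `spread` are superseded by `pivot_longer` and `pivot_wider`, and `pivot_wider` accepts several value columns at once. A minimal sketch, assuming tidyr >= 1.0.0 is installed and rebuilding the data frame from the question:

```r
library(tidyr)

# Example data from the question
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Spread both value columns in one step; names_glue puts the type first,
# giving names such as "income_transactions"
wide <- pivot_wider(
  mydf,
  names_from = type,
  values_from = c(transactions, amount),
  names_glue = "{type}_{.value}"
)
wide
```

Without `names_glue`, the default names would put the value column first (for example `transactions_income`).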

score 6 · Accepted Answer

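With data.table v1.9.6+, we can cast multiple value.var columns simultaneously (and also use multiple aggregation functions in fun.aggregate). See ?dcast, in particular the Examples section, for more information.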

require(data.table) # v1.9.6+
dcast(dt, id ~ type, value.var=names(dt)[3:4])
#    id transactions_expense transactions_income amount_expense amount_income
# 1: 20                   25                  20             95           100
# 2: 30                   45                  50            250           300
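
For a version that can be pasted and run as is, here is a sketch that builds `dt` from the question's table (the answer above does not show how `dt` was created, so that part is an assumption) and then optionally flips the names into the `income_transactions` style.

```r
library(data.table)  # v1.9.6 or later for multiple value.var columns

# Example data from the question, built directly as a data.table
dt <- data.table(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Cast both value columns in a single call
wide <- dcast(dt, id ~ type, value.var = c("transactions", "amount"))

# Optional: "transactions_income" -> "income_transactions".
# With a single vector, setnames treats it as the full set of new names.
setnames(wide, sub("^(.*)_(.*)$", "\\2_\\1", names(wide)))
wide
```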
