Input df:

user attr val date
100  a    10  2012-11-09
100  b    20  2012-11-08
101  a    11  2012-11-09

Output df:

user attr_a val_a date_a     attr_b val_b date_b
100  a      10    2012-11-09 b      20    2012-11-08
101  a      11    2012-11-09

I need help in R reshaping the input data frame into the desired output data frame.

2 Answers

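Here is a short function that splices your data frame: it keeps the rows where a given column (byCol) equals a specified value (byVal) and appends that value as a suffix to every column name except the key column.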

spliceDF <- function(df, byVal, byCol="attr", preserveField="user")  {
  # returns the rows of df where byCol equals byVal, with every column
  # except preserveField renamed to <name>_<byVal>

  # identify which rows will be returned (a plain logical vector)
  rows <- df[[byCol]] == byVal

  # append the suffix to every column except the key column
  nm <- names(df) != preserveField
  names(df)[nm] <- paste(names(df)[nm], byVal, sep="_")

  return(df[rows, ])
}

It can then be called inside merge as follows:

# merge the two spliced data frames
merge(spliceDF(mydf, "a"), spliceDF(mydf, "b"), by="user", all=TRUE)

For clarity, that last line can be split into three separate lines:

# Splice the df into two separate dfs
df_a <- spliceDF(mydf, byVal="a", byCol="attr")
df_b <- spliceDF(mydf, byVal="b", byCol="attr")

# merge the two into one
merge(df_a, df_b, by="user", all=TRUE)
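
If attr can take more than two values, the same splice-then-merge idea can be folded over every distinct value. The following is only a sketch under that assumption: spliceAll is a hypothetical helper name, not part of the answer above, and the example data is rebuilt here with quoted dates so the block runs on its own.

```r
# Example data from the question (dates quoted so they stay strings)
mydf <- data.frame(user = c(100, 100, 101),
                   attr = c("a", "b", "a"),
                   val  = c(10, 20, 11),
                   date = c("2012-11-09", "2012-11-08", "2012-11-09"),
                   stringsAsFactors = FALSE)

# Same idea as spliceDF above: keep rows where byCol == byVal and
# suffix every column except the key column with byVal
spliceDF <- function(df, byVal, byCol = "attr", preserveField = "user") {
  rows <- which(df[[byCol]] == byVal)   # which() skips NA comparisons
  nm <- names(df) != preserveField
  names(df)[nm] <- paste(names(df)[nm], byVal, sep = "_")
  df[rows, , drop = FALSE]
}

# Hypothetical helper: splice once per distinct value of byCol,
# then fold the pieces together with a full outer merge on the key
spliceAll <- function(df, byCol = "attr", preserveField = "user") {
  vals <- sort(unique(as.character(df[[byCol]])))
  pieces <- lapply(vals, function(v) spliceDF(df, v, byCol, preserveField))
  Reduce(function(x, y) merge(x, y, by = preserveField, all = TRUE), pieces)
}

wide <- spliceAll(mydf)
print(wide)
```

With only "a" and "b" present this performs the same single merge as the two-call version; each additional value adds one more merge step.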

Code for the example above:

# build the data frame from your example
# (the dates are quoted; unquoted, 2012-11-09 is evaluated as arithmetic and gives 1992)
mydf <- data.frame(user=c(100,100,101),
                   attr=c("a","b","a"),
                   val =c(10, 20, 11),
                   date=c("2012-11-09","2012-11-08","2012-11-09")
                  )

Update:

Looking at ?merge, it has a suffixes argument.
Using suffixes=c("_a", "_b") works well.

    merge(mydf[mydf$attr=="a", ], mydf[mydf$attr=="b", ],
           by="user", suffixes=c("_a", "_b"), all=TRUE)

# OUTPUT
  user attr_a val_a     date_a attr_b val_b     date_b
1  100      a    10 2012-11-09      b    20 2012-11-08
2  101      a    11 2012-11-09   <NA>    NA       <NA>
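
As a possible alternative that the answer above does not use, base R's reshape() can produce a similar wide layout in one call. This is a sketch, assuming the question's data with quoted dates. Because attr is consumed as the suffix source (timevar), the result has no attr_a or attr_b columns, so it is close to the requested output but not identical to it.

```r
# Example data from the question (dates quoted so they stay strings)
mydf <- data.frame(user = c(100, 100, 101),
                   attr = c("a", "b", "a"),
                   val  = c(10, 20, 11),
                   date = c("2012-11-09", "2012-11-08", "2012-11-09"),
                   stringsAsFactors = FALSE)

# One row per user; val and date are spread into one column per attr value.
# sep = "_" gives names such as val_a instead of the default val.a
wide <- reshape(mydf, idvar = "user", timevar = "attr",
                direction = "wide", sep = "_")
print(wide)
```

If the attr_a and attr_b columns themselves are needed, the merge-based approaches above keep them.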
answered 2012-11-10T01:24:40.330

Try

merge( df[ df$attr == "a", ], df[ df$attr == "b", ], by= "user" )
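
Note that this call relies on merge's defaults: all=FALSE keeps only users that have both an "a" row and a "b" row, and overlapping column names get the suffixes .x and .y. The sketch below, which assumes df holds the question's data with quoted dates, compares that with a full outer merge using explicit suffixes. The names inner_res and outer_res are illustrative only.

```r
# Example data from the question, stored as df to match the call above
df <- data.frame(user = c(100, 100, 101),
                 attr = c("a", "b", "a"),
                 val  = c(10, 20, 11),
                 date = c("2012-11-09", "2012-11-08", "2012-11-09"),
                 stringsAsFactors = FALSE)

# Defaults: inner join, so user 101 (no "b" row) is dropped
inner_res <- merge(df[df$attr == "a", ], df[df$attr == "b", ], by = "user")

# all = TRUE keeps user 101 and fills the missing "b" columns with NA
outer_res <- merge(df[df$attr == "a", ], df[df$attr == "b", ],
                   by = "user", all = TRUE, suffixes = c("_a", "_b"))

print(inner_res)
print(outer_res)
```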
answered 2012-11-09T19:21:08.150