
I have a data frame with character columns, say tdf <- data.frame(words=letters[1:4], words2=letters[5:8], word3=letters[9:12])

I also have a corresponding vector giving, for each row, the number of the last column to include when combining the words, say tcol <- c(3, 1, 1, 2)

So, for example, the output for the fourth row should be "d h".

I wrote a function that merges a single row:

xyp <- function(x, y) do.call(paste, as.list(x[1:y]))

which works as expected in a for loop:

> y <- character(0)
> for (x in 1:nrow(tdf)) y <- c(y, xyp(tdf[x, ], tcol[x]))
> y
[1] "a e i" "b"     "c"     "d h"  

I'd like to apply the function across the data frame without using a for loop, but the function above doesn't seem to work for this purpose:

> mapply(xyp, tdf, tcol)
  words  words2   word3    <NA> 
"a b c"     "e"     "i"   "a b" 
Warning message:
In mapply(xyp, tdf, tcol) :
  longer argument not a multiple of length of shorter

I think I understand the error, but am not sure what I can do to fix this. Any suggestions?


1 Answer


How about

mapply(function(x, i) paste(x[1:i], collapse=" "), 
    split(as.matrix(tdf),row(tdf)), 
    tcol)

Here we use split() to slice the data.frame into a list of rows, rather than the list of columns you normally get when working with a data.frame.
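
A self-contained sketch of the same idea, using the tdf and tcol from the question. The names rows and res are introduced here only for illustration. It also shows where the warning in the question likely comes from: a data.frame is a list of its 3 columns, so mapply(xyp, tdf, tcol) pairs columns (not rows) with the 4 elements of tcol and recycles.

```r
# Data from the question
tdf <- data.frame(words = letters[1:4], words2 = letters[5:8], word3 = letters[9:12])
tcol <- c(3, 1, 1, 2)

# A data.frame is a list of columns: mapply() would see 3 elements here,
# while tcol has 4, hence the recycling warning in the question
length(tdf)

# Split the character matrix by row index to get a list of rows instead
# (rows is an illustrative name, not part of the original answer)
rows <- split(as.matrix(tdf), row(tdf))

# For each row, paste together the first i words
res <- mapply(function(x, i) paste(x[1:i], collapse = " "), rows, tcol)

# mapply() names the result after the list elements ("1".."4"); drop them
unname(res)
# [1] "a e i" "b"     "c"     "d h"
```

If tcol could ever contain 0, x[seq_len(i)] would be safer than x[1:i], since 1:0 counts down to c(1, 0) instead of giving an empty index.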

answered 2015-01-19T03:53:25.587