r - Apply function based upon column index to data frame

Question

I am trying to apply a certain function to groups of columns from a data frame based upon a 'design' vector containing the column indices that are part of the same experimental design 'group' (i.e. replicates). My observations are the rows, my sampling points are the columns.
The design vector designates which columns should group together:

designvector <- c(rep(1,2), rep(2,3), rep(3,3), rep(4,2), rep(5,2), rep(6,2), 
                       rep(7,2), rep(8,2), rep(9,2))

A small example of the data frame to which I want to apply the function is:

structure(list(`1` = c(4381L, 608L, 7648L, 458L, 350L, 203L), 
`1` = c(6450L, 1389L, 4896L, 526L, 920L, 352L), `2` = c(1966L, 
59L, 492L, 5291L, 1401L, 133L), `2` = c(6338L, 281L, 2649L, 
4718L, 1281L, 377L), `2` = c(12399L, 578L, 3094L, 1787L, 
1180L, 541L), `3` = c(9629L, 554L, 7299L, 2819L, 1314L, 497L
), `3` = c(11329L, 709L, 3720L, 2909L, 1929L, 655L), `3` = c(11319L, 
535L, 5212L, 2191L, 1239L, 633L), `4` = c(7427L, 8637L, 894L, 
2L, 782L, 120L), `4` = c(6748L, 9139L, 431L, 28L, 871L, 224L
), `5` = c(7125L, 11819L, 1728L, 9L, 607L, 313L), `5` = c(8651L, 
11022L, 442L, 96L, 728L, 249L), `6` = c(17879L, 3402L, 319L, 
6L, 1226L, 489L), `6` = c(20859L, 2648L, 463L, 10L, 1189L, 
408L), `7` = c(13457L, 1124L, 9386L, 18L, 635L, 367L), `7` = c(16292L, 
1732L, 6552L, 20L, 1022L, 431L), `8` = c(9035L, 5887L, 185L, 
11L, 550L, 1814L), `8` = c(14831L, 5833L, 570L, 8L, 1089L, 
1462L), `9` = c(22023L, 2254L, 5212L, 63L, 555L, 1254L), 
`9` = c(16887L, 2491L, 4949L, 68L, 921L, 983L)), .Names = c("1", 
"1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "6", "6", 
"7", "7", "8", "8", "9", "9"), row.names = c(NA, 6L), class = "data.frame")

However, using ddply I get a result with warnings that I do not really understand: ddply(abmat.sum, .(designvector), mean) gives the following output:

designvector V1
1            1 NA
2            2 NA
3            3 NA
4            4 NA
5            5 NA
6            6 NA
7            7 NA
8            8 NA
9            9 NA
Warning messages:
1: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
4: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
5: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
6: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
7: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
8: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
9: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA

I am clueless as to what I am doing wrong here. Any suggestions using ddply, or methods other than for-looping over the data frame, are welcome.

score 1 · Accepted Answer

The problem is that abmat.sum is in the wrong form: it is "wide" rather than "long", and ddply requires long data. Use melt to fix this.

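library(reshape2)
# Reshape from wide to long: a 'variable' column (source column label)
# and a 'value' column (the measurement)
abmat.sum_long <- melt(abmat.sum)
# Convert the column labels in 'variable' to numeric group ids
abmat.sum_long$variable <- as.numeric(abmat.sum_long$variable)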
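
A caveat on the conversion above, assuming variable is a factor (which is what melt produces by default): as.numeric() on a factor returns the internal level codes, not the labels. For labels "1" to "9" in sorted order the two happen to coincide, but that is not true in general. A minimal sketch with made-up labels:

```r
# Hypothetical factor whose labels are not 1, 2, 3, ...
f <- factor(c("10", "20", "10"))

# Level codes: the position of each value in levels(f)
codes <- as.numeric(f)

# Actual labels: go through character first
labels <- as.numeric(as.character(f))
```

If the group labels were ever something other than consecutive integers starting at 1, the as.character() route would be the safer one.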

You also need to pass summarise to ddply.

library(plyr)
ddply(abmat.sum_long, .(variable), summarise, mean_value = mean(value))
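
As an alternative that skips the reshape step, here is a base R sketch that groups columns by position using the design vector. It is not part of the original answer; the data frame toy, the vector toy_design and the helper name cols_by_group are made up for illustration. Selecting by position may also be more predictable than selecting by name when column names repeat, as they do in the question's data.

```r
# Stand-in data: 3 observations (rows) x 5 sampling points (columns).
# Values and design vector are invented for this example.
toy <- data.frame(a = c(1, 2, 3), b = c(3, 4, 5), c = c(10, 20, 30),
                  d = c(20, 30, 40), e = c(30, 40, 50))
toy_design <- c(1, 1, 2, 2, 2)

# Column positions belonging to each design group
cols_by_group <- split(seq_along(toy_design), toy_design)

# Mean per observation within each group (rows x groups matrix)
row_means <- sapply(cols_by_group, function(idx) {
  rowMeans(toy[, idx, drop = FALSE])
})

# One pooled mean per group, over all rows and replicates
group_means <- sapply(cols_by_group, function(idx) {
  mean(as.matrix(toy[, idx, drop = FALSE]))
})
```

With the question's data, the same pattern would presumably use abmat.sum and designvector in place of toy and toy_design. group_means corresponds to what the ddply call computes (one mean per group), while row_means keeps the observations separate.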
