I am trying to apply a function to groups of columns of a data frame, based on a 'design' vector that assigns each column to an experimental design 'group' (i.e. replicates). My observations are the rows and my sampling points are the columns.
The design vector designates which columns should group together:
designvector <- c(rep(1,2), rep(2,3), rep(3,3), rep(4,2), rep(5,2), rep(6,2),
rep(7,2), rep(8,2), rep(9,2))
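To make the grouping explicit, here is a small sketch of how the design vector maps column positions to groups. It is for illustration only, and the name `groups` is just a placeholder that is not used elsewhere in my code:

```r
# Design vector from above: one group label per column (20 columns in total).
designvector <- c(rep(1, 2), rep(2, 3), rep(3, 3), rep(4, 2), rep(5, 2),
                  rep(6, 2), rep(7, 2), rep(8, 2), rep(9, 2))

# Column positions belonging to each design group, as a named list.
groups <- split(seq_along(designvector), designvector)

length(designvector)  # 20
groups[["2"]]         # 3 4 5
```

So columns 1-2 form group 1, columns 3-5 form group 2, and so on, and the function should be applied to each such block of columns.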
A small example of the data frame (abmat.sum in the call further down) to which I want to apply the function is:
structure(list(`1` = c(4381L, 608L, 7648L, 458L, 350L, 203L),
`1` = c(6450L, 1389L, 4896L, 526L, 920L, 352L), `2` = c(1966L,
59L, 492L, 5291L, 1401L, 133L), `2` = c(6338L, 281L, 2649L,
4718L, 1281L, 377L), `2` = c(12399L, 578L, 3094L, 1787L,
1180L, 541L), `3` = c(9629L, 554L, 7299L, 2819L, 1314L, 497L
), `3` = c(11329L, 709L, 3720L, 2909L, 1929L, 655L), `3` = c(11319L,
535L, 5212L, 2191L, 1239L, 633L), `4` = c(7427L, 8637L, 894L,
2L, 782L, 120L), `4` = c(6748L, 9139L, 431L, 28L, 871L, 224L
), `5` = c(7125L, 11819L, 1728L, 9L, 607L, 313L), `5` = c(8651L,
11022L, 442L, 96L, 728L, 249L), `6` = c(17879L, 3402L, 319L,
6L, 1226L, 489L), `6` = c(20859L, 2648L, 463L, 10L, 1189L,
408L), `7` = c(13457L, 1124L, 9386L, 18L, 635L, 367L), `7` = c(16292L,
1732L, 6552L, 20L, 1022L, 431L), `8` = c(9035L, 5887L, 185L,
11L, 550L, 1814L), `8` = c(14831L, 5833L, 570L, 8L, 1089L,
1462L), `9` = c(22023L, 2254L, 5212L, 63L, 555L, 1254L),
`9` = c(16887L, 2491L, 4949L, 68L, 921L, 983L)), .Names = c("1",
"1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "6", "6",
"7", "7", "8", "8", "9", "9"), row.names = c(NA, 6L), class = "data.frame")
However, when I use ddply I get output and warnings that I do not really understand. The call
ddply(abmat.sum,.(designvector),mean)
gives the following output:
designvector V1
1 1 NA
2 2 NA
3 3 NA
4 4 NA
5 5 NA
6 6 NA
7 7 NA
8 8 NA
9 9 NA
Warning messages:
1: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
2: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
3: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
4: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
5: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
6: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
7: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
8: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
9: In mean.default(piece, ...) :
argument is not numeric or logical: returning NA
I am clueless as to what I am doing wrong here. Any suggestions using ddply, or methods other than for-looping over the data frame, are welcome.