r - daply: correct results, but confusing structure

Question

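I have a data.frame mydf that contains data from 27 subjects. There are two predictor variables, congruent (2 levels) and offset (5 levels), so there are 10 conditions in total. Each of the 27 subjects was tested 20 times under each condition, giving a total of 10*27*20 = 5400 observations. RT is the response variable. The structure looks like this: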

> str(mydf)
'data.frame':   5400 obs. of  4 variables:
 $ subject  : Factor w/ 27 levels "1","2","3","5",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ congruent: logi  TRUE FALSE FALSE TRUE FALSE TRUE ...
 $ offset   : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 5 5 1 2 5 5 2 2 3 5 ...
 $ RT       : int  330 343 457 436 302 311 595 330 338 374 ...

I used daply() to calculate the mean RT of each subject in each of the 10 conditions:

myarray <- daply(mydf, .(subject, congruent, offset), summarize, mean = mean(RT))

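The result looks exactly the way I want it, i.e. a 3D array: 5 tables, so to speak (one per offset condition), each showing the mean for every subject in the congruent=FALSE vs. congruent=TRUE condition.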

However, if I check the structure of myarray, I get confusing output:

List of 270
 $ : num 417
 $ : num 393
 $ : num 364
 $ : num 399
 $ : num 374
 ... 
 # and so on
 ...
 [list output truncated]
 - attr(*, "dim")= int [1:3] 27 2 5
 - attr(*, "dimnames")=List of 3
  ..$ subject  : chr [1:27] "1" "2" "3" "5" ...
  ..$ congruent: chr [1:2] "FALSE" "TRUE"
  ..$ offset   : chr [1:5] "1" "2" "3" "4" ...

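This looks totally different from the structure of the prototypical ozone array from the plyr package, even though the format is very similar (3 dimensions, numeric values only).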

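I want to process this array further with aaply. Specifically, I want to calculate the difference between the congruent and the incongruent means for each subject and offset.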

However, even the most basic application of aaply(), such as aaply(myarray,2,mean), returns nonsensical output:

FALSE  TRUE 
   NA    NA 
Warning messages:
1: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA

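I have no idea why the daply() function returns such oddly structured output, which prevents any further use of aaply. Any kind of help would be much appreciated; I frankly admit that I have hardly any experience with the plyr package.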

score 1 · Accepted Answer

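Since you didn't include your data it's hard to know for sure, but I tried to build a dummy data set based on your str() output. You can do this with ddply: first compute the means, then the difference between the means.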

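library(plyr)

# Make dummy data (RT is random, so the numbers below change from run to run)
mydf <- data.frame(subject = rep(1:5, each = 150), 
  congruent = rep(c(TRUE, FALSE), each = 75), 
  offset = rep(1:5, each = 15), RT = sample(300:500, 750, replace = TRUE))

# Make means: one row per subject x congruent x offset cell
mydf.mean <- ddply(mydf, .(subject, congruent, offset), summarise, mean.RT = mean(RT))

# Calculate difference between congruent and incongruent
# (within each group the FALSE row comes before the TRUE row, so diff() is TRUE minus FALSE)
mydf.diff <- ddply(mydf.mean, .(subject, offset), summarise, diff.mean = diff(mean.RT))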
head(mydf.diff)
#   subject offset  diff.mean
# 1       1      1  39.133333
# 2       1      2   9.200000
# 3       1      3  20.933333
# 4       1      4  -1.533333
# 5       1      5 -34.266667
# 6       2      1  -2.800000
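
If you would rather keep the 3D array from the question, the sketch below shows one way that should work. As far as I can tell, the list structure comes from summarize returning a one-row data frame for each cell, which daply() cannot simplify to a numeric array; returning a bare number per cell avoids that. The anonymous function, the set.seed() call and the names rt_diff and congruent_means are my own additions for illustration, not part of the question.

```r
library(plyr)

# Dummy data in the same shape as above (values are random).
set.seed(1)  # assumption: seed added only to make this sketch reproducible
mydf <- data.frame(
  subject   = rep(1:5, each = 150),
  congruent = rep(c(TRUE, FALSE), each = 75),
  offset    = rep(1:5, each = 15),
  RT        = sample(300:500, 750, replace = TRUE)
)

# Return a bare number per cell instead of a one-row data frame from summarize.
myarray <- daply(mydf, .(subject, congruent, offset), function(d) mean(d$RT))

# Inspect the structure; this should now be a numeric array rather than a list.
str(myarray)

# Congruent minus incongruent mean RT, per subject (rows) and offset (columns).
rt_diff <- myarray[, "TRUE", ] - myarray[, "FALSE", ]

# aaply() can now operate on the array, e.g. the mean over each congruent level.
congruent_means <- aaply(myarray, 2, mean)
```

Plain indexing on the congruent dimension is enough for the difference here, so aaply() is only needed for summaries that cannot be written as array arithmetic.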
