I have a question about calculating a mean for each subject.

I have a data frame like this:

   subj entropy n_gambles trial response   rt
1     0    high         2     0   sample 4205
2     0    high         2     0   sample  676
3     0    high         2     0     skip    0
4     0    high         2     1   sample  883
5     0    high         2     1   sample  697
6     0    high         2     1     skip    0
7     0    high         2     2   sample 1493
8     0    high         2     2   sample  507
9     0    high         2     2     skip    0
10    0    high         2     3   sample 1016

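I want to work out the mean number of samples for each subject.

I have got as far as the table below, but I don't know what code comes next.

Note: the proportion of samples differs for each subject.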

     subj trial n_gambles entropy response n_sample
2497    0     0         2    high   sample        2
2498    1     0         2    high   sample        0
2499    2     0         2    high   sample        0
2500    3     0         2    high   sample        0
2501    4     0         2    high   sample       27
2502    5     0         2    high   sample        0
2503    6     0         2    high   sample        0
2504    7     0         2    high   sample        0
2505    8     0         2    high   sample       19
2506    9     0         2    high   sample        0
2507   10     0         2    high   sample        0

Here is my code so far.

rm(list=ls())

# Import the 'subj.csv' data file into a data frame
data_subj <- read.csv('subj.csv')
head(data_subj)

# Import the 'response.csv' data file into a data frame
data_response <- read.csv('response.csv')
head(data_response)

# Merge the subject and response data on 'subj'
data <- merge(data_subj, data_response, by = 'subj')
head(data)

# Count the rows for every combination of subj, trial, n_gambles, entropy and response
data <- as.data.frame(table(data$subj, data$trial, data$n_gambles, data$entropy, data$response))
colnames(data) <- c('subj', 'trial', 'n_gambles', 'entropy', 'response', 'n_sample')

# Keep only the counts for the "sample" response
data <- data[data$response == "sample", ]
head(data)

Can anyone help me?

I would like the output to look like this:

subj trial n_gambles entropy response n_sample  mean_sample/trials
  0     0         2    high   sample        2             
  1     0         2    high   sample        0
  2     0         2    high   sample        0
  3     0         2    high   sample        0
1 Answer

This is similar to the answer to your earlier question:

library(plyr)

# Per subject: count the "sample" responses and divide by 6, the divisor used here
ddply(df, .(subj), summarize, mymean = length(which(response == "sample")) / 6)
  subj   mymean
1    0 1.166667
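
The call above hard-codes the divisor 6. If what you want is the number of "sample" responses per distinct trial, the divisor can be taken from the data instead. Here is a rough base R sketch of that idea, using only the ten rows shown in the question. The interpretation (samples divided by distinct trials) is my assumption, and the names `per_subj`, `n_trials` and `mean_sample` are made up for illustration.

```r
# Toy data: subj, trial and response copied from the ten rows in the question
df <- data.frame(
  subj     = rep(0, 10),
  trial    = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 3),
  response = c("sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample"),
  stringsAsFactors = FALSE
)

# One row per subject: "sample" count, distinct trial count, and their ratio
# (assumed definition of the mean; adjust if your design differs)
per_subj <- do.call(rbind, lapply(split(df, df$subj), function(d) {
  n_sample <- sum(d$response == "sample")
  n_trials <- length(unique(d$trial))
  data.frame(
    subj        = d$subj[1],
    n_sample    = n_sample,
    n_trials    = n_trials,
    mean_sample = n_sample / n_trials
  )
}))

# Attach the per-subject mean to every row, as in the desired output layout
result <- merge(df, per_subj[, c("subj", "mean_sample")], by = "subj")
```

With these ten rows there are 7 "sample" responses over 4 distinct trials, so `mean_sample` comes out as 1.75 for subject 0. The `merge` at the end repeats that value on every row of the subject, which seems to be what the last column of your desired output is after.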
answered 2013-09-04T21:23:07.283