Here is the data.

set.seed(23)
data <- data.frame(ID = 1:12, group = rep(1:3, times = 4), value = rnorm(12, mean = 0.5, sd = 0.3))

   ID group     value
1   1     1 0.4133934
2   2     2 0.6444651
3   3     3 0.1350871
4   4     1 0.5924411
5   5     2 0.3439465
6   6     3 0.3673059
7   7     1 0.3202062
8   8     2 0.8883733
9   9     3 0.7506174
10 10     1 0.3301955
11 11     2 0.7365258
12 12     3 0.1502212

I want to get z-standardized scores within each group. So I tried:

library(weights)
data_split<-split(data, data$group) #split the dataframe
stan<-lapply(data_split, function(x) stdz(x$value)) #compute z-scores within group

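However, this doesn't look right, because I want to add the result as a new variable after "value". How can I do that? Please provide some suggestions (example code). Any help is greatly appreciated.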

4 Answers

1

I tried Ferdinand.Kraft's solution, but it didn't work for me: the stdz function isn't part of base R (it comes from the weights package). Moreover, the within part gave me trouble on a large dataset with many variables. I think the easiest way is:

data$value.s <- ave(data$value, data$group, FUN=scale)
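
As a quick sanity check of the line above, here is a minimal sketch that rebuilds the question's data and confirms that each group ends up with mean 0 and standard deviation 1. The names group_means and group_sds are only illustrative.

```r
# Rebuild the example data from the question.
set.seed(23)
data <- data.frame(ID = 1:12,
                   group = rep(1:3, times = 4),
                   value = rnorm(12, mean = 0.5, sd = 0.3))

# scale() returns a one-column matrix per group; ave() writes those values
# back into a plain numeric vector aligned with the original rows.
data$value.s <- ave(data$value, data$group, FUN = scale)

# Per-group mean should be (numerically) 0 and per-group sd should be 1.
group_means <- tapply(data$value.s, data$group, mean)
group_sds <- tapply(data$value.s, data$group, sd)
print(round(group_means, 10))
print(group_sds)
```

The new column is appended after value, and the row order of data is left untouched.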
answered 2013-09-23T08:28:26.433
1

Use this instead:

within(data, stan <- ave(value, group, FUN=stdz))

There is no need to call split or lapply.
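
One thing to keep in mind is that within() returns a modified copy rather than changing data in place, so the result has to be assigned to keep the new column. A minimal sketch, assuming a hypothetical helper zscore() as a stand-in for stdz so that it runs without the weights package:

```r
# Rebuild the example data from the question.
set.seed(23)
data <- data.frame(ID = 1:12,
                   group = rep(1:3, times = 4),
                   value = rnorm(12, mean = 0.5, sd = 0.3))

# Hypothetical helper (not from any package): plain z-score of a numeric vector.
zscore <- function(x) (x - mean(x)) / sd(x)

# within() evaluates the expression inside a copy of `data` and returns that copy.
data_z <- within(data, stan <- ave(value, group, FUN = zscore))

print(names(data))    # the original is unchanged
print(names(data_z))  # the copy carries the new "stan" column after "value"
```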

answered 2013-09-15T02:01:55.717
1

One way, using the data.table package:

library(data.table)
library(weights)

set.seed(23)
data <- data.table(ID=rep(1:12), group=rep(1:3,times=4), value=(rnorm(12,mean=0.5, sd=0.3)))
setkey(data, ID)
dataNew <- data[, list(ID, stan = stdz(value)), by = 'group']

The result is:

    group ID       stan
 1:     1  1 -0.6159312
 2:     1  4  0.9538398
 3:     1  7 -1.0782747
 4:     1 10  0.7403661
 5:     2  2 -1.2683237
 6:     2  5  0.7839781
 7:     2  8  0.8163844
 8:     2 11 -0.3320388
 9:     3  3  0.6698418
10:     3  6  0.8674548
11:     3  9 -0.2131335
12:     3 12 -1.3241632
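
If the goal is to keep all original columns and simply append the z-score after value, data.table's := operator may be a closer fit. This is a sketch under assumptions: the z-score is written out by hand instead of calling stdz, and the table name dt is only illustrative.

```r
library(data.table)

# Same example data as above, built directly as a data.table.
set.seed(23)
dt <- data.table(ID = 1:12,
                 group = rep(1:3, times = 4),
                 value = rnorm(12, mean = 0.5, sd = 0.3))

# := adds the column by reference; `by = group` makes mean() and sd()
# operate within each group. Row order and existing columns are kept.
dt[, stan := (value - mean(value)) / sd(value), by = group]

print(dt)
```

Because := modifies dt in place, no reassignment is needed, which can matter for large tables.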
answered 2013-09-15T02:28:30.187
0

Add the new column inside the function, and have the function return the whole data frame.

stanL <- lapply(data_split, function(x) {
  x$stan <- stdz(x$value)  # z-scores within this group
  x                        # return the whole data frame
})

stan <- do.call(rbind, stanL)
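
Note that do.call(rbind, ...) stacks the pieces group by group, so the rows come back sorted by group rather than in their original order. If the original order matters, base R's unsplit() can put them back. A minimal sketch, assuming a hypothetical zscore() helper in place of stdz so that it runs without the weights package:

```r
# Rebuild the example data from the question.
set.seed(23)
data <- data.frame(ID = 1:12,
                   group = rep(1:3, times = 4),
                   value = rnorm(12, mean = 0.5, sd = 0.3))

# Hypothetical helper (not from any package): plain z-score of a numeric vector.
zscore <- function(x) (x - mean(x)) / sd(x)

data_split <- split(data, data$group)
stanL <- lapply(data_split, function(x) {
  x$stan <- zscore(x$value)
  x
})

stan_grouped  <- do.call(rbind, stanL)       # rows ordered by group
stan_original <- unsplit(stanL, data$group)  # rows back in the original order

print(stan_grouped$ID)
print(stan_original$ID)
```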
answered 2013-09-15T02:04:16.663