
I have a sample dataframe:

 data<-data.frame(a=c(1,2,3),b=c(4,5,5),c=c(6,8,7),d=c(8,9,10))

I wish to calculate the z-scores for every row in the data frame, so I ran:

 scores<-apply(data,1,zscore)

I used the zscore function from the R.basic package, installed with:

install.packages(c("R.basic"), contriburl="http://www.braju.com/R/repos/")

and obtained this:

 row.names     V1            V2          V3
    a      -1.2558275   -1.2649111  -1.0883839
    b      -0.2511655   -0.3162278  -0.4186092
    c       0.4186092    0.6324555   0.2511655
    d       1.0883839    0.9486833   1.2558275

But when I try manually calculating the z score for the first row of the data frame I obtain the following values:

      -1.45  -0.29  0.4844  1.25

Manually, for the first row, I calculated as follows:

1) Calculate the row mean (4.75) for the first row.

2) Subtract each value from the row mean (e.g. 4.75 - 1, 4.75 - 4, 4.75 - 6, 4.75 - 8).

3) Square each difference.

4) Add them up and divide by the number of samples in row 1 (4).

5) This gives the variance (6.685); its square root is the standard deviation (2.58) of the first row alone.

6) Then apply the z-score formula to each value.
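The steps above can be sketched in R as follows. This is only an illustration of the manual calculation, not part of the original code; the variable names (`row_mean`, `pop_var`, and so on) are made up for the example, and the unrounded results may differ slightly from the hand-rounded values quoted above.

```r
# First row of the sample data frame: a = 1, b = 4, c = 6, d = 8
x <- c(1, 4, 6, 8)

row_mean <- mean(x)                    # step 1: row mean (4.75)
sq_diff  <- (row_mean - x)^2           # steps 2 and 3: squared differences
pop_var  <- sum(sq_diff) / length(x)   # step 4: divide by the number of samples (n)
pop_sd   <- sqrt(pop_var)              # step 5: standard deviation
z_manual <- (x - row_mean) / pop_sd    # step 6: z-score formula

print(pop_var)
print(round(z_manual, 2))
```

Note that step 4 divides by the number of samples n, not by n - 1.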


1 Answer


Whatever the zscore function is, it appears to do the same thing as the scale function in the base package:

apply(data, 1, scale)
##            [,1]       [,2]       [,3]
## [1,] -1.2558275 -1.2649111 -1.0883839
## [2,] -0.2511655 -0.3162278 -0.4186092
## [3,]  0.4186092  0.6324555  0.2511655
## [4,]  1.0883839  0.9486833  1.2558275

For each column, it computes (x - mean(x)) / sd(x).
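One likely reason the manual values differ: R's `sd()` computes the sample standard deviation, dividing by n - 1, whereas the manual calculation in the question divides by n. The sketch below compares the two for the first row. The names `z_sample`, `z_pop`, and `z_rows_pop` are illustrative only, and the last line is just one possible way to get row-wise z-scores with the n denominator, if that is what is wanted.

```r
data <- data.frame(a = c(1, 2, 3), b = c(4, 5, 5), c = c(6, 8, 7), d = c(8, 9, 10))

x <- unlist(data[1, ])   # first row as a named numeric vector: 1, 4, 6, 8
n <- length(x)

# Sample standard deviation (divides by n - 1), as used by sd() and scale()
z_sample <- (x - mean(x)) / sd(x)

# Population standard deviation (divides by n), as in the manual calculation
pop_sd <- sqrt(sum((x - mean(x))^2) / n)
z_pop  <- (x - mean(x)) / pop_sd

print(round(z_sample, 7))
print(round(z_pop, 2))

# Row-wise z-scores using the n denominator; t() restores one row per data row
z_rows_pop <- t(apply(data, 1, function(r) (r - mean(r)) / sqrt(mean((r - mean(r))^2))))
print(z_rows_pop)
```

With the n - 1 denominator the first row reproduces the first column of the `apply(data, 1, scale)` output; with the n denominator it gives values close to the manual ones.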

Answered 2013-10-15T15:11:58.720