I'm having trouble averaging across a multi-dimensional array. For example, I have a three-dimensional 4*4*3 array x:

x
 , , 1

     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]  0.5   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA

, , 2

     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]  0.7   NA   NA   NA
[3,]  0.4   NA   NA   NA
[4,]   NA   NA   NA   NA

, , 3

     [,1] [,2] [,3] [,4]
[1,]   NA   NA  0.8   NA
[2,]   NA   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA

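What I want is the sum along the third dimension with the NAs ignored, divided by the number of non-NA elements, i.e. the mean of the non-NA values.

Basically, the result should look like this: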

     [,1] [,2] [,3] [,4]
[1,]  0.0    0  0.8    0
[2,]  0.6    0  0.0    0
[3,]  0.4    0  0.0    0
[4,]  0.0    0  0.0    0

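In MATLAB I do this with nansum(x, 3)./sum(~isnan(x), 3). In R I have tried a lot of things, such as apply(x, 3, sum, na.rm = T) or Reduce, trying first to get the intermediate result (the sum with the NAs ignored):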

     [,1] [,2] [,3] [,4]

[1,]     0   0  0.8   0
[2,]   1.2   0    0   0
[3,]   0.4   0    0   0
[4,]     0   0    0   0

But I still haven't got it to work. Can anyone help?

3 Answers

You are on the right track with apply and na.rm=TRUE. You just need to specify the multiple dimensions to apply over, using the argument MARGIN=c(..., ...).

Here is an example using the built-in dataset Titanic:

str(Titanic)
 table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Male" "Female"
  ..$ Age     : chr [1:2] "Child" "Adult"
  ..$ Survived: chr [1:2] "No" "Yes"

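Now sum within each combination of the 3rd and 4th dimensions (Age and Survived), which collapses the 1st and 2nd (Class and Sex):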

apply(Titanic, c(3, 4), sum, na.rm=TRUE)
       Survived
Age       No Yes
  Child   52  57
  Adult 1438 654
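
A small sketch of how MARGIN behaves, since it is easy to read it the wrong way round: the dimensions listed in MARGIN are the ones that are kept, and everything else is summed away. The variable names below are only illustrative, and margin.table is used purely as a cross-check.

```r
# MARGIN lists the dimensions that survive; the others are collapsed by sum().
by_age_survived <- apply(Titanic, c(3, 4), sum, na.rm = TRUE)  # keeps Age x Survived
by_class_sex    <- apply(Titanic, c(1, 2), sum, na.rm = TRUE)  # keeps Class x Sex

dim(by_age_survived)  # 2 x 2
dim(by_class_sex)     # 4 x 2

# Cross-check against base R's marginal-table helper.
all(by_age_survived == margin.table(Titanic, c(3, 4)))
```

For a 4*4*3 array like the one in the question, keeping dimensions 1 and 2 (MARGIN=c(1, 2)) would therefore collapse the third dimension.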
answered 2012-07-23T08:36:36.847

Maybe something like this:

apply(x, c(1,2), sum, na.rm=TRUE)

Note that this is untested, for lack of a reproducible dataset.
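
For reference, here is a sketch that rebuilds the array from the values printed in the question and mirrors the MATLAB expression nansum(x, 3)./sum(~isnan(x), 3) term by term. The names total, n_obs and avg are just illustrative, and turning the empty cells into 0 is an assumption based on the desired output shown in the question.

```r
# Rebuild the 4 x 4 x 3 array from the question (values copied from the post).
x <- array(NA_real_, dim = c(4, 4, 3))
x[2, 1, 1]   <- 0.5
x[2:3, 1, 2] <- c(0.7, 0.4)
x[1, 3, 3]   <- 0.8

# Sum over the 3rd dimension, ignoring NA (MATLAB: nansum(x, 3)).
total <- apply(x, c(1, 2), sum, na.rm = TRUE)

# Count the non-NA entries in each cell (MATLAB: sum(~isnan(x), 3)).
n_obs <- apply(!is.na(x), c(1, 2), sum)

# Element-wise division; cells with no data at all give 0/0 = NaN,
# so set those to 0 to match the output asked for in the question.
avg <- total / n_obs
avg[n_obs == 0] <- 0
avg
```

Keeping n_obs around also makes it possible to tell a true 0 mean apart from a cell that had no data at all.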

answered 2012-07-23T08:32:51.710

Maybe this could be useful:

 # Creating your array, I know this is an ugly way to do it :D
 Array <- array(NA_real_, dim=c(4,4,3))
 Array[2,1,1] <- 0.5
 Array[2:3,1,2] <- c(0.7,0.4)
 Array[1,3,3] <- 0.8
 Array # this is your array (Array is not a very original name)
, , 1

     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]  0.5   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA

, , 2

     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]  0.7   NA   NA   NA
[3,]  0.4   NA   NA   NA
[4,]   NA   NA   NA   NA

, , 3

     [,1] [,2] [,3] [,4]
[1,]   NA   NA  0.8   NA
[2,]   NA   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA


 # one way to get what you want could be...
 (result <- apply(Array, c(1,2), mean, na.rm=TRUE))
     [,1] [,2] [,3] [,4]
[1,]  NaN  NaN  0.8  NaN
[2,]  0.6  NaN  NaN  NaN
[3,]  0.4  NaN  NaN  NaN
[4,]  NaN  NaN  NaN  NaN

 # if you want zeroes instead of NaN as your desired output example shows...
 result[is.nan(result)] <- 0

 result
     [,1] [,2] [,3] [,4]
[1,]  0.0    0  0.8    0
[2,]  0.6    0  0.0    0
[3,]  0.4    0  0.0    0
[4,]  0.0    0  0.0    0
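
As a possible alternative to apply, base R's rowMeans takes a dims argument, which should give the same result without calling mean once per cell. This is only a sketch: the array is rebuilt here so the snippet runs on its own, and result2 is a made-up name.

```r
# Rebuild the array from the question so this snippet is self-contained.
Array <- array(NA_real_, dim = c(4, 4, 3))
Array[2, 1, 1]   <- 0.5
Array[2:3, 1, 2] <- c(0.7, 0.4)
Array[1, 3, 3]   <- 0.8

# dims = 2 treats the first two dimensions as "rows", so the mean is
# taken over the remaining (third) dimension. All-NA cells become NaN.
result2 <- rowMeans(Array, na.rm = TRUE, dims = 2)

# Same NaN -> 0 replacement as above.
result2[is.nan(result2)] <- 0
result2
```

On an array this small the difference does not matter; the rowMeans form is mainly worth considering for large arrays.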
answered 2012-07-23T11:16:19.017