r - Weighting maximum abundance by number of samples

Question

I have a data set containing the abundance of an organism and the percentage mud content of the sediment in which it was found.

I then divided the mud content data into 10 bins (i.e. 0 - 10%, 10.1 - 20%, etc.) and placed the abundance data into the corresponding bins.
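For illustration, the binning step could be sketched with cut(). The mud vector below is hypothetical (the raw mud percentages are not shown here), and the breaks assume ten equal-width bins:

```r
# Hypothetical mud percentages; the real values are not part of this post.
mud <- c(0, 4.2, 10, 10.1, 35.5, 99.9, 100)

# Ten equal-width bins: [0,10], (10,20], ..., (90,100].
# include.lowest = TRUE keeps a value of exactly 0 in the first bin.
breaks  <- seq(0, 100, by = 10)
mud_bin <- cut(mud, breaks = breaks, include.lowest = TRUE)

# Number of samples that fall into each bin
table(mud_bin)
```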

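The main aim is to plot the maximum abundance in each mud bin along the mud gradient (i.e. 0 - 100%), with these maxima weighted by the number of samples in each bin.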

So, my question is: how can I weight the maximum abundance in a given mud bin by the number of samples in that bin?

Here is a simple subset of my data:

Mud % bins: |      0 - 9      |  9.1 - 18  |     18.1 - 27      |
Abundance:  | 10,10,2,2,2,1,1 | 15,15,15,2 | 20,20,20,1,1,1,1,1 |

score 1 · Accepted Answer

You can use ddply from the plyr package for this. In the following code, mydata is your example data and mwtdabundance is the weighted abundance = (max of a bin * number of observations in that bin) / total number of observations:

mydata<-structure(list(id = 1:19, bin = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0-9", 
"18.1-27", "9.1-18"), class = "factor"), abundance = c(10L, 10L, 
2L, 2L, 2L, 1L, 1L, 15L, 15L, 15L, 2L, 20L, 20L, 20L, 1L, 1L, 
1L, 1L, 1L)), .Names = c("id", "bin", "abundance"), class = "data.frame", row.names = c(NA, 
-19L))
> mydata
   id     bin abundance
1   1     0-9        10
2   2     0-9        10
3   3     0-9         2
4   4     0-9         2
5   5     0-9         2
6   6     0-9         1
7   7     0-9         1
8   8  9.1-18        15
9   9  9.1-18        15
10 10  9.1-18        15
11 11  9.1-18         2
12 12 18.1-27        20
13 13 18.1-27        20
14 14 18.1-27        20
15 15 18.1-27         1
16 16 18.1-27         1
17 17 18.1-27         1
18 18 18.1-27         1
19 19 18.1-27         1


> library(plyr)
> ddply(mydata, .(bin), summarize, max.abundance = max(abundance), freq = length(bin), mwtdabundance = max.abundance * freq / nrow(mydata))
      bin max.abundance freq mwtdabundance
1     0-9            10    7      3.684211
2 18.1-27            20    8      8.421053
3  9.1-18            15    4      3.157895
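As a cross-check, the same figures can be reproduced in base R without plyr. This is only a sketch of the formula above (bin maximum * bin sample count / total sample count), and the variable names are illustrative:

```r
# Same example data as above, as two plain vectors
abundance <- c(10, 10, 2, 2, 2, 1, 1, 15, 15, 15, 2, 20, 20, 20, 1, 1, 1, 1, 1)
bin       <- rep(c("0-9", "9.1-18", "18.1-27"), c(7, 4, 8))

bin_max <- tapply(abundance, bin, max)     # maximum abundance per bin
bin_n   <- tapply(abundance, bin, length)  # number of samples per bin

# Weighted maximum: bin maximum * bin sample count / total sample count
weighted <- bin_max * bin_n / length(abundance)
round(weighted, 6)
```

For the first bin this is 10 * 7 / 19, which matches the 3.684211 in the ddply output.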

score 0 · Accepted Answer

An aggregate solution:

If your data looks like this:

dat <- data.frame(
  bin=rep(c("0-9","9.1-18","18.1-27"),c(7,4,8)),
  abundance=c(10,10,2,2,2,1,1,15,15,15,2,20,20,20,1,1,1,1,1)
)

       bin abundance
1      0-9        10
...
8   9.1-18        15
...
12 18.1-27        20

Then:

aggregate(abundance ~ bin,data=dat,FUN=function(x) max(x) * length(x)/nrow(dat))

      bin abundance
1     0-9  3.684211
2 18.1-27  8.421053
3  9.1-18  3.157895
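Note that the bins come out in alphabetical order (18.1-27 before 9.1-18), which is probably not what you want when plotting along the mud gradient. One possible fix, sketched here with an explicit level order and illustrative axis labels, is to make bin a factor before aggregating:

```r
dat <- data.frame(
  bin = rep(c("0-9", "9.1-18", "18.1-27"), c(7, 4, 8)),
  abundance = c(10, 10, 2, 2, 2, 1, 1, 15, 15, 15, 2, 20, 20, 20, 1, 1, 1, 1, 1)
)

# Put the bins in gradient order instead of alphabetical order
dat$bin <- factor(dat$bin, levels = c("0-9", "9.1-18", "18.1-27"))

res <- aggregate(abundance ~ bin, data = dat,
                 FUN = function(x) max(x) * length(x) / nrow(dat))

# Bar plot of the weighted maxima along the mud gradient
barplot(res$abundance, names.arg = as.character(res$bin),
        xlab = "Mud content (%)", ylab = "Weighted maximum abundance")
```

aggregate orders its groups by the factor levels, so res is now sorted 0-9, 9.1-18, 18.1-27.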
