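I have a dataset that measures macroinvertebrate abundance at multiple sampling sites. I want to compare the sampling results from the most recent years with the results from all previous years at the same sites.
My data looks like this: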
# A tibble: 6 x 5
basin sitecode sampleid metric value
<fct> <chr> <int> <chr> <dbl>
1 arctic coast islands HUSK1 13482 s_abundance1 5312
2 arctic coast islands HUSK1 13482 s_abundance2 NA
3 arctic coast islands NOEL1 13488 s_abundance1 616
4 arctic coast islands NOEL1 13488 s_abundance2 NA
5 arctic coast islands RPR070 6815 s_abundance1 NA
6 arctic coast islands RPR070 6815 s_abundance2 697
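s_abundance1 is the site abundance from the most recent sampling, and s_abundance2 is the site abundance from the earlier sampling at the same site.
The full dataset has about 4,000 rows and contains sample data from many different basins.
I want to run a Mann-Whitney U test comparing s_abundance1 with s_abundance2, grouped by basin, with all results in a single output.
The code I have been using is: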
abund_results %>%
  group_by(basin) %>%
  summarise(tidy(wilcox.test(abund_results$value ~ abund_results$metric, data = .)))
It seems to work, except that all the p-values are exactly the same. Here is the output:
abund_results %>%
+ group_by(basin) %>%
+ summarise(tidy(wilcox.test(abund_results$value ~ abund_results$metric, data = .)))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 17 x 5
basin statistic p.value method alternative
<fct> <dbl> <dbl> <chr> <chr>
1 arctic coast islands 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
2 columbia 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
3 fraser lower mainla… 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
4 great lakes 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
5 lower mackenzie 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
6 lower saskatchewan-… 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
7 maritime coastal 181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
What do I need to change to get a different result for each basin?
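A minimal sketch of the likely cause and one possible fix, not run against the real data: `abund_results$value` and `abund_results$metric` always refer to the full data frame, regardless of grouping, so every basin ends up running the same test. The version below instead hands each group's rows to `wilcox.test()` through `dplyr::group_modify()`. The `toy` data frame and its values are made up for illustration; only the column names come from the data above.

```r
library(dplyr)
library(broom)

# Hypothetical toy data with the same columns as abund_results.
set.seed(1)
toy <- tibble(
  basin  = rep(c("basin_a", "basin_b"), each = 20),
  metric = rep(rep(c("s_abundance1", "s_abundance2"), each = 10), times = 2),
  value  = c(rpois(10, 50), rpois(10, 52), rpois(10, 50), rpois(10, 200))
)

per_basin <- toy %>%
  group_by(basin) %>%
  # .x holds only the rows of the current basin, so the formula uses
  # bare column names and data = .x instead of toy$value / toy$metric.
  # exact = FALSE requests the normal approximation (the data has ties).
  group_modify(~ tidy(wilcox.test(value ~ metric, data = .x, exact = FALSE))) %>%
  ungroup()

print(per_basin)
```

On the real data this would presumably become `abund_results %>% group_by(basin) %>% group_modify(...)`. It assumes every basin has non-missing values for both metrics; a basin with only one of them would make `wilcox.test()` fail for that group.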