
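In the sample data below, I want to find the difference in values for each parent across the two transplant sites, then divide it by the mean of all values in that column. Specifically, what is the difference in BM for Parent 21 between Outer Lagoon and Inner Lagoon in the Transplant column, divided by the mean of all BM values: (BM @ Outer Lagoon - BM @ Inner Lagoon) / mean(BM)? And how can I then apply this to each of the last 7 columns (BM, BWx.d ...)?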

df <- structure(list(Parent = c(21L, 21L, 22L, 22L), Transplant = structure(c(1L, 2L, 1L, 
2L), .Label = c("Inner Lagoon", "Outer Lagoon"), class = "factor"), Origin = structure(c(2L, 
2L, 2L, 2L), .Label = c("Inner Lagoon", "Outer Lagoon"), class = "factor"), Timepoint = c(3, 
3, 3, 3), Species = structure(c(1L, 1L, 1L, 1L), .Label = c("MCAP", "PCOM"), class =. 
"factor"), BM = c(5.865888296, 7.181633357, 6.366555079, 6.413772163), BWx.d = 
c(0.539910592, 0.670790028, 0.60117695, 0.663487904), LE = c(0.009864166, 0.007034995, 
0.010088708, 0.008510985), GPSA = c(0.017825905, 0.037349997, 0.020185893, 0.033437065), RSA 
= c(0.005100527, 0.007212994, 0.005893039, 0.011174223), P_RSA = c(3.616330774, 5.516517387, 
3.590072155, 2.994321812), Survival = c(91.89189189, 100, 100, 97.2972973)), row.names = 
81:84, class = "data.frame")

1 Answer


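There is an extra `.` in the class definition of Species in df (`class =.`), which causes a problem. Once it is removed, this works fine.

This is an interesting problem because the calculation needs values from alternating rows (df$Transplant == "Outer Lagoon" and df$Transplant == "Inner Lagoon") as well as from all rows (mean(BM)), so simply grouping by Transplant won't work. My idea is to use pivot_wider from tidyr to pivot on the Transplant column and create a wide data frame. This creates an additional value column for each unique value of Transplant.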

library(dplyr)
library(tidyr)

meanBM <- mean(df$BM)

df <- df %>%
  pivot_wider(names_from = Transplant,
              values_from = c("BM", "BWx.d", "LE", "GPSA", "RSA", "P_RSA", "Survival")
              )

We also need the mean of BM over all rows, so that has to be computed before pivoting (meanBM above). This gives:

> glimpse(df)
Observations: 2
Variables: 18
$ Parent                  <int> 21, 22
$ Origin                  <fct> Outer Lagoon, Outer Lagoon
$ Timepoint               <dbl> 3, 3
$ Species                 <fct> MCAP, MCAP
$ `BM_Inner Lagoon`       <dbl> 5.865888, 6.366555
$ `BM_Outer Lagoon`       <dbl> 7.181633, 6.413772
$ `BWx.d_Inner Lagoon`    <dbl> 0.5399106, 0.6011770
$ `BWx.d_Outer Lagoon`    <dbl> 0.6707900, 0.6634879
$ `LE_Inner Lagoon`       <dbl> 0.009864166, 0.010088708
$ `LE_Outer Lagoon`       <dbl> 0.007034995, 0.008510985
$ `GPSA_Inner Lagoon`     <dbl> 0.01782590, 0.02018589
$ `GPSA_Outer Lagoon`     <dbl> 0.03735000, 0.03343707
$ `RSA_Inner Lagoon`      <dbl> 0.005100527, 0.005893039
$ `RSA_Outer Lagoon`      <dbl> 0.007212994, 0.011174223
$ `P_RSA_Inner Lagoon`    <dbl> 3.616331, 3.590072
$ `P_RSA_Outer Lagoon`    <dbl> 5.516517, 2.994322
$ `Survival_Inner Lagoon` <dbl> 91.89189, 100.00000
$ `Survival_Outer Lagoon` <dbl> 100.0000, 97.2973

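The subsequent calculation is now easy because it can be done row by row: for each of the columns BM to Survival, the Inner Lagoon and Outer Lagoon values of a parent are now in the same row.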

df <- df %>%
  # abs() gives the magnitude of the difference; drop it to keep the sign of Outer - Inner
  mutate(new_col = abs(`BM_Outer Lagoon` - `BM_Inner Lagoon`) / meanBM)

which gives:

> df$new_col
[1] 0.203771528 0.007312585
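
As a sanity check, the first value (Parent 21) can be reproduced by hand from the four BM values in the question. This is only a sketch of the arithmetic; the names bm and ratio_21 are made up for illustration.

```r
# BM values in row order: Parent 21 Inner, Parent 21 Outer, Parent 22 Inner, Parent 22 Outer
bm <- c(5.865888296, 7.181633357, 6.366555079, 6.413772163)

# (BM @ Outer Lagoon - BM @ Inner Lagoon) / mean(BM) for Parent 21
ratio_21 <- (bm[2] - bm[1]) / mean(bm)
print(ratio_21)
```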

This matches the result you calculated. You can easily extend this to the other columns.
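
One possible way to cover all seven columns at once, sketched here as an alternative to pivoting: compute every column mean up front, then take the signed Outer - Inner difference per parent with across(). This assumes dplyr 1.0 or later (for across() and cur_column()) and that each parent has exactly one row per Transplant level. The names value_cols, col_means and rel_diff are made up for illustration, and the data frame is rebuilt with only the columns the calculation needs.

```r
library(dplyr)

# Example data from the question, reduced to the columns used here
df <- data.frame(
  Parent     = c(21L, 21L, 22L, 22L),
  Transplant = factor(rep(c("Inner Lagoon", "Outer Lagoon"), times = 2)),
  BM         = c(5.865888296, 7.181633357, 6.366555079, 6.413772163),
  BWx.d      = c(0.539910592, 0.670790028, 0.60117695, 0.663487904),
  LE         = c(0.009864166, 0.007034995, 0.010088708, 0.008510985),
  GPSA       = c(0.017825905, 0.037349997, 0.020185893, 0.033437065),
  RSA        = c(0.005100527, 0.007212994, 0.005893039, 0.011174223),
  P_RSA      = c(3.616330774, 5.516517387, 3.590072155, 2.994321812),
  Survival   = c(91.89189189, 100, 100, 97.2972973)
)

value_cols <- c("BM", "BWx.d", "LE", "GPSA", "RSA", "P_RSA", "Survival")

# Means over all rows, taken before any grouping
col_means <- colMeans(df[value_cols])

# Per parent: (Outer - Inner) / column mean, for every value column.
# Assumes one Inner and one Outer row per parent.
rel_diff <- df %>%
  group_by(Parent) %>%
  summarise(
    across(
      all_of(value_cols),
      ~ (.x[Transplant == "Outer Lagoon"] - .x[Transplant == "Inner Lagoon"]) /
        col_means[[cur_column()]]
    ),
    .groups = "drop"
  )

print(rel_diff)
```

The BM column of rel_diff should reproduce the two values shown above. Unlike the abs() version, the sign is kept here, so a column where the Outer Lagoon value is lower (for example Survival for Parent 22) comes out negative; wrap the difference in abs() if only the magnitude is wanted.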

Answered 2020-07-01T00:12:43.377