Hi, I have data like the following:

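There are 38 columns in total. The Treatment column has 10 treatment types and the Date column has dates from 25 to 29. Here is sample data as code (the example has only 2 treatment types, but the real data has 10):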

df <- structure(
    list(
      Christensenellaceae = c(
        0,
        0.009910731,
        0.010131195,
        0.009679938,
        0.01147601,
        0.010484508,
        0.008641566,
        0.010017172,
        0.010741488,
        0.1,
        0.2,
        0.3,
        0.4),
      Date=c(25,25,25,25,25,27,27,27,27,27,27,27,27),
      Treatment = c(
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Treatment 1",
        "Treatment 1",
        "Treatment 1",
        "Treatment 1",
        "Treatment 2",
        "Treatment 2",
        "Treatment 2",
        "Treatment 2")
    ),class = "data.frame",
    row.names = c(NA,-13L)
  )

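What I want to do is create 2 plots for each column: one for the original sample and one for all treatment types (1-10; 1-2 in this example), with a mean line of the observations for each treatment type. The treatment plot should have 10 mean lines in total (2 here). Unfortunately, I can't figure out how to add lines grouped by treatment type; my code below draws a single line across all treatment types. How can I add lines grouped by treatment type?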

df_3 %>% 
  pivot_longer(-treatment) %>% 
  mutate(plot = ifelse(str_detect(treatment, "Original"), 
                       "Original sample", 
                       "Treatment"),
         treatment = str_extract(treatment, "\\d+$")) %>% 
  group_by(name) %>% 
  group_split() %>% 
  map(~.x %>% ggplot(aes(x = factor(treatment), y = value, color = factor(name))) +
        geom_point() +
        stat_summary(aes(y = value,group=1), fun.y=mean, colour="red", geom="line",group=1) +
        facet_wrap(~plot, scales = "free_x") +
        labs(x = "Treatment", y = "Value", color = "Taxa") +
        guides(x =  guide_axis(angle = 90))+
        theme_bw()) 

As you can see, there is only one mean line, but I need one for each treatment type: 10 in total (2 here). Is there any way to edit my code so that it works? Thank you :)

I also tried this code, but it doesn't seem to work:

df %>%
  pivot_longer(-c(Treatment, Date), names_to = "taxon") %>%
  mutate(
    type = Treatment %>% str_detect("Original") %>% ifelse("Original", "Treatment"),
    treatment_nr = Treatment %>% str_extract("(?<=Treatment )[0-9]+")
  ) %>%
  ggplot(aes(Date, value, color = treatment_nr)) +
  geom_point() +
  stat_summary(geom = "point", fun.y = "mean", size = 3, shape = 24) +
  geom_line() +
  facet_grid(type ~ taxon, scales = "free_y")
#> Warning: `fun.y` is deprecated. Use `fun` instead.

1 Answer

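Your data isn't formatted correctly and doesn't match your original example code (e.g. Treatment rather than treatment). Anyway, I'll generate some data here to illustrate a solution based on the data in your image.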

library(tidyverse)
set.seed(1)
df <-
  data.frame(
    Christensenellaceae = runif(105),
    treatment = rep(c("Original Sample_25", 
                      paste0("Treatment", 1:10, "_", 27), 
                      paste0("Treatment", 1:10, "_", 28)), 
                    each = 5)
  )
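
If you are starting from the layout in the question (separate Treatment and Date columns), something like the following could bring it into the single treatment column used here. This is only a sketch: df_q is a hypothetical miniature of the question's data with made-up values, and the "Treatment1_27" naming simply mirrors the generated data above.

```r
library(dplyr)
library(stringr)

# Hypothetical miniature of the question's data (values are made up)
df_q <- data.frame(
  Christensenellaceae = c(0.01, 0.02, 0.1, 0.2),
  Date = c(25, 25, 27, 27),
  Treatment = c("Original Sample", "Original Sample", "Treatment 1", "Treatment  2")
)

df_combined <- df_q %>%
  mutate(
    # Collapse stray repeated spaces such as "Treatment  2"
    Treatment = str_squish(Treatment),
    # "Treatment 1" + 27 -> "Treatment1_27"; "Original Sample" keeps its space
    treatment = paste0(sub("^Treatment ", "Treatment", Treatment), "_", Date)
  ) %>%
  select(-Treatment, -Date)

df_combined$treatment
```

After this step only the taxa columns and treatment remain, so pivot_longer(-treatment) applies as in the code below.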

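Because you generate the mean as a line, it gets connected across the x-axis. I've done a pretty lazy job of it: I use a segment and calculate the mean before plotting. Depending on how the ten treatments look, you can change the length of the mean line by changing avg_line_length.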

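Because the segments have extra x-axis values (e.g. 0.65, 1.35), the x-axis would include those extra values by default. I've created labels and breaks to get around this, using the intermediate data lab_df. I left the label for the original sample blank. You could also use colour/linetype to show the line as "Mean" in the legend.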

avg_line_length <- 0.35

p <-
  df %>% 
    pivot_longer(-treatment) %>% 
    mutate(plot = ifelse(str_detect(treatment, "Original"), 
                         "Original sample", 
                         "Treatment"),
           treatment = as.numeric(str_extract(treatment, "\\d+")),
           treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>% 
    {. ->> lab_df} %>%
    group_by(name, treatment) %>%  # one mean per taxon and treatment, not pooled across columns
    mutate(avg = mean(value),
           xstart = treatment - avg_line_length,
           xend = treatment + avg_line_length) %>%
    ungroup() %>%
    group_by(name) %>%
    group_split() %>% 
    map(~.x %>% ggplot() +
          geom_point(aes(x = treatment, y = value, color = name)) +
          geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg, color = name)) +
          scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
          facet_wrap(~plot, scales = "free_x") +
          labs(x = "Treatment", y = "Value", color = "Taxa") +
          guides(x =  guide_axis(angle = 90))+
          theme_bw()) 

p
#> [[1]]
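
The breaks and labels above come straight from lab_df, which has one row per observation. If a smaller lookup table is preferred, it could be built roughly like this; lab_toy is a hypothetical stand-in for lab_df, and axis_labs and x_scale are names I made up for the sketch.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for lab_df: one row per observation
lab_toy <- data.frame(
  treatment = c(25, 25, 1, 1, 2, 2),
  treatment_label = c("", "", "1", "1", "2", "2")
)

# Keep one row per distinct break position
axis_labs <- lab_toy %>%
  distinct(treatment, treatment_label) %>%
  arrange(treatment)

# The scale can then be added to the plot in place of the one using lab_df
x_scale <- scale_x_continuous(
  breaks = axis_labs$treatment,
  labels = axis_labs$treatment_label
)
```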

If you don't want a mean line for the original sample, just add an extra ifelse.

p2 <-
  df %>% 
    pivot_longer(-treatment) %>% 
    mutate(plot = ifelse(str_detect(treatment, "Original"), 
                         "Original sample", 
                         "Treatment"),
           treatment = as.numeric(str_extract(treatment, "\\d+")),
           treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>% 
    {. ->> lab_df} %>%
    group_by(name, treatment) %>%  # one mean per taxon and treatment, not pooled across columns
    mutate(avg = ifelse(plot %in% "Original sample", NA, mean(value)),
           xstart = treatment - avg_line_length,
           xend = treatment + avg_line_length) %>%
    ungroup() %>%
    group_by(name) %>%
    group_split() %>% 
    map(~.x %>% ggplot() +
          geom_point(aes(x = treatment, y = value, color = factor(name))) +
          geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg), colour="red") +
          scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
          facet_wrap(~plot, scales = "free_x") +
          labs(x = "Treatment", y = "Value", color = "Taxa") +
          guides(x =  guide_axis(angle = 90))+
          theme_bw()) 

p2
#> [[1]]
#> Warning: Removed 5 rows containing missing values (geom_segment).
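
As a possible alternative to precomputing segments, stat_summary() can draw a short horizontal bar per treatment when x is discrete and no group = 1 is set. This is a different technique from the segment approach above, shown on hypothetical toy data (toy and p_alt are made-up names); it assumes a ggplot2 version that has the fun, fun.min and fun.max arguments.

```r
library(ggplot2)

set.seed(1)
# Hypothetical toy data: 3 treatments, 5 observations each
toy <- data.frame(
  treatment = factor(rep(1:3, each = 5)),
  value = runif(15)
)

p_alt <- ggplot(toy, aes(x = treatment, y = value)) +
  geom_point() +
  # With a discrete x, the summary is computed within each treatment.
  # Setting fun, fun.min and fun.max all to mean collapses the crossbar
  # into a single horizontal line at the mean.
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "crossbar", width = 0.7, colour = "red") +
  theme_bw()

p_alt
```

Here width plays the role of avg_line_length, and the axis needs no custom breaks because x stays discrete.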

It's messy, but hopefully it solves your problem.

Answered 2021-12-15T16:08:25.537