
I have a geom_smooth with dates on the x-axis, COVID cases on the y-axis, and two categories. I am trying to plot the maximum peak.

# Reproducible data
library(tidyverse)
df <- tribble(~date, ~cases, ~category,
              "2021/1/1", 100, "A",
              "2021/1/1", 103, "B",
              "2021/1/2", 108, "A",
              "2021/1/2", 109, "B",
              "2021/1/3", 102, "A",
              "2021/1/3", 120, "B",
              "2021/1/4", 150, "A",
              "2021/1/4", 160, "B",
              "2021/1/5", 120, "A",
              "2021/1/5", 110, "B",
              "2021/1/6", 115, "A",
              "2021/1/6", 105, "B",)

# Plotting geom_smooth
df %>%
  ggplot(df, mapping = aes(date, cases, group = category, color = category)) +
  geom_smooth()

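How can I add the maximum peak to the geom_smooth? Ideally, I would like a point and a text label stating the number of cases at the peak.

I tried finding the peak outside the ggplot code, but that returns a different peak, because geom_smooth fits its own function rather than just taking the average for the category.

The answer below works, but I would like to move the labels to make them clearer, and geom_text_repel seems to refer only to the first curve rather than both. Any suggestions?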

library(ggplot2)
library(tidyverse)
library(ggrepel)

# Fake data
ar =hist(rnorm(10000,1), breaks = 180, plot=F)$counts
br =hist(rnorm(11000,1), breaks = 180, plot=F)$counts

df <-  rbind(
  tibble(category="B", date = seq(as.Date("2021-01-01"),by=1, length.out=length(br)),value=br),
  tibble(category="A", date = seq(as.Date("2021-01-01"),by=1, length.out=length(ar)),value=ar)
)
# create the smooth and retain rows with max of smooth, using slice_max
sm_max = df %>% group_by(category) %>%
  mutate(smooth =predict(loess(value~as.numeric(date), span=.5))) %>% 
  slice_max(order_by = smooth)

# Plot, using the same smooth as above (default is loess, span set as above)
df %>%
  ggplot(df, mapping = aes(date, value, group = category, color = category)) +
  geom_point() +
  geom_smooth(span=.5, se=F) + 
  geom_point(data=sm_max, aes(y=smooth),color="black", size=5) + 
  geom_text_repel(data = sm_max, aes(label=paste0("Peak: ",round(smooth,1))), color="black")

geom_text_repel(data = sm_max_p3, aes(x = date,
                                      y = smooth,
                                      label = paste0(candidate, " Peak: ",round(smooth,1))))



2 Answers


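You need to generate the smooth first and identify its maximum. Then you can either

  1. plot the data, the smooth, and the maximum together, or
  2. plot the data and the maximum, then add a geom_smooth() call, making sure geom_smooth uses the same smoother that you used to generate and identify the maximum.

Here is an example that uses the latter of these two options: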

# Fake data
ar =hist(rnorm(10000,1), breaks = 180, plot=F)$counts
br =hist(rnorm(25000,1), breaks = 180, plot=F)$counts

df = rbind(
  tibble(category="B", date = seq(as.Date("2021-01-01"),by=1, length.out=length(br)),value=br),
  tibble(category="A", date = seq(as.Date("2021-01-01"),by=1, length.out=length(ar)),value=ar)
)
# create the smooth and retain rows with max of smooth, using slice_max
sm_max = df %>% group_by(category) %>%
  mutate(smooth =predict(loess(value~as.numeric(date), span=.5))) %>% 
  slice_max(order_by = smooth)
  
# Plot, using the same smooth as above (default is loess, span set as above)
# The data arrives through the pipe, so it is not passed to ggplot() a second time
df %>%
  ggplot(mapping = aes(date, value, group = category, color = category)) +
  geom_point() +
  geom_smooth(span=.5, se=F) + 
  geom_point(data=sm_max, aes(y=smooth),color="black", size=5) + 
  geom_text(data = sm_max, aes(y=smooth, label=paste0("Peak: ",round(smooth,1))), color="black")

[Image: peak_smooth]
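For completeness, option 1 from the list above could look roughly like the sketch below. It is only an illustration under a few assumptions: the data are a made-up deterministic stand-in for the random data in the example, and the names df_smooth and p are mine. The loess fit is computed once, stored as a column, and drawn with geom_line(), so the curve and the marked peak come from the same fitted values by construction. Note that the label layer maps y = smooth as well; without it the layer inherits y = value from the main aes() and the label lands on the raw observation instead of on the curve.

```r
library(dplyr)
library(ggplot2)

# Hypothetical deterministic data (an assumption, not the data from the question)
days <- 0:99
df <- bind_rows(
  data.frame(category = "A", date = as.Date("2021-01-01") + days,
             value = 200 * exp(-(days - 40)^2 / 500) + 10 * sin(days)),
  data.frame(category = "B", date = as.Date("2021-01-01") + days,
             value = 300 * exp(-(days - 60)^2 / 500) + 10 * cos(days))
)

# Fit the loess once per category and keep the fitted values as a column
df_smooth <- df %>%
  group_by(category) %>%
  mutate(smooth = predict(loess(value ~ as.numeric(date), span = 0.5))) %>%
  ungroup()

# One row per category: the date where the fitted curve is highest
sm_max <- df_smooth %>%
  group_by(category) %>%
  slice_max(order_by = smooth, n = 1, with_ties = FALSE) %>%
  ungroup()

# Draw the stored fit with geom_line() instead of calling geom_smooth() again
p <- ggplot(df_smooth, aes(date, value, color = category)) +
  geom_point(alpha = 0.4) +
  geom_line(aes(y = smooth)) +
  geom_point(data = sm_max, aes(y = smooth), color = "black", size = 5) +
  geom_text(data = sm_max,
            aes(y = smooth, label = paste0("Peak: ", round(smooth, 1))),
            color = "black", vjust = -1.5)

print(p)
```

The trade-off is that you lose the confidence ribbon geom_smooth() can draw, but there is no way for the plotted curve and the labelled maximum to drift apart if someone later changes the span in only one place.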

Answered 2022-02-17T18:08:12.873

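If you just want to label the maximum measured value, you can use {gghighlight} to show and label only that point at the top of the smoothed curve. Your date is also a character, so it is a discrete variable. As a result, your geom_smooth() is just a point-to-point line. Here I convert it to a continuous variable with mutate(date = lubridate::ymd(date)).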

library(tidyverse)
library(lubridate)
library(gghighlight)

df <- tribble(~date, ~cases, ~category,
              "2021/1/1", 100, "A",
              "2021/1/1", 103, "B",
              "2021/1/2", 108, "A",
              "2021/1/2", 109, "B",
              "2021/1/3", 102, "A",
              "2021/1/3", 120, "B",
              "2021/1/4", 150, "A",
              "2021/1/4", 160, "B",
              "2021/1/5", 120, "A",
              "2021/1/5", 110, "B",
              "2021/1/6", 115, "A",
              "2021/1/6", 105, "B",)

# Plotting geom_smooth
df %>%
  mutate(date = ymd(date)) %>%
  group_by(category) %>%
  mutate(is_max = cases == max(cases)) %>% 
  ggplot(mapping = aes(date, cases, color = category)) +
  geom_smooth() +
  geom_point(size = 3) +
  gghighlight(is_max,
              n = 1,
              unhighlighted_params = list(alpha = 0),
              label_key = cases)

Created on 2022-02-17 by the reprex package (v2.0.1)
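The highlighted point above is the largest measured value, which is not necessarily the top of the smoothed curve. If you want the peak of the curve that geom_smooth() itself draws, one possible approach is to read the computed values back out of the plot with ggplot2's layer_data(). The following is a rough sketch, not a definitive implementation: the data are made up and deterministic, the names p and peaks are mine, and I am assuming that the group ids follow the sorted category values, which is how ggplot2 assigns groups for a single discrete mapping.

```r
library(dplyr)
library(ggplot2)

# Hypothetical deterministic data (an assumption, not the data from the question)
days <- 0:59
df <- bind_rows(
  data.frame(category = "A", date = as.Date("2021-01-01") + days,
             cases = 100 * exp(-(days - 25)^2 / 200) + 5 * sin(days)),
  data.frame(category = "B", date = as.Date("2021-01-01") + days,
             cases = 150 * exp(-(days - 35)^2 / 200) + 5 * cos(days))
)

p <- ggplot(df, aes(date, cases, color = category)) +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.5, se = FALSE)

# layer_data() returns the data ggplot2 computed for layer 1 (the smooth):
# x is the date as a number of days since 1970-01-01, y is the fitted value
peaks <- layer_data(p, 1) %>%
  group_by(group) %>%
  slice_max(y, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  transmute(category = levels(factor(df$category))[group],
            date = as.Date(x, origin = "1970-01-01"),
            smooth = y)

# Mark and label the peak of each drawn curve
p +
  geom_point(data = peaks, aes(date, smooth), color = "black", size = 3) +
  geom_text(data = peaks,
            aes(date, smooth, label = paste0("Peak: ", round(smooth, 1))),
            color = "black", vjust = -1)
```

Because the peak is taken from the grid of points geom_smooth() evaluates, it is only as precise as that grid; raising the n argument of geom_smooth() gives a finer grid if that matters.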

Answered 2022-02-17T18:27:47.483