
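I created a line chart (plot) in R with a label on every data point. Because there are so many data points, the plot ends up crowded with labels. I would like to label only the last N (say 4) data points. I tried subset and tail inside the geom_label_repel function, but either could not get them to work or received an error message. My data set consists of 99 values spread over 3 groups (KPIs).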

I have the following code in R:

library(ggplot2)
library(ggrepel)

data.trend <- read.csv(file=....)

plot.line <- ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +

  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +


  # Labels defined here
  geom_label_repel(
    aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    box.padding = unit(0.35, "lines"),
    point.padding = unit(0.4, "lines"),
    segment.color = 'grey50',
    show.legend = FALSE
  )


To be fair, I am quite new to R. Maybe I am missing something basic.

Thanks in advance.


2 Answers

9

The simplest approach is to set the data = argument of geom_label_repel so that it contains only the points you want labeled.

Here is a reproducible example:

library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25), 
                         group = sample(1:2,25,T), 
                         KPI = sample(1:2,25,T))

ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    data = tail(data.trend, 4),                 
    box.padding = unit(0.35, "lines"),
    point.padding = unit(0.4, "lines"),
    segment.color = 'grey50',
    show.legend = FALSE)


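Unfortunately, this interferes slightly with the repel algorithm: the labels no longer repel from the unlabeled points, so their placement is not optimal (in the resulting plot, some points are covered by labels).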
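Note that tail(data.trend, 4) picks the last four rows of the data frame, which are the most recent versions only if the rows are already sorted by Version. A minimal sketch, assuming "last" means the highest Version values (the names n_labels, by_version and last_points are made up for illustration):

```r
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = sample(1:2, 25, TRUE))

# Assumption: "last" means the highest Version values, not the last rows.
n_labels <- 4  # hypothetical name for N

# Sort by Version first, then keep the final n_labels rows.
by_version <- data.trend[order(data.trend$Version), ]
last_points <- tail(by_version, n_labels)

p <- ggplot(data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(size = 1) +
  geom_point(size = 2.5) +
  # Only the rows in last_points receive a label.
  geom_label_repel(
    data = last_points,
    aes(fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    show.legend = FALSE
  )
print(p)
```

This still has the drawback described above: labels only repel from the points passed to geom_label_repel.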

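So a better approach is to use color and fill to simply make the unwanted labels invisible (by setting both color and fill to NA for the labels you want to hide):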

ggplot(data=data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  geom_label_repel(aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
                   box.padding = unit(0.35, "lines"),
                   point.padding = unit(0.4, "lines"),
                   show.legend = FALSE,
                   color = c(rep(NA,21), rep('grey50',4)),
                   fill = c(rep(NA,21), rep('lightblue',4)))
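The counts 21 and 4 above are tied to the 25 rows of the example data. A sketch that derives the same vectors from the number of rows instead, so it should carry over to a data set of another size (n_labels, is_last, label_color and label_fill are illustrative names, not part of ggrepel):

```r
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = sample(1:2, 25, TRUE))

n_labels <- 4  # hypothetical name for N

# TRUE for the last n_labels rows, FALSE for every other row.
is_last <- seq_len(nrow(data.trend)) > nrow(data.trend) - n_labels

# NA hides a label; the visible ones keep the colors used in the answer.
label_color <- ifelse(is_last, "grey50", NA_character_)
label_fill  <- ifelse(is_last, "lightblue", NA_character_)

p <- ggplot(data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(size = 1) +
  geom_point(size = 2.5) +
  # Every row still takes part in the repel layout; hidden ones are invisible.
  geom_label_repel(aes(label = sprintf('%0.1f%%', Value)),
                   box.padding = unit(0.35, "lines"),
                   point.padding = unit(0.4, "lines"),
                   show.legend = FALSE,
                   color = label_color,
                   fill = label_fill)
print(p)
```

As with tail(), "last" here means the last rows of the data frame, so this assumes the rows are already in the order you care about.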


Answered 2017-01-07T15:04:34.710
1

If you only want to show the last label in each group, group_by and filter may do the job:

data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version))

Full example:

suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25), 
                         group = sample(1:2,25,T), 
                         KPI = sample(1:2,25,T))

ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +

  # Labels defined here
  geom_label_repel(
    data = data.trend %>% group_by(KPI) %>% filter(Version == max(Version)), 
    aes(Version, Value, fill = factor(KPI), label = sprintf('%0.1f%%', Value)),
    color = "black",
    fill = "white")
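The filter(Version == max(Version)) step keeps one row per KPI. The question asks for the last N points, so here is a sketch of one way to extend it, assuming dplyr 1.0.0 or later (where slice_max() is available) and assuming "last" means the highest Version within each KPI. The names n_labels and last_per_kpi are illustrative only:

```r
suppressPackageStartupMessages(library(dplyr))

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = sample(1:2, 25, TRUE))

n_labels <- 4  # hypothetical name for N

# Keep the n_labels rows with the highest Version inside each KPI.
# with_ties = FALSE caps the result at n_labels rows per group.
last_per_kpi <- data.trend %>%
  group_by(KPI) %>%
  slice_max(Version, n = n_labels, with_ties = FALSE) %>%
  ungroup()

print(last_per_kpi)
```

Passing data = last_per_kpi to geom_label_repel in the plot above would then label up to four points per KPI instead of one.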

Or, if you want to show 4 random labels for each KPI, use data.trend %>% group_by(KPI) %>% sample_n(4):

suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(ggrepel)

set.seed(1235)
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25), 
                         group = sample(1:2,25,T), 
                         KPI = as.factor(sample(1:2,25,T)))

ggplot(data = data.trend, aes(x = Version, y = Value, group = KPI, color = KPI)) +
  geom_line(aes(group = KPI), size = 1) +
  geom_point(size = 2.5) +
  
  # Labels defined here
  geom_label_repel(
    data = data.trend %>% group_by(KPI) %>% sample_n(4), 
    aes(Version, Value, fill = KPI, label = sprintf('%0.1f%%', Value)),
    color = "black", show.legend = FALSE
    )
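In current dplyr, sample_n() is documented as superseded by slice_sample(). A sketch of the same random selection with the newer function, assuming dplyr 1.0.0 or later (n_labels and random_per_kpi are illustrative names):

```r
suppressPackageStartupMessages(library(dplyr))

set.seed(1235)  # fixes both the simulated data and the random selection
data.trend <- data.frame(Version = rnorm(25), Value = rnorm(25),
                         group = sample(1:2, 25, TRUE),
                         KPI = as.factor(sample(1:2, 25, TRUE)))

n_labels <- 4  # hypothetical name for N

# Draw up to n_labels random rows from each KPI, without replacement.
random_per_kpi <- data.trend %>%
  group_by(KPI) %>%
  slice_sample(n = n_labels) %>%
  ungroup()

print(random_per_kpi)
```

The result can be passed as data = random_per_kpi to geom_label_repel. If a KPI has fewer than n_labels rows, slice_sample() should return all of that group's rows.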

Created on 2021-08-27 by the reprex package (v2.0.1)

Answered 2021-08-27T15:33:44.830