r - ggplot2 - Where are the scales built?

Question

I would like to see where factor values are turned into numeric values. I tried to do this by simply adding print statements everywhere...

geom_tile2 <- function(mapping = NULL, data = NULL,
                      stat = "identity2", position = "identity",
                      ...,
                      na.rm = FALSE,
                      show.legend = NA,
                      inherit.aes = TRUE) {
  layer(
    data = data,
    mapping = mapping,
    stat = stat,
    geom = GeomTile2,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      ...
    )
  )
}

GeomTile2 <- ggproto("GeomTile2", GeomRect,
  extra_params = c("na.rm", "width", "height"),

  setup_data = function(data, params) {
    print(data)

    data$width <- data$width %||% params$width %||% resolution(data$x, FALSE)
    data$height <- data$height %||% params$height %||% resolution(data$y, FALSE)

    transform(data,
              xmin = x - width / 2,  xmax = x + width / 2,  width = NULL,
              ymin = y - height / 2, ymax = y + height / 2, height = NULL
    )
  },

  default_aes = aes(fill = "grey20", colour = NA, size = 0.1, linetype = 1,
                    alpha = NA),

  required_aes = c("x", "y"),

  draw_key = draw_key_polygon
)

and

stat_identity2 <- function(mapping = NULL, data = NULL,
                          geom = "point", position = "identity",
                          ...,
                          show.legend = NA,
                          inherit.aes = TRUE) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatIdentity2,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = FALSE,
      ...
    )
  )
}

StatIdentity2 <- ggproto("StatIdentity2", Stat,

  setup_data = function(data, params) {
    print(data)
    data
  },
  compute_layer = function(data, scales, params) {
    print(data)
    print("stat end")
    data
  }
)

But when I run

ggplot(data.frame(x = rep(c("y", "n"), 6), y = rep(c("y", "n"), each = 6)), 
       aes(x = x, y = y)) + 
  geom_tile2()

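x and y are already numeric by the time they reach the stat's setup_data function. Looking through the package's GitHub repository, I can't seem to find where this conversion to coordinates actually happens.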

score 4 · Accepted Answer

TL;DR

The conversion of x / y from factor to numeric scale is done by the ggplot2:::Layout$map_position() function; the current code is here: layout.r
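As a quick check of the claim above, the sketch below compares the input column with the same column after the plot has been built. It uses only the exported layer_data() helper; the variable names df, p and built_layer are illustrative, and the data is the same as in the question.

```r
library(ggplot2)

# Same data as in the question: two character columns.
df <- data.frame(x = rep(c("y", "n"), 6),
                 y = rep(c("y", "n"), each = 6))

p <- ggplot(df, aes(x = x, y = y)) + geom_tile()

# layer_data() builds the plot and returns the data frame for one layer.
built_layer <- layer_data(p, 1)

is.numeric(df$x)           # FALSE: the input column is character
is.numeric(built_layer$x)  # TRUE: positions are numeric after the build
```

This only shows that the conversion has happened by the end of the build; locating the exact step is what the debugging walk-through is for.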

Long explanation

I generally think of the steps for creating a plot with the ggplot2 package as falling into two stages:

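Plot construction. This is when a new ggplot object (initialised via ggplot()) and all the geom_* / stat_* / facet_* / scale_* / coord_* layers added to it are combined into a single ggplot object. If we write something like p <- ggplot(mpg, aes(class)) + geom_bar(), we stop here. GH code: plot-construction.r
Plot rendering. This is when the combined ggplot object is converted into an object that can be rendered (via ggplot_build()) and then further converted into a gtable of grobs (via ggplot_gtable()). This is usually triggered by the print / plot methods for ggplot objects (see here), but we can also use ggplotGrob(), which returns the converted gtable object directly, minus the printing step. GH code for ggplot_build / ggplot_gtable: plot-build.r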
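The two stages can be run by hand, which makes the boundary between them easier to see. This is a minimal sketch using only exported ggplot2 functions and the mpg dataset that ships with the package; the variable names built and gt are illustrative.

```r
library(ggplot2)

# Stage 1: plot construction. Layers are only combined; nothing is computed.
p <- ggplot(mpg, aes(class)) + geom_bar()

# Stage 2a: build the plot. Scales are trained and layer data is computed here.
built <- ggplot_build(p)

# Stage 2b: convert the built plot into a gtable of grobs.
gt <- ggplot_gtable(built)

# Drawing the gtable is the final step that printing p would perform.
grid::grid.newpage()
grid::grid.draw(gt)
```

Calling ggplotGrob(p) should give an equivalent gtable in one step, without the drawing.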

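In my experience, most of the steps we might be interested in tweaking are in the plot rendering stage, and running debug on ggplot2:::ggplot_build.ggplot / ggplot2:::ggplot_gtable.ggplot_built is a good first step for finding out where things happen.

In this case, after running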

debugonce(ggplot2:::ggplot_build.ggplot)

ggplot(data.frame(x = rep(c("y", "n"), 6), 
                  y = rep(c("y", "n"), each = 6)), 
       aes(x = x, y = y)) + 
  geom_tile() # no need to use the self-defined geom_tile2 here

we start stepping through the function:

> ggplot2:::ggplot_build.ggplot
function (plot) 
{
    plot <- plot_clone(plot)
    if (length(plot$layers) == 0) {
        plot <- plot + geom_blank()
    }
    layers <- plot$layers
    layer_data <- lapply(layers, function(y) y$layer_data(plot$data))
    scales <- plot$scales
    by_layer <- function(f) {
        out <- vector("list", length(data))
        for (i in seq_along(data)) {
            out[[i]] <- f(l = layers[[i]], d = data[[i]])
        }
        out
    }
    data <- layer_data
    data <- by_layer(function(l, d) l$setup_layer(d, plot))
    layout <- create_layout(plot$facet, plot$coordinates)
    data <- layout$setup(data, plot$data, plot$plot_env)
    data <- by_layer(function(l, d) l$compute_aesthetics(d, plot))
    data <- lapply(data, scales_transform_df, scales = scales)
    scale_x <- function() scales$get_scales("x")
    scale_y <- function() scales$get_scales("y")
    layout$train_position(data, scale_x(), scale_y())
    data <- layout$map_position(data)
    data <- by_layer(function(l, d) l$compute_statistic(d, layout))
    data <- by_layer(function(l, d) l$map_statistic(d, plot))
    scales_add_missing(plot, c("x", "y"), plot$plot_env)
    data <- by_layer(function(l, d) l$compute_geom_1(d))
    data <- by_layer(function(l, d) l$compute_position(d, layout))
    layout$reset_scales()
    layout$train_position(data, scale_x(), scale_y())
    layout$setup_panel_params()
    data <- layout$map_position(data)
    npscales <- scales$non_position_scales()
    if (npscales$n() > 0) {
        lapply(data, scales_train_df, scales = npscales)
        data <- lapply(data, scales_map_df, scales = npscales)
    }
    data <- by_layer(function(l, d) l$compute_geom_2(d))
    data <- by_layer(function(l, d) l$finish_statistics(d))
    data <- layout$finish_data(data)
    structure(list(data = data, layout = layout, plot = plot), 
        class = "ggplot_built")
}

In debug mode, we can run str(data[[i]]) after each step to inspect the data associated with layer i of the ggplot object (i = 1 in this case, since there is only one geom layer).

Browse[2]> 
debug: data <- lapply(data, scales_transform_df, scales = scales)
Browse[2]> 
debug: scale_x <- function() scales$get_scales("x")
Browse[2]> str(data[[1]]) # still factor after scale_transform_df step
'data.frame':   12 obs. of  4 variables:
 $ x    : Factor w/ 2 levels "n","y": 2 1 2 1 2 1 2 1 2 1 ...
 $ y    : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 1 1 1 1 ...
 $ PANEL: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ group: int  4 2 4 2 4 2 3 1 3 1 ...
  ..- attr(*, "n")= int 4

# ... omitted

debug: data <- layout$map_position(data)
Browse[2]> 
debug: data <- by_layer(function(l, d) l$compute_statistic(d, layout))
Browse[2]> str(data[[1]]) # numerical after map_position step
'data.frame':   12 obs. of  4 variables:
 $ x    : int  2 1 2 1 2 1 2 1 2 1 ...
 $ y    : int  2 2 2 2 2 2 1 1 1 1 ...
 $ PANEL: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ group: int  4 2 4 2 4 2 3 1 3 1 ...
  ..- attr(*, "n")= int 4
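To make the before / after output above concrete, here is a conceptual illustration of how a discrete position value ends up as an integer. It is plain base R and not ggplot2's internal code; it assumes the scale's limits are the sorted unique values, which matches the factor levels "n", "y" shown in the str() output.

```r
# Conceptual illustration only; this is not ggplot2's implementation.
x <- rep(c("y", "n"), 6)

# Assumed limits of the discrete scale: the sorted unique values.
lims <- sort(unique(x))   # "n" "y"

# Each value is replaced by the integer index of its level in the limits.
pos <- match(x, lims)
pos  # 2 1 2 1 2 1 2 1 2 1 2 1
```

The result agrees with the x column printed after the map_position step.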

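The Stat*'s setup_data is triggered by data <- by_layer(function(l, d) l$compute_statistic(d, layout)) (see ggplot2:::Layer$compute_statistic here), which happens after this step. This is why, when you insert a print statement into StatIdentity2$setup_data, the data is already in numeric form.

(And the Geom*'s setup_data is triggered by data <- by_layer(function(l, d) l$compute_geom_1(d)), which happens even later.)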
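The ordering described above can be confirmed without the debugger. The sketch below is a stripped-down variant of the question's StatIdentity2 that records what the stat sees instead of printing it. StatPeek and the peek environment are hypothetical names introduced for this illustration, and the compute_layer signature is assumed to follow the one used by ggplot2's own identity stat.

```r
library(ggplot2)

# Hypothetical environment used to record what the stat receives.
peek <- new.env()

# StatPeek is an illustrative identity-like stat, not part of ggplot2.
StatPeek <- ggproto("StatPeek", Stat,
  setup_data = function(data, params) {
    # Runs inside Layer$compute_statistic, i.e. after layout$map_position().
    peek$x_class <- class(data$x)
    peek$x_is_numeric <- is.numeric(data$x)
    data
  },
  # Pass the data through unchanged.
  compute_layer = function(self, data, params, layout) data
)

df <- data.frame(x = rep(c("y", "n"), 6),
                 y = rep(c("y", "n"), each = 6))

p <- ggplot(df, aes(x = x, y = y)) + geom_tile(stat = StatPeek)

# Building the plot triggers setup_data; no drawing is needed.
invisible(ggplot_build(p))

peek$x_is_numeric  # TRUE: x was already mapped before the stat ran
peek$x_class       # the exact class depends on the ggplot2 version
```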

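Having identified map_position as the step where things happen, we can run debug mode again and step into that function to see exactly what takes place. At this point, I'm afraid I don't really know what your use case is, so I'll have to leave you to it.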
