我可以从 ggpplot2 ggproto 中的“map_data”访问“setup_data”的结果吗?
(适用于“compute_layout”但不适用于“map_data”)
嗨伙计。我正在开发一个 ggplot2 扩展,它将实现一种新的分面方法。
我不想深入了解算法的本质,但我只想说我需要首先为 input 的每一行计算一些新列,data
然后才能执行compute_layout
and map_data
。
当然,一种选择是计算我的新列两次,一次在内部,一次compute_layout
在内部map_data
,但这将是计算成本的两倍,而且不那么优雅。
似乎setup_params
并且setup_data
适用于这个确切的用例。
什么不起作用❌</h1>
我正在根据这个伟大的小插图创建一个可重复的小例子。
我刚刚做了一个小的修改,尝试使用该函数hello
向数据中添加一列。setup_data
library(ggplot2)
facet_bootstrap <- function(n = 9, prop = 0.2, nrow = NULL, ncol = NULL,
scales = "fixed", shrink = TRUE, strip.position = "top") {
facet <- facet_wrap(~.bootstrap, nrow = nrow, ncol = ncol, scales = scales,
shrink = shrink, strip.position = strip.position)
facet$params$n <- n
facet$params$prop <- prop
ggproto(NULL, FacetBootstrap,
shrink = shrink,
params = facet$params
)
}
FacetBootstrap <- ggproto("FacetBootstrap", FacetWrap,
setup_data = function(data, params){
data[[1]]$hello <- 'world'
print("In SETUP_DATA:")
print(" names(data):")
print(names(data[[1]]))
print("")
data
},
compute_layout = function(data, params) {
id <- seq_len(params$n)
print("In COMPUTE_LAYOUT:")
print(" names(data):")
print(names(data[[1]]))
print("")
dims <- wrap_dims(params$n, params$nrow, params$ncol)
layout <- data.frame(PANEL = factor(id))
if (params$as.table) {
layout$ROW <- 1+as.integer((id - 1L) %/% dims[2] + 1L)
} else {
layout$ROW <- 1+as.integer(dims[1] - (id - 1L) %/% dims[2])
}
layout$COL <- 2+as.integer((id - 1L) %% dims[2] + 1L)
layout <- layout[order(layout$PANEL), , drop = FALSE]
rownames(layout) <- NULL
# Add scale identification
layout$SCALE_X <- if (params$free$x) id else 1L
layout$SCALE_Y <- if (params$free$y) id else 1L
cbind(layout, .bootstrap = id)
},
map_data = function(data, layout, params) {
print("In MAP_DATA:")
print(" names(data):")
print(names(data))
print("")
if (is.null(data) || nrow(data) == 0) {
return(cbind(data, PANEL = integer(0)))
}
n_samples <- round(nrow(data) * params$prop)
new_data <- lapply(seq_len(params$n), function(i) {
cbind(data[sample(nrow(data), n_samples), , drop = FALSE], PANEL = i)
})
do.call(rbind, new_data)
}
)
ggplot(diamonds, aes(carat, price)) +
geom_point(alpha = 0.1) +
facet_bootstrap(n = 9, prop = 0.05)
输出:
[1] "In SETUP_DATA:"
[1] " names(data):"
[1] "carat" "cut" "color" "clarity" "depth" "table"
[7] "price" "x" "y" "z" "hello"
[1] ""
[1] "In COMPUTE_LAYOUT:"
[1] " names(data):"
[1] "carat" "cut" "color" "clarity" "depth" "table"
[7] "price" "x" "y" "z" "hello"
[1] ""
[1] "In MAP_DATA:"
[1] " names(data):"
[1] "carat" "cut" "color" "clarity" "depth" "table"
[7] "price" "x" "y" "z"
[1] ""
注意我的hello
专栏是如何可用的,compute_layout
但不是map_data
什么工作✅</h1>
作为一种解决方法,我可以创建一些列并将它们作为parameters
using传递setup_params
。这有点粗俗,因为它们在概念上不是“参数”,它们是数据。但如果一切都失败了——我会采取这种方法
library(ggplot2)
facet_bootstrap <- function(n = 9, prop = 0.2, nrow = NULL, ncol = NULL,
scales = "fixed", shrink = TRUE, strip.position = "top") {
facet <- facet_wrap(~.bootstrap, nrow = nrow, ncol = ncol, scales = scales,
shrink = shrink, strip.position = strip.position)
facet$params$n <- n
facet$params$prop <- prop
ggproto(NULL, FacetBootstrap,
shrink = shrink,
params = facet$params
)
}
FacetBootstrap <- ggproto("FacetBootstrap", FacetWrap,
setup_params = function(data, params){
params$hello <- 'world'
print("In SETUP_DATA:")
print(" params$hello:")
print(params$hello)
print("")
params
},
compute_layout = function(data, params) {
id <- seq_len(params$n)
print("In COMPUTE_LAYOUT:")
print(" params$hello:")
print(params$hello)
print("")
dims <- wrap_dims(params$n, params$nrow, params$ncol)
layout <- data.frame(PANEL = factor(id))
if (params$as.table) {
layout$ROW <- 1+as.integer((id - 1L) %/% dims[2] + 1L)
} else {
layout$ROW <- 1+as.integer(dims[1] - (id - 1L) %/% dims[2])
}
layout$COL <- 2+as.integer((id - 1L) %% dims[2] + 1L)
layout <- layout[order(layout$PANEL), , drop = FALSE]
rownames(layout) <- NULL
# Add scale identification
layout$SCALE_X <- if (params$free$x) id else 1L
layout$SCALE_Y <- if (params$free$y) id else 1L
cbind(layout, .bootstrap = id)
},
map_data = function(data, layout, params) {
print("In MAP_DATA:")
print(" params$hello:")
print(params$hello)
print("")
if (is.null(data) || nrow(data) == 0) {
return(cbind(data, PANEL = integer(0)))
}
n_samples <- round(nrow(data) * params$prop)
new_data <- lapply(seq_len(params$n), function(i) {
cbind(data[sample(nrow(data), n_samples), , drop = FALSE], PANEL = i)
})
do.call(rbind, new_data)
}
)
ggplot(diamonds, aes(carat, price)) +
geom_point(alpha = 0.1) +
facet_bootstrap(n = 9, prop = 0.05)
具有以下输出
[1] "In SETUP_DATA:"
[1] " params$hello:"
[1] "world"
[1] ""
[1] "In COMPUTE_LAYOUT:"
[1] " params$hello:"
[1] "world"
[1] ""
[1] "In MAP_DATA:"
[1] " params$hello:"
[1] "world"
[1] ""
结果总结
- 当我从“setup_data”修改数据时,我可以在“compute_layout”中访问它✅</li>
- 当我从“setup_data”修改数据时,我无法在“map_data”中访问它❌</li>
- 当我从“setup_params”修改数据时,我可以在“compute_layout”中访问它✅</li>
- 当我从“setup_params”修改数据时,我可以在“map_data”中访问它✅</li>
最后的问题
- 上述结果是预期的,还是我只是想错了事情?
- 为
data
ONCE 计算新列,然后将它们同时用于map_data
AND的理想方法compute_layout
是Facet
ggproto
什么?
提前致谢!