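I am running into a problem when trying to use %dopar% and foreach() together with R6 classes. Looking around, I could only find two resources related to this: an unanswered SO question and an open GitHub issue on the R6 repository.
In one comment (in the GitHub issue), a workaround is suggested: reassigning the parent_env of the class as SomeClass$parent_env <- environment(). I would like to understand what exactly environment() refers to when this expression (i.e., SomeClass$parent_env <- environment()) is called inside the %dopar% of foreach.
Here is a minimal reproducible example: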
Work <- R6::R6Class("Work",
    public = list(
        values = NULL,
        initialize = function() {
            self$values <- "some values"
        }
    )
)
Now, the following Task class uses the Work class in its constructor.
Task <- R6::R6Class("Task",
    private = list(
        ..work = NULL
    ),
    public = list(
        initialize = function(time) {
            private$..work <- Work$new()
            Sys.sleep(time)
        }
    ),
    active = list(
        work = function() {
            return(private$..work)
        }
    )
)
In the Factory class, the Task instances are created, and the foreach loop is implemented in ..m.thread().
Factory <- R6::R6Class("Factory",
    private = list(
        ..warehouse = list(),
        ..amount = NULL,
        ..parallel = NULL,
        ..m.thread = function(object, ...) {
            cluster <- parallel::makeCluster(parallel::detectCores() - 1)
            doParallel::registerDoParallel(cluster)
            private$..warehouse <- foreach::foreach(1:private$..amount, .export = c("Work")) %dopar% {
                # What exactly does `environment()` encapsulate in this context?
                object$parent_env <- environment()
                object$new(...)
            }
            parallel::stopCluster(cluster)
        },
        ..s.thread = function(object, ...) {
            for (i in 1:private$..amount) {
                private$..warehouse[[i]] <- object$new(...)
            }
        },
        ..run = function(object, ...) {
            if(private$..parallel) {
                private$..m.thread(object, ...)
            } else {
                private$..s.thread(object, ...)
            }
        }
    ),
    public = list(
        initialize = function(object, ..., amount = 10, parallel = FALSE) {
            private$..amount = amount
            private$..parallel = parallel
            private$..run(object, ...)
        }
    ),
    active = list(
        warehouse = function() {
            return(private$..warehouse)
        }
    )
)
Then, it is called as:
library(foreach)
x = Factory$new(Task, time = 2, amount = 10, parallel = TRUE)
Without the line object$parent_env <- environment(), it throws an error (i.e., as mentioned in the other two links): Error in { : task 1 failed - "object 'Work' not found".
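To isolate what I think is going on, here is a minimal sketch without foreach. It assumes that R6 methods look up free variables such as Work through the generator's parent_env field, which is my reading of the workaround rather than something I have confirmed. make_classes() is a hypothetical helper that is not part of the code above; it only exists so that Work is not reachable from the global environment, roughly mimicking a fresh worker. The Task class here is a simplified stand-in for the one above.

```r
library(R6)

# Hypothetical helper: both classes are defined inside a function, so `Work`
# lives in the function's environment and not in the global environment.
make_classes <- function() {
    Work <- R6Class("Work",
        public = list(
            values = NULL,
            initialize = function() {
                self$values <- "some values"
            }
        )
    )
    Task <- R6Class("Task",
        public = list(
            work = NULL,
            initialize = function() {
                self$work <- Work$new()
            }
        )
    )
    list(Work = Work, Task = Task)
}

classes <- make_classes()
Task <- classes$Task

# By default, parent_env is the environment in which R6Class() was called,
# so `Work` is reachable and construction succeeds.
ok_before <- inherits(Task$new(), "Task")

# Point parent_env at an environment from which `Work` cannot be reached
# (assumes nothing named `Work` is defined in the global environment).
Task$parent_env <- globalenv()
msg <- tryCatch(
    {
        Task$new()
        NA_character_
    },
    error = function(e) conditionMessage(e)
)

# Point parent_env at an environment that does contain `Work`.
env_with_work <- new.env(parent = globalenv())
env_with_work$Work <- classes$Work
Task$parent_env <- env_with_work
ok_after <- inherits(Task$new(), "Task")
```

If that assumption holds, msg contains the same kind of "not found" message for Work as the foreach error, and construction works again once parent_env points at an environment that holds Work.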
I would like to know (1) what are the potential pitfalls of assigning parent_env inside foreach, and (2) why does it work in the first place?
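Update 1:
- I returned environment() from inside foreach(), so that private$..warehouse captures those environments.
- Using rlang::env_print() in a debug session (i.e., with a browser() statement placed right after foreach finished executing), here is what they consist of: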
Browse[1]> env_print(private$..warehouse[[1]])
# <environment: 000000001A8332F0>
# parent: <environment: global>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * ...: <...>
Browse[1]> env_print(environment())
# <environment: 000000001AC0F890>
# parent: <environment: 000000001AC20AF0>
# bindings:
# * private: <env>
# * cluster: <S3: SOCKcluster>
# * ...: <...>
Browse[1]> env_print(parent.env(environment()))
# <environment: 000000001AC20AF0>
# parent: <environment: global>
# bindings:
# * private: <env>
# * self: <S3: Factory>
Browse[1]> env_print(parent.env(parent.env(environment())))
# <environment: global>
# parent: <environment: package:rlang>
# bindings:
# * Work: <S3: R6ClassGenerator>
# * .Random.seed: <int>
# * Factory: <S3: R6ClassGenerator>
# * Task: <S3: R6ClassGenerator>