r - Replacement function in R that takes no input

Question

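This seems closely related to several other questions that have already been asked (this one, for example), but I can't quite figure out how to do what I want. Maybe a replacement function is the wrong tool for the job, which would also be a perfectly acceptable answer. I'm more familiar with Python than with R, and I can easily think of how I would do this in Python, but I'm not sure how to approach it in R.

The problem: I'm trying to modify an object inside a function without having to return it, but I don't need to pass in the value that modifies it, because that value is the result of a function call that is already contained in the object.

More specifically, I have a list (technically it's an S3 class, but I don't think that is relevant to this question) that contains some things related to a process started with a processx::process$new() call. For reproducibility, here is a toy shell script you can run, along with the code to get my res object: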

echo '
echo $1
sleep 1s
echo "naw 1"
sleep 1s
echo "naw 2"
sleep 1s
echo "naw 3"
sleep 1s
echo "naw 4"
sleep 1s
echo "naw 5"
echo "All done."
' > naw.sh

Then my wrapper looks like this:

run_sh <- function(.args, ...) {
  p <- processx::process$new("sh", .args, ..., stdout = "|", stderr = "2>&1")
  return(list(process = p, orig_args = .args, output = NULL))
}

res <- run_sh(c("naw.sh", "hello"))

res should then look like:

$process
PROCESS 'sh', running, pid 19882.

$output
NULL

$orig_args
[1] "naw.sh" "hello"

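So, the specific problem here is somewhat particular to process$new, but I think the general principle is relevant. I'm trying to collect all of the output from the process once it finishes, but you can only call res$process$read_all_output_lines() (or its sibling functions) once, because the first call returns the result from the buffer and subsequent calls return nothing. Also, I'm going to launch a bunch of these and come back to "check on them" later, so I can't call res$process$read_all_output_lines() right away, because it would wait for the process to finish before returning, which is not what I want.

So I'm trying to store the output of that call in res$output, then keep it and return it on subsequent calls. Soooo... I need a function that modifies res by doing res$output <- res$process$read_all_output_lines().

Here is what I tried, based on guidance like this, but it didn't work.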

get_output <- function(.res) {
  # check if process is still alive (as of now, can only get output from finished process)
  if (.res$process$is_alive()) {
    warning(paste0("Process ", .res$process$get_pid(), " is still running. You cannot read the output until it is finished."))
    invisible()
  } else {
    # if output has not been read from buffer, read it
    if (is.null(.res$output)) {
      output <- .res$process$read_all_output_lines()
      update_output(.res) <- output
    }
    # return output
    return(.res$output)
  }
}

`update_output<-` <- function(.res, ..., value) {
  .res$output <- value
  .res
}

The first call to get_output(res) works, but it doesn't store the output in res$output for later access, so subsequent calls return nothing.
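The behavior described above follows from R's copy semantics for lists: the replacement call inside get_output() rebinds only the function's local .res, never the caller's variable. A minimal sketch of that effect on a plain list, with no processx involved (set_inside and res_demo are hypothetical names introduced only for this illustration):

```r
# Replacement function copied from the question.
`update_output<-` <- function(.res, ..., value) {
  .res$output <- value
  .res
}

# Hypothetical helper: it updates its argument, but only the local copy.
set_inside <- function(.res) {
  update_output(.res) <- "stored"
  .res$output  # the new value is visible inside the function
}

res_demo <- list(output = NULL)
inside <- set_inside(res_demo)

# The caller's list was never touched by the call above.
untouched <- is.null(res_demo$output)

# Used at the top level, the same replacement function does rebind res_demo,
# because the assignment now happens in the caller's own environment.
update_output(res_demo) <- "stored"
```

Here inside is "stored" and untouched is TRUE, while res_demo$output only becomes "stored" after the top-level replacement call.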

I also tried something like this:

`get_output2<-` <- function(.res, value) {
  # check if process is still alive (as of now, can only get output from finished process)
  if (.res$process$is_alive()) {
    warning(paste0("Process ", .res$process$get_pid(), " is still running. You cannot read the output until it is finished."))
    .res
  } else {
    # if output has not been read from buffer, read it
    if (is.null(.res$output)) {
      output <- .res$process$read_all_output_lines()
      update_output(.res) <- output
    }
    # return output
    print(value)
    .res
  }
}

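This just throws value away, but it feels silly, because you have to call it with an assignment like get_output2(res) <- "fake", which I hate.

Obviously I could also just return the modified res object, but I don't like that because the user has to know to do res <- get_output(res), and if they forget to do that (the first time), the output is lost to the ether and can never be recovered. Not good.

Any help is much appreciated!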

score 1 · Accepted Answer

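I may be missing something here, but why not write the output right after creating the object, so that it is already there the first time the function returns?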

run_sh <- function(.args, ...) 
{
  p <- processx::process$new("sh", .args, ..., stdout = "|", stderr = "2>&1")
  return(list(process = p, orig_args = .args, output = p$read_all_output_lines()))
}

So now if you do

res <- run_sh(c("naw.sh", "hello"))

you get

res
#> $`process`
#> PROCESS 'sh', finished.
#> 
#> $orig_args
#> [1] "naw.sh" "hello" 
#> 
#> $output
#>  [1] "hello"                                    
#>  [2] "naw.sh: line 2: sleep: command not found" 
#>  [3] "naw 1"                                    
#>  [4] "naw.sh: line 4: sleep: command not found" 
#>  [5] "naw 2"                                    
#>  [6] "naw.sh: line 6: sleep: command not found" 
#>  [7] "naw 3"                                    
#>  [8] "naw.sh: line 8: sleep: command not found" 
#>  [9] "naw 4"                                    
#> [10] "naw.sh: line 10: sleep: command not found"
#> [11] "naw 5"                                    
#> [12] "All done."

score 1 · Accepted Answer

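After further information from the OP, it seems that what is needed is a way to write to an existing variable in the environment from which the function is called. This can be done with non-standard evaluation: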

check_result <- function(process_list) 
{ 
  # Capture the name of the passed object as a string
  list_name <- deparse(substitute(process_list))

  # Check the object exists in the calling environment
  if(!exists(list_name, envir = parent.frame()))
     stop("Object '", list_name, "' not found")

  # Create a local copy of the passed object in function scope
  copy_of_process_list <- get(list_name, envir = parent.frame())

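  # If the process has completed and its output has not been stored yet,
  # write the output to the copy and assign the copy to the name of the
  # object in the calling frame. The is.null() guard keeps a later call from
  # overwriting the stored output with an empty read of the drained buffer.
  if(length(copy_of_process_list$process$get_exit_status()) > 0 &&
     is.null(copy_of_process_list$output))
  {
    copy_of_process_list$output <- copy_of_process_list$process$read_all_output_lines()
    assign(list_name, copy_of_process_list, envir = parent.frame()) 
  }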
  print(copy_of_process_list)
}

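This updates res if the process has finished; otherwise it leaves it alone. Either way, it prints the current contents. If this is client-facing code, you will want further type-checking logic on the object that is passed in.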
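To isolate the mechanism, here is a stripped-down sketch of the same deparse(substitute()) plus assign() pattern on a plain list. mark_done, job and jobs are hypothetical names used only for illustration, and the is.name() check is an added assumption about what "further checking" might look like, not part of the answer's code:

```r
# Hypothetical helper mirroring the pattern above on a plain list.
mark_done <- function(x) {
  # Name of the argument exactly as the caller wrote it
  name <- deparse(substitute(x))
  # Only a bare, existing variable name can be written back to
  if (!is.name(substitute(x)) || !exists(name, envir = parent.frame())) {
    stop("mark_done() needs an existing variable, got: ", name)
  }
  x$output <- "done"
  # Rebind the caller's variable to the modified copy
  assign(name, x, envir = parent.frame())
  invisible(x)
}

job <- list(output = NULL)
mark_done(job)  # job$output is now "done" in the calling environment

# An expression such as jobs$a is not a bare name, so it is rejected
# instead of silently creating a variable literally called "jobs$a".
jobs <- list(a = list(output = NULL))
err <- tryCatch(mark_done(jobs$a), error = function(e) conditionMessage(e))
```

The second call shows the main limitation of this approach: it can only write back to a simple variable in the immediate caller, not to list elements or to objects passed through another function.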

So I can do

res <- run_sh(c("naw.sh", "hello"))

and inspect the contents of res:

res
#> $`process`
#> PROCESS 'sh', running, pid 1112.
#> 
#> $orig_args
#> [1] "naw.sh" "hello" 
#> 
#> $output
#> NULL

If I immediately run:

check_result(res)
#> $`process`
#> PROCESS 'sh', running, pid 1112.
#> 
#> $orig_args
#> [1] "naw.sh" "hello" 
#> 
#> $output
#> NULL

We can see that the process has not finished yet. However, if I wait a few seconds and call check_result again, I get:

check_result(res)
#> $`process`
#> PROCESS 'sh', finished.
#> 
#> $orig_args
#> [1] "naw.sh" "hello" 
#> 
#> $output
#> [1] "hello"     "naw 1"     "naw 2"     "naw 3"     "naw 4"     "naw 5"    
#> [7] "All done."

And without explicitly writing to res, it has been updated by the function:

res
#> $`process`
#> PROCESS 'sh', finished.
#> 
#> $orig_args
#> [1] "naw.sh" "hello" 
#> 
#> $output
#> [1] "hello"     "naw 1"     "naw 2"     "naw 3"     "naw 4"     "naw 5"    
#> [7] "All done."
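A different route, not taken in the answers above, is to store the result in an environment instead of a list. Environments in R are modified in place, so a function can cache the output without returning anything and without looking up names in the caller. The sketch below is only an illustration under stated assumptions: fake_process is a hypothetical stand-in for a finished processx process (it imitates the read-once buffer described in the question), and new_result and res_env are made-up names.

```r
# Hypothetical stand-in for a finished processx process: like the buffer
# described in the question, it returns its lines once and nothing afterwards.
fake_process <- function(lines) {
  buffer <- lines
  list(
    is_alive = function() FALSE,
    read_all_output_lines = function() {
      out <- buffer
      buffer <<- character(0)
      out
    }
  )
}

# Hypothetical constructor: the result is an environment, not a list.
new_result <- function(process, orig_args) {
  res <- new.env()
  res$process <- process
  res$orig_args <- orig_args
  res$output <- NULL
  res
}

get_output <- function(.res) {
  if (.res$process$is_alive()) {
    warning("Process is still running; output is not available yet.")
    return(invisible(NULL))
  }
  # Environments are modified in place, so this assignment is visible
  # to the caller without returning .res.
  if (is.null(.res$output)) {
    .res$output <- .res$process$read_all_output_lines()
  }
  .res$output
}

res_env <- new_result(fake_process(c("hello", "All done.")), c("naw.sh", "hello"))
first <- get_output(res_env)   # reads the buffer and caches the lines
second <- get_output(res_env)  # served from the cache, buffer not read again
```

With a real process, the idea would be to pass the processx object to new_result() in place of the stand-in. The trade-off is that an environment no longer prints like a list, so a print method for the S3 class would probably be wanted.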
