r - Is S4 method dispatch slow?

Question

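My S4 class has a method that is called many times. I noticed that it runs much more slowly than a similar function called on its own. So I added a slot of type "function" to my class and used that function instead of the method. The example below shows two ways of doing this, and both run much faster than the corresponding method. The example also suggests that the method's lower speed is not caused by having to retrieve data from the object, because the functions are faster even when they do that too.

Of course, this way of doing things is not ideal. Is there a way to speed up method dispatch? Any suggestions?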

    setClass(Class = "SpeedTest", 
      representation = representation(
        x = "numeric",
        foo1 = "function",
        foo2 = "function"
      )
    )

    speedTest <- function(n) {
      new("SpeedTest",
        x = rnorm(n),
        foo1 = function(z) sqrt(abs(z)),
        foo2 = function() {}
      )
    }

    setGeneric(
      name = "method.foo",
      def = function(object) {standardGeneric("method.foo")}
    )
    setMethod(
      f = "method.foo", 
      signature = "SpeedTest",
      definition = function(object) {
        sqrt(abs(object@x))
      }
    )

    setGeneric(
      name = "create.foo2",
      def = function(object) {standardGeneric("create.foo2")}
    )
    setMethod(
      f = "create.foo2", 
      signature = "SpeedTest",
      definition = function(object) {
        z <- object@x
        object@foo2 <- function() sqrt(abs(z))

        object
      }
    )

    > st <- speedTest(1000)
    > st <- create.foo2(st)
    > 
    > iters <- 100000
    > 
    > system.time(for (i in seq(iters)) method.foo(st)) # slowest by far
       user  system elapsed 
       3.26    0.00    3.27 

    > # much faster 
    > system.time({foo1 <- st@foo1; x <- st@x; for (i in seq(iters)) foo1(x)}) 
       user  system elapsed 
       1.47    0.00    1.46 

    > # retrieving st@x instead of x does not affect speed
    > system.time({foo1 <- st@foo1; for (i in seq(iters)) foo1(st@x)}) 
       user  system elapsed 
       1.47    0.00    1.49 

    > # same speed as foo1 although no explicit argument
    > system.time({foo2 <- st@foo2; for (i in seq(iters)) foo2()}) 
       user  system elapsed 
       1.44    0.00    1.45 

    > # cannot increase speed by using a lambda to "eliminate" the argument of method.foo
    > system.time({foo <- function() method.foo(st); for (i in seq(iters)) foo()})
       user  system elapsed 
       3.28    0.00    3.29 

score 14 · Accepted Answer

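The cost is in method look-up, which starts from scratch in each iteration of your timing. You can short-circuit this by working out the method dispatch once: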

    METHOD <- selectMethod(method.foo, class(st))
    for (i in seq(iters)) METHOD(st)
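A self-contained sketch of the same idea, for anyone who wants to try it without the question's full setup. The class here is a cut-down stand-in for `SpeedTest` (only the `x` slot), and `iters` is reduced to keep the run short; both are assumptions made for illustration. No timings are quoted because they depend on the machine and R version.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# Cut-down stand-in for the question's class (numeric slot only)
setClass("SpeedTest", representation(x = "numeric"))
setGeneric("method.foo", function(object) standardGeneric("method.foo"))
setMethod("method.foo", "SpeedTest", function(object) sqrt(abs(object@x)))

st <- new("SpeedTest", x = rnorm(1000))
iters <- 10000  # smaller than the question's 100000

# Resolve the method once, outside the loop
METHOD <- selectMethod("method.foo", "SpeedTest")

# The selected method returns the same value as a call through the generic
same_result <- identical(METHOD(st), method.foo(st))

# Time both variants; compare the elapsed values on your own machine
t_generic <- system.time(for (i in seq_len(iters)) method.foo(st))
t_cached  <- system.time(for (i in seq_len(iters)) METHOD(st))
print(c(generic = t_generic[["elapsed"]], cached = t_cached[["elapsed"]]))
```

Note that `METHOD` is tied to the class it was selected for, so it should only be reused on objects of that class.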

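Improving method look-up would be a very interesting and worthwhile project; other dynamic languages have learned valuable lessons here, for example the inline caching mentioned on Wikipedia's dynamic dispatch page.

I wonder whether you are making so many method calls because your data representation and methods are not fully vectorized?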
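To make the caching idea concrete, here is a rough user-level sketch: a wrapper that remembers the selected method per class, so `selectMethod` runs only the first time a class is seen. `make_cached_caller` is a hypothetical helper written for this illustration, not part of R or the methods package, and it says nothing about how R implements dispatch internally. It handles single dispatch on one argument only, and the cache would go stale if methods were redefined later. The wrapper adds a function call of its own, so whether it helps should be measured.

```r
library(methods)

setClass("SpeedTest", representation(x = "numeric"))
setGeneric("method.foo", function(object) standardGeneric("method.foo"))
setMethod("method.foo", "SpeedTest", function(object) sqrt(abs(object@x)))

# Hypothetical helper: wrap a generic (given by name) in a caller that
# remembers the selected method for each class it has already seen.
make_cached_caller <- function(generic_name) {
  cache <- new.env(parent = emptyenv())
  function(object) {
    key <- as.character(class(object))[1L]
    method <- cache[[key]]          # NULL if this class is not cached yet
    if (is.null(method)) {
      method <- selectMethod(generic_name, key)  # slow path, once per class
      assign(key, method, envir = cache)
    }
    method(object)                  # fast path: call the stored method
  }
}

cached_foo <- make_cached_caller("method.foo")
st <- new("SpeedTest", x = c(-4, 9, -16))
first  <- cached_foo(st)  # looks the method up and stores it
second <- cached_foo(st)  # reuses the stored method
```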
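As an illustration of what fuller vectorization could look like, the sketch below compares one object per value (one dispatch per value) with a single object holding the whole vector (one dispatch in total). The class is a cut-down stand-in for the question's `SpeedTest`, and the data are made up for the example.

```r
library(methods)

setClass("SpeedTest", representation(x = "numeric"))
setGeneric("method.foo", function(object) standardGeneric("method.foo"))
setMethod("method.foo", "SpeedTest", function(object) sqrt(abs(object@x)))

values <- c(-4, 9, -16)

# One object per value: the generic is dispatched once for every element
objs <- lapply(values, function(v) new("SpeedTest", x = v))
per_object <- vapply(objs, method.foo, numeric(1))

# One object holding the whole vector: a single dispatch does all the work
all_at_once <- method.foo(new("SpeedTest", x = values))
```

Both approaches give the same numbers, but the second pays the dispatch cost once instead of once per element.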

score 6 · Accepted Answer

This doesn't help you directly with your problem, but it's much easier to benchmark this sort of thing with the microbenchmark package:

    f <- function(x) NULL

    s3 <- function(x) UseMethod("s3")
    s3.integer <- function(x) NULL

    A <- setClass("A", representation(a = "list"))
    setGeneric("s4", function(x) standardGeneric("s4"))
    setMethod(s4, "A", function(x) NULL)

    B <- setRefClass("B")
    B$methods(r5 = function(x) NULL)

    a <- A()
    b <- B$new()

    library(microbenchmark)
    options(digits = 3)
    microbenchmark(
      bare = NULL,
      fun = f(),
      s3 = s3(1L),
      s4 = s4(a),
      r5 = b$r5()
    )
    # Unit: nanoseconds
    #  expr   min    lq median    uq   max neval
    #  bare    13    20     22    29    36   100
    #   fun   171   236    270   310   805   100
    #    s3  2025  2478   2651  2869  8603   100
    #    s4 10017 11029  11528 11905 36149   100
    #    r5  9080 10003  10390 10804 61864   100

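On my computer, the bare call takes about 20 ns. Wrapping it in a function adds about 200 ns; this is the cost of creating the environment in which the function executes. S3 method dispatch adds around 3 µs, and S4 and reference classes around 12 µs.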
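The per-call environment can be observed directly. In this small sketch, `f_env` is a hypothetical function that simply returns the environment it runs in; two calls return two distinct environments, which is the setup work each function call pays for.

```r
# Hypothetical helper: return the environment created for this call
f_env <- function() environment()

e1 <- f_env()
e2 <- f_env()

# Each call gets its own fresh environment, so the two are not the same object
fresh_each_call <- !identical(e1, e2)
print(fresh_each_call)
```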
