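I am testing the parallel processing offered by the RevoScaleR package with SQL Server 2016 R Services (In-Database), following the example Microsoft provides here: https://docs.microsoft.com/en-us/sql/advanced-analytics/tutorials/r-tutorial-custom-r-functions?view=sql-server-2016 . However, contrary to what the documentation claims, I do not see any parallelism happening. Does anyone know why?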

SQL Server is installed locally on a machine with 8 cores. The only additional settings I made on top of the example are:

  • Set elemType = 'cores' for rxExec.

  • Set consoleOutput = TRUE for RxInSqlServer.

My test script in T-SQL is:

  EXEC sp_execute_external_script @language = N'R',  
     @script = N'
       # set up the connection string
      sqlConnString <- "Driver=SQL Server;server=.; 
                              database=master; 
                              Trusted_Connection=True"

      sqlCompute <- RxInSqlServer(connectionString = sqlConnString, consoleOutput = TRUE, numTasks= 4)
        rxSetComputeContext(sqlCompute)

        rollDice <- function()
        {
          cat(paste0("R Process ID = ", Sys.getpid(), " started at ", Sys.time()))
          cat("\n")
          result <- NULL
          point <- NULL
          count <- 1
          while (is.null(result))
          {
            roll <- sum(sample(6, 2, replace=TRUE))

            if (is.null(point))
            { point <- roll }
            if (count == 1 && (roll == 7 || roll == 11))
            {  result <- "Win" }
            else if (count == 1 && (roll == 2 || roll == 3 || roll == 12))
            { result <- "Loss" }
            else if (count > 1 && roll == 7 )
            { result <- "Loss" }
            else if (count > 1 && point == roll)
            { result <- "Win" }
            else { count <- count + 1 }
          }
          cat(paste0("R Process ID = ", Sys.getpid(), "completed at ", Sys.time()))
          cat("\n")
          result
        }

        sqlServerExec <- rxExec(rollDice, timesToRun=8, elemType = "cores", RNGseed="auto")
        return(NULL)', 
  @parallel = 1

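According to the console output, the 8 runs were clearly executed sequentially, each in a new R process that starts only after the previous one has completed: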

STDOUT message(s) from external script: 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:10.60  ====== 
R Process ID = 7620 started at 2019-08-29 11:37:10.97 
R Process ID = 7620completed at 2019-08-29 11:37:11.03 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:11.08  ====== 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:12.27  ====== 
R Process ID = 9072 started at 2019-08-29 11:37:12.80 
R Process ID = 9072completed at 2019-08-29 11:37:12.84 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:12.88  ====== 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:14.29  ====== 
R Process ID = 8728 started at 2019-08-29 11:37:15.07 
R Process ID = 8728completed at 2019-08-29 11:37:15.10 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:15.15  ====== 
STDOUT message(s) from external script: 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:16.31  ====== 
R Process ID = 8444 started at 2019-08-29 11:37:16.87 
R Process ID = 8444completed at 2019-08-29 11:37:16.91 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:16.97  ====== 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:18.18  ====== 
R Process ID = 8244 started at 2019-08-29 11:37:18.72 
R Process ID = 8244completed at 2019-08-29 11:37:18.85 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:18.93  ====== 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:20.00  ====== 
R Process ID = 2332 started at 2019-08-29 11:37:20.54 
R Process ID = 2332completed at 2019-08-29 11:37:20.59 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:20.63  ====== 
STDOUT message(s) from external script: 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:21.62  ====== 
R Process ID = 336 started at 2019-08-29 11:37:22.24 
R Process ID = 336completed at 2019-08-29 11:37:22.27 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:22.32  ====== 
======  WIN-6L7QANR32DF  ( process  1 ) has started run at  2019-08-29 11:37:23.38  ====== 
R Process ID = 8280 started at 2019-08-29 11:37:23.88 
R Process ID = 8280completed at 2019-08-29 11:37:23.91 
======  WIN-6L7QANR32DF  ( process  1 ) has completed run at  2019-08-29 11:37:23.96  ====== 
1 Answer

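Microsoft's documentation appears to be misleading. Changing the compute context to RxInSqlServer does not seem to run the tasks in parallel, whereas using RxLocalParallel does.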
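
As a rough illustration of that switch, here is a minimal sketch of the R portion only, assuming it runs in a session where RevoScaleR is available (for example inside the @script body of sp_execute_external_script). The helper reportPid is a hypothetical stand-in for rollDice; it only returns the worker's process ID, so distinct IDs in the result suggest the runs were spread over several R processes. I have not verified how many workers are started on any particular setup.

```r
library(RevoScaleR)

# Use the local parallel compute context instead of RxInSqlServer.
rxSetComputeContext(RxLocalParallel())

# Hypothetical helper for illustration: report which R process ran the task.
reportPid <- function() {
  Sys.getpid()
}

# Run the helper 8 times; rxExec returns a list with one element per run.
pids <- rxExec(reportPid, timesToRun = 8)

# Count how many distinct R processes handled the 8 runs.
length(unique(unlist(pids)))
```

If the count is greater than 1, the runs were handled by more than one R process, which is the behavior the RxInSqlServer context did not show in the output above.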

Answered 2019-08-29T17:11:13.523