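I am testing the parallel processing offered by the RevoScaleR package with SQL Server 2016 R Services (In-Database), following the example Microsoft provides here: https://docs.microsoft.com/en-us/sql/advanced-analytics/tutorials/r-tutorial-custom-r-functions?view=sql-server-2016. However, contrary to what the documentation claims, I don't see any parallelism happening. Does anyone know why?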
SQL Server is installed locally on a machine with 8 cores. The only settings I added on top of the example are:
Setting elemType = 'cores' for rxExec.
Setting consoleOutput = TRUE for RxInSqlServer.
My test script in T-SQL is:
EXEC sp_execute_external_script @language = N'R',
@script = N'
# set up the connection string
sqlConnString <- "Driver=SQL Server;server=.;
database=master;
Trusted_Connection=True"
sqlCompute <- RxInSqlServer(connectionString = sqlConnString, consoleOutput = TRUE, numTasks = 4)
rxSetComputeContext(sqlCompute)

rollDice <- function()
{
    cat(paste0("R Process ID = ", Sys.getpid(), " started at ", Sys.time()))
    cat("\n")
    result <- NULL
    point <- NULL
    count <- 1
    while (is.null(result))
    {
        roll <- sum(sample(6, 2, replace = TRUE))
        if (is.null(point)) {
            point <- roll
        }
        if (count == 1 && (roll == 7 || roll == 11)) {
            result <- "Win"
        } else if (count == 1 && (roll == 2 || roll == 3 || roll == 12)) {
            result <- "Loss"
        } else if (count > 1 && roll == 7) {
            result <- "Loss"
        } else if (count > 1 && point == roll) {
            result <- "Win"
        } else {
            count <- count + 1
        }
    }
    cat(paste0("R Process ID = ", Sys.getpid(), "completed at ", Sys.time()))
    cat("\n")
    result
}

sqlServerExec <- rxExec(rollDice, timesToRun = 8, elemType = "cores", RNGseed = "auto")
return(NULL)',
@parallel = 1
Judging by the console output, the 8 runs were clearly executed sequentially:
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:10.60 ======
R Process ID = 7620 started at 2019-08-29 11:37:10.97
R Process ID = 7620completed at 2019-08-29 11:37:11.03
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:11.08 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:12.27 ======
R Process ID = 9072 started at 2019-08-29 11:37:12.80
R Process ID = 9072completed at 2019-08-29 11:37:12.84
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:12.88 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:14.29 ======
R Process ID = 8728 started at 2019-08-29 11:37:15.07
R Process ID = 8728completed at 2019-08-29 11:37:15.10
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:15.15 ======
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:16.31 ======
R Process ID = 8444 started at 2019-08-29 11:37:16.87
R Process ID = 8444completed at 2019-08-29 11:37:16.91
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:16.97 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:18.18 ======
R Process ID = 8244 started at 2019-08-29 11:37:18.72
R Process ID = 8244completed at 2019-08-29 11:37:18.85
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:18.93 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:20.00 ======
R Process ID = 2332 started at 2019-08-29 11:37:20.54
R Process ID = 2332completed at 2019-08-29 11:37:20.59
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:20.63 ======
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:21.62 ======
R Process ID = 336 started at 2019-08-29 11:37:22.24
R Process ID = 336completed at 2019-08-29 11:37:22.27
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:22.32 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:23.38 ======
R Process ID = 8280 started at 2019-08-29 11:37:23.88
R Process ID = 8280completed at 2019-08-29 11:37:23.91
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:23.96 ======
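To make the "sequential" observation concrete, here is a minimal sketch in plain R that checks whether any run started before the previous one completed. It assumes the "has started run" and "has completed run" timestamps above are copied correctly; `toSeconds` is a hypothetical helper written for this check and is not part of RevoScaleR.

```r
# Start/completion times of the eight runs, copied from the console output above.
started <- c("11:37:10.60", "11:37:12.27", "11:37:14.29", "11:37:16.31",
             "11:37:18.18", "11:37:20.00", "11:37:21.62", "11:37:23.38")
completed <- c("11:37:11.08", "11:37:12.88", "11:37:15.15", "11:37:16.97",
               "11:37:18.93", "11:37:20.63", "11:37:22.32", "11:37:23.96")

# Hypothetical helper: convert "HH:MM:SS.ss" to seconds since midnight.
toSeconds <- function(x) {
    parts <- strsplit(x, ":", fixed = TRUE)
    vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

s <- toSeconds(started)
e <- toSeconds(completed)

# A run overlaps its predecessor if it starts before the predecessor completed.
overlaps <- s[-1] < e[-length(e)]
print(any(overlaps))   # FALSE: no two runs overlap in time

# Idle gap (in seconds) between one run completing and the next one starting.
gaps <- round(s[-1] - e[-length(e)], 2)
print(gaps)
```

No run overlaps another, and there is a gap of roughly one second between each completion and the next start, which is what I would expect from runs launched one after another rather than in parallel.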