The following script runs fine on a small sample:

EXECUTE sp_execute_external_script
@language = N'Python',
@script = N'
print(df_training["flResp"].value_counts())',
@input_data_1 = N'SELECT * FROM tb_training_teste',
@input_data_1_name = N'df_training';

I tested it with 8,419 records and the result looks fine, as shown below:

Mensagem(ns) STDOUT do script externo:
0 4964
1 3452
9 3
Name: flResp, dtype: int64

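However, my original table has more than 500,000 records, and I cannot run the script against it because of the error below. Can anyone help identify the problem and how to solve it?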

Error in execution. Check the output for more information.
MemoryError

SqlSatelliteCall error: Error in execution. Check the output for more information.
Mensagem(ns) STDOUT do script externo:
SqlSatelliteCall function failed. Please see the console output for more information.
Traceback (most recent call last):
File "C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\revoscalepy\computecontext\RxInSqlServer.py", line 587, in rx_sql_satellite_call
rx_native_call("SqlSatelliteCall", params)
File "C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\revoscalepy\RxSerializable.py", line 358, in rx_native_call
ret = px_call(functionname, params)
RuntimeError: revoscalepy function failed.

1 Answer

I just ran into this problem, so I'll add my findings in case they help someone in the future.

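It seems that SQL Server uses resource pools to limit the resources available to external processes (such as Python and R). The current allocations can be viewed with: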

SELECT * FROM sys.resource_governor_external_resource_pools

By default there is a single pool (named default) with a max_memory_percent of 20. This can be increased with the ALTER EXTERNAL RESOURCE POOL command, for example:

ALTER EXTERNAL RESOURCE POOL [default]
WITH (
    MAX_MEMORY_PERCENT = 95
)
GO

ALTER RESOURCE GOVERNOR RECONFIGURE
GO

Alternatively, it can be changed in SQL Server Management Studio.
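
Whichever route is used, the new limit can be confirmed by querying the same catalog view again. This is a minimal sketch that assumes the pool is still named default; the column list is simply narrowed to the limits the view exposes:

```sql
-- Check the limits currently configured for the default external pool.
SELECT name, max_cpu_percent, max_memory_percent, max_processes
FROM sys.resource_governor_external_resource_pools
WHERE name = N'default';
```

After the reconfigure, max_memory_percent should show the value that was set (95 in the example above).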

Answered 2021-09-07T04:37:27.133