0

我正在做一个非常简单的数据转换,Dask_ML我收到了这个错误,我想知道是否有人遇到过这个问题。看起来像可以修改的系统设置?

df.head()

ReportDate_Time 2015-05-01  2015-06-01  2015-07-01  2015-08-01  2015-09-01  2015-10-01  2015-11-01  2015-12-01  2016-01-01  2016-02-01  ... 2017-04-01  2017-05-01  2017-06-01  2017-08-01  2017-09-01  2017-10-01  2017-11-01  2017-12-01  2018-01-01  2018-02-01
0   33.0    32.0    32.0    24.0    24.0    32.0    31.0    31.0    31.0    35.0    ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1   670.0   978.0   1061.0  1048.0  525.0   936.0   804.0   1067.0  859.0   647.0   ... 407.0   457.0   517.0   388.0   345.0   428.0   688.0   1486.0  1090.0  0.0
2   132.0   130.0   127.0   137.0   92.0    96.0    112.0   124.0   126.0   112.0   ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3   206.0   259.0   207.0   181.0   74.0    142.0   125.0   172.0   138.0   194.0   ... 188.0   211.0   239.0   179.0   159.0   197.0   318.0   315.0   189.0   190.0
4   290.0   311.0   401.0   381.0   138.0   315.0   275.0   407.0   408.0   419.0   ... 76.0    56.0    159.0   16.0    0.0 0.0 213.0   123.0   4.0 3.0
5 rows × 33 columns

from dask_ml.preprocessing import StandardScaler

scaler = StandardScaler().fit_transform(df_pivot)

Gives Error:

CancelledError: 
distributed.client - ERROR - Failed to reconnect to scheduler after 10.00 seconds, closing client
distributed.utils - ERROR - 
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/distributed/utils.py", line 662, in log_errors
    yield
  File "/home/user/.local/lib/python3.6/site-packages/distributed/client.py", line 1290, in _close
    await gen.with_timeout(timedelta(seconds=2), list(coroutines))
concurrent.futures._base.CancelledError
distributed.utils - ERROR - 
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/distributed/utils.py", line 662, in log_errors
    yield
  File "/home/user/.local/lib/python3.6/site-packages/distributed/client.py", line 1019, in _reconnect
    await self._close()
  File "/home/user/.local/lib/python3.6/site-packages/distributed/client.py", line 1290, in _close
    await gen.with_timeout(timedelta(seconds=2), list(coroutines))
concurrent.futures._base.CancelledError

有任何想法吗?

4

2 回答 2

1

您可以使用以下内容在 dask.distributed 中配置超时:

distributed:
  comm:
    timeouts:
      tcp: 50s              # time before calling an unresponsive connection dead

或在代码中:

import dask
import distributed
dask.config.set({"distributed.comm.timeouts.tcp": "50s"})
于 2020-02-06T19:45:36.893 回答
0

自@quasiben 的回答以来,代码可能已经改变。根据客户端的构造函数,配置名称不同。

可以明确设置超时:

Client(..., timeout="50s")

或在配置中(“连接”而不是“tcp”)

import dask
import distributed
dask.config.set({"distributed.comm.timeouts.connect": "50s"})
于 2020-06-01T03:53:45.453 回答