I am trying to use Dask to speed up computation, because a logistic regression aborted after 17 hours on my system. My data set has about 1 million rows.

I first ran these commands:

import dask.array as da
import dask.dataframe as dd
from dask.distributed import Client 
client = Client() 
from dask.distributed import Client 
client = Client()

The above commands ran but threw a warning:

C:\ProgramData\Anaconda3\lib\site-packages\distributed\bokeh\core.py:57: UserWarning:
Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
  warnings.warn('\n' + msg)

Then I ran these commands:

import dask_ml.joblib
from sklearn.externals import joblib

Error: AttributeError: module 'dask.array' has no attribute 'blockwise'

Can anyone help me with how to resolve this?

1 Answer

You should not set up two local clusters, which is what calling Client() twice does. That is why you see the warning about port 8787 being unavailable.
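One way to avoid starting a second cluster by accident is to reuse the existing client when there is one. The sketch below is only an illustration: get_or_create_client is a hypothetical helper name, not part of dask, and processes=False (a threads-only, in-process cluster) is an assumption made to keep the example light; the question used the default Client().

```python
from dask.distributed import Client, get_client


def get_or_create_client(**kwargs):
    """Return the already-running client if one exists, else start a new one.

    Hypothetical helper for illustration; kwargs are passed to Client().
    """
    try:
        # get_client() raises ValueError when no client has been created yet.
        return get_client()
    except ValueError:
        # No client yet: this starts a local cluster and connects to it.
        return Client(**kwargs)


# The guard matters on Windows when the default process-based cluster is used.
if __name__ == "__main__":
    client = get_or_create_client(processes=False)  # assumption: threads only
    again = get_or_create_client(processes=False)   # reuses the first client
    print(client is again)
    client.close()
```

In a notebook, re-running the cell then hands back the same client instead of launching another cluster on a random dashboard port.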

Error: AttributeError: module 'dask.array' has no attribute 'blockwise'

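I can assure you that blockwise is indeed part of dask, so this suggests that your environment may not be set up correctly. Without more details about how you installed things and which versions you have, it is hard to say more. Did you run client.get_versions()?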
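If it helps, here is a minimal sketch for listing what is installed in the current environment, using only the standard library (Python 3.8+). The package list is an assumption based on the imports in the question; adjust it to whatever you actually installed. One plausible cause, not confirmed here, is an older dask sitting next to a newer dask-ml.

```python
from importlib import metadata

# Assumed package list for illustration, taken from the imports in the question.
PACKAGES = ("dask", "distributed", "dask-ml", "scikit-learn")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Once a client is running, client.get_versions(check=True) goes further: it collects versions from the scheduler, the workers, and the client, and raises an error if they do not match.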

Answered 2020-06-01T18:42:56.123