I'm trying to run dask.cluster.Kmeans on a large amount of data. This works on the CPU, since I wrap numpy with dask.array. Because [...] is not in cupy.

I tried to reproduce Matthew Rocklin's example of generating a random dask array from the CuPy random generator ( https://blog.dask.org/2019/01/03/dask-array-gpus-first-steps ). It works, but that is not the case I want to use.

Wrapping a cupy array with dask.array does not work.

>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000)).sum().compute()

I expected the sum of this array, but got the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 175, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 446, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/threaded.py", line 82, in get
    **kwargs
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 491, in get_async
    raise_exception(exc, tb)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/compatibility.py", line 130, in reraise
    raise exc
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 233, in execute_task
    result = _execute_task(task, data)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/array/core.py", line 100, in getter
    c = np.asarray(c)
  File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/numpy/core/numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: object __array__ method not producing an array

So how can I get CuPy working through dask arrays?

1 Answer

When creating a Dask array from a CuPy array, you need to pass the keyword argument asarray=False to da.from_array. Your code would then look like this.

>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000), asarray=False).sum().compute()
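
For a slightly fuller picture, here is a minimal sketch of the same idea as a script. The traceback in the question shows Dask calling np.asarray on each chunk; asarray=False asks it to pass the chunks through unchanged, so they stay CuPy arrays. The chunk size of 10000, the explicit int64 dtype, and the fallback to NumPy when no usable GPU is found are my own assumptions for illustration and are not part of the original code. I believe newer Dask releases can detect this case on their own, but check the from_array documentation for your version.

```python
import dask.array as da

# Assumption for illustration: fall back to NumPy when CuPy or a GPU
# is not available, so the sketch also runs on a CPU-only machine.
try:
    import cupy as xp
    xp.arange(1)  # touches the device; raises if there is no usable GPU
except Exception:
    import numpy as xp

# Same array as in the question, with an explicit 64-bit dtype.
x = xp.arange(100000, dtype="int64")

# asarray=False: do not call np.asarray on the chunks, keep them as they are.
# chunks=10000 is an arbitrary choice that splits the data into 10 blocks.
d = da.from_array(x, chunks=10000, asarray=False)

total = d.sum().compute()   # reduction runs chunk by chunk
first = d[:5].compute()     # a small slice, to inspect the chunk type

print(int(total))
print(type(first))
```

If CuPy is active, the printed type should be a CuPy array rather than a NumPy one, which is a quick way to confirm that the data never left the GPU.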
answered 2019-06-27T20:38:35.277