t is a dask array. I want to plot a histogram of t. The Dask documentation lists the method

dask.array.histogram(a, bins=None, range=None, normed=False, weights=None, density=None)

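but gives no examples. I tried setting bins with a NumPy array; that did not work. I also tried matplotlib.pyplot, which ran for more than 5 minutes without producing anything (my dataset is very large, on the order of gigabytes, but that still seems like a long time).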

2 Answers

The hvplot library can plot histograms of a Dask DataFrame.

The following is pseudocode: dd is a Dask DataFrame, and the histogram is plotted for the feature named feature_one.

import hvplot.dask

dd.hvplot.hist(y="feature_one")

Installing the library with conda is recommended:

conda install -c conda-forge hvplot
answered 2020-11-19T04:33:08.590

dask.array.histogram needs both bins and range, which set the number of bins and the minimum/maximum extent of the data, respectively. Here is a simple example:

In [1]: import dask.array as da

In [2]: x = da.random.normal(10, 0.1, size=(100000,), chunks=(1000,))  # random dataset 

In [3]: h, bins = da.histogram(x, bins=100, range=[9, 11])

In [4]: bins
Out[4]: 
array([  9.  ,   9.02,   9.04,   9.06,   9.08,   9.1 ,   9.12,   9.14,
         9.16,   9.18,   9.2 ,   9.22,   9.24,   9.26,   9.28,   9.3 ,
         9.32,   9.34,   9.36,   9.38,   9.4 ,   9.42,   9.44,   9.46,
         9.48,   9.5 ,   9.52,   9.54,   9.56,   9.58,   9.6 ,   9.62,
         9.64,   9.66,   9.68,   9.7 ,   9.72,   9.74,   9.76,   9.78,
         9.8 ,   9.82,   9.84,   9.86,   9.88,   9.9 ,   9.92,   9.94,
         9.96,   9.98,  10.  ,  10.02,  10.04,  10.06,  10.08,  10.1 ,
        10.12,  10.14,  10.16,  10.18,  10.2 ,  10.22,  10.24,  10.26,
        10.28,  10.3 ,  10.32,  10.34,  10.36,  10.38,  10.4 ,  10.42,
        10.44,  10.46,  10.48,  10.5 ,  10.52,  10.54,  10.56,  10.58,
        10.6 ,  10.62,  10.64,  10.66,  10.68,  10.7 ,  10.72,  10.74,
        10.76,  10.78,  10.8 ,  10.82,  10.84,  10.86,  10.88,  10.9 ,
        10.92,  10.94,  10.96,  10.98,  11.  ])

In [5]: h.compute()
Out[5]: 
array([   0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    1,    1,    4,   15,
         19,   71,  132,  231,  376,  604,  891, 1307, 1884, 2635, 3422,
       4276, 5455, 6158, 7092, 7759, 7933, 7994, 7625, 6994, 6194, 5315,
       4272, 3381, 2529, 1803, 1324,  912,  594,  331,  225,  127,   54,
         32,   12,   10,    2,    2,    1,    1,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0])
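
Since the question asks for a plot, the counts and bin edges from the session above can be passed to matplotlib once they have been computed. The following is a minimal sketch, not a definitive recipe: the random array stands in for the real `t`, the output file name `histogram.png` is an arbitrary choice, and the `Agg` backend is selected only so that the script runs without a display. It also shows the alternative of passing explicit bin edges as a NumPy array, in which case `range` is omitted.

```python
import numpy as np
import dask
import dask.array as da
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive backend
import matplotlib.pyplot as plt

# Stand-in for the large dask array `t` from the question (assumed data)
t = da.random.normal(10, 0.1, size=(100_000,), chunks=(1_000,))

# Option 1: integer bin count plus an explicit range
h, edges = da.histogram(t, bins=100, range=[9, 11])

# Option 2: explicit bin edges as a NumPy array (no range argument)
edges_in = np.linspace(9, 11, 101)
h_edges, _ = da.histogram(t, bins=edges_in)

# Both histograms are lazy; compute them in a single pass over the data.
# Only the 100 bin counts per histogram are brought into memory.
counts, counts_edges = dask.compute(h, h_edges)

# Draw the precomputed histogram as a bar chart
edges = np.asarray(edges)
fig, ax = plt.subplots()
ax.bar(edges[:-1], counts, width=np.diff(edges), align="edge")
ax.set_xlabel("value")
ax.set_ylabel("count")
fig.savefig("histogram.png")  # hypothetical output path
```

Calling `plt.hist(t)` directly makes matplotlib pull the entire array into memory first, which is probably why the attempt in the question took so long; here only the reduced counts are handed to matplotlib.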
answered 2016-06-28T21:46:48.757