I am trying to use dask_cudf to read a single large Parquet file (size > GPU memory), but it is currently being read into a single partition, which I guess is the expected behavior based on the dask docstring:
dask.dataframe.read_parquet(path, columns=None, filters=None, categories=None, index=None, storage_options=None, engine='auto', gather_statistics=None, **kwargs)
Read a Parquet file into a Dask DataFrame.
This reads a directory of Parquet data into a Dask.dataframe, one file per partition.
It selects the index among the sorted columns if any exist.
Is there a workaround that would let me read it into multiple partitions?
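One possible workaround, sketched below under the assumption that your dask/dask_cudf version supports the split_row_groups argument to read_parquet (the path is a placeholder):

```python
import dask_cudf

# Hypothetical path: a single Parquet file larger than GPU memory,
# written with multiple row groups.
path = "large_file.parquet"

# split_row_groups=True asks read_parquet to emit roughly one partition
# per Parquet row group instead of one per file, so a single large file
# can be split across many partitions.
ddf = dask_cudf.read_parquet(path, split_row_groups=True)

print(ddf.npartitions)
```

Note that this splits at row-group granularity, so if the file was written as one giant row group it will not help; the file would first have to be rewritten with smaller row groups before any reader can partition it this way.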