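I am trying to join some geo dataframes using the Dask Python package. While implementing my data processing algorithm, I ran into the following exception: AttributeError: 'DataFrame' object has no attribute '_example'

Here is my code: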

import dask.dataframe as dd
import dask_geopandas as dg
import geopandas as gpd
import pandas as pd
import dask


df1 = gpd.read_file("shapefile1.shp")
df2 = gpd.read_file("shapefile2.shp")

df1 = dd.from_pandas(df1, npartitions=1)
df2 = dd.from_pandas(df2, npartitions=1)

gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')

Here is my stack trace:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    gf2 = dg.sjoin(df1, df2, how='inner', op='intersects')
  File "/usr/local/lib/python3.6/dist-packages/dask_geopandas-0.0.1-py3.6.egg/dask_geopandas/core.py", line 413, in sjoin
    example = gpd.tools.sjoin(left._example, right._example, how=how, op=op)
  File "/home/mapseeuser/.local/lib/python3.6/site-packages/dask/dataframe/core.py", line 2414, in __getattr__
    raise AttributeError("'DataFrame' object has no attribute %r" % key)
AttributeError: 'DataFrame' object has no attribute '_example'

So, can anyone tell me what I am doing wrong and how to join two datasets using the Dask library?

1 Answer

Python packages:

sudo pip install dask[dataframe]
sudo pip install geopandas

Try this code:

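import dask.dataframe as dd
import geopandas as gpd
import pandas as pd
import dask

# Read both shapefiles as GeoDataFrames.
df1 = gpd.read_file("shapefile1.shp")
df2 = gpd.read_file("shapefile2.shp")

# gpd.sjoin expects GeoDataFrames, so pass them in directly rather than
# wrapping them with dd.from_pandas first.
# Note: newer geopandas releases name the `op` argument `predicate`.
gf2 = gpd.sjoin(df1, df2, how='inner', op='intersects')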
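
For context on the traceback in the question: it suggests that sjoin in dask_geopandas 0.0.1 reads a `_example` attribute from both inputs, and a frame built with dd.from_pandas does not have one, so the lookup falls through to `__getattr__` and raises. The sketch below imitates that pattern in plain Python. `PlainFrame`, `GeoFrame`, `sjoin_sketch` and `try_join` are hypothetical stand-ins written for illustration; they are not the real dask or dask_geopandas classes, and the idea that a geo-aware frame carries `_example` is an assumption drawn from the traceback only.

```python
class PlainFrame:
    """Hypothetical stand-in for a generic dask DataFrame."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getattr__(self, key):
        # Python calls __getattr__ only after normal attribute lookup fails.
        if key in self.columns:
            return "<column %s>" % key
        raise AttributeError("'DataFrame' object has no attribute %r" % key)


class GeoFrame(PlainFrame):
    """Hypothetical stand-in for a geo-aware frame that carries `_example`."""

    def __init__(self, columns, example):
        super().__init__(columns)
        self._example = example


def sjoin_sketch(left, right):
    # Same access pattern as the line shown in the traceback:
    # both inputs must expose `_example`.
    return (left._example, right._example)


def try_join(left, right):
    """Return the join result, or the error message if the lookup fails."""
    try:
        return sjoin_sketch(left, right)
    except AttributeError as exc:
        return str(exc)


plain_result = try_join(PlainFrame(["geometry"]), PlainFrame(["geometry"]))
geo_result = try_join(
    GeoFrame(["geometry"], "left sample"),
    GeoFrame(["geometry"], "right sample"),
)

print(plain_result)
print(geo_result)
```

With the plain stand-ins the call reproduces the message from the question; with the stand-ins that carry `_example` it goes through. If that reading is right, the input type handed to the join function is what matters, not the `how` or `op` arguments.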
answered 2018-05-25T10:01:33.870