I'm new to Spark and I'm stuck on how to run a SQL query against my DataFrames.
I have the following two DataFrames.
dataframe_1
+-----------------+-----------------+----------------------+---------------------+
|id |geometry_tyme |geometry |rayon |
+-----------------+-----------------+----------------------+---------------------+
|50 |Polygon |[00 00 00 00 01 0...] |200 |
|54 |Point |[00 00 00 00 01 0.. ] |320179 |
+-----------------+-----------------+----------------------+---------------------+
dataframe_2
+-----------------+-----------------+----------------------+
|id2 |long |lat |
+-----------------+-----------------+----------------------+
|[70,50,600,] | -9.6198783 |44.5942549 |
|[20,140,39,] |-6.6198783 |44.5942549 |
+-----------------+-----------------+----------------------+
I want to execute the following query, where id2, long and lat refer to the columns of dataframe_2:
"SELECT dataframe_1.* FROM dataframe_1 WHERE dataframe_1.id IN ("
+ id2
+ ") AND ((dataframe_1.geometry_tyme='Polygon' AND (ST_WITHIN(ST_GeomFromText(CONCAT('POINT(',"
+ long
+ ",' ',"
+ lat
+ ",')'),4326),dataframe_1.geometry))) OR ( (dataframe_1.geometry_tyme='LineString' OR dataframe_1.geomType='Point') AND ST_Intersects(ST_buffer(dataframe_1.geom,(dataframe_1.rayon/100000)),ST_GeomFromText(CONCAT('POINT(',"
+ long
+ ",' ',"
+ lat
+ ",')'),4326)))) "
I'm really stuck. Should I join the two DataFrames, or do something else? I tried joining them on id and id2 like this:
dataframe_2.select(explode(col("id2")).as("id2"))
           .join(dataframe_1, col("id2").equalTo(dataframe_1.col("id")));
But it seems to me that a join on its own is not the right choice.
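For completeness, this is roughly how I pictured combining that join with the spatial filter through expr(); again this assumes the ST_ functions are registered as Spark SQL functions on the session, and the final select(dataframe_1.col("*")) is my guess at reproducing SELECT dataframe_1.* :

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;
import static org.apache.spark.sql.functions.expr;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Explode id2 so each element can be matched against dataframe_1.id, then join.
Dataset<Row> joined = dataframe_2
        .select(explode(col("id2")).as("id2"), col("long"), col("lat"))
        .join(dataframe_1, col("id2").equalTo(dataframe_1.col("id")));

// Reuse the spatial predicate from the query above; this only works if the
// ST_ functions are registered as Spark SQL functions.
Dataset<Row> result = joined
        .where(expr(
              "(geometry_tyme = 'Polygon' "
            + " AND ST_Within(ST_GeomFromText(CONCAT('POINT(', long, ' ', lat, ')'), 4326), geometry)) "
            + "OR ((geometry_tyme = 'LineString' OR geometry_tyme = 'Point') "
            + "    AND ST_Intersects(ST_Buffer(geometry, rayon / 100000), "
            + "                      ST_GeomFromText(CONCAT('POINT(', long, ' ', lat, ')'), 4326)))"))
        .select(dataframe_1.col("*"));   // keep only dataframe_1's columns

I'm not sure this is the right direction.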
I need your help.
Thanks.