+--------------------+--------------------+
|              _VALUE|             paraarr|
+--------------------+--------------------+
|Archer, Edward Pa...|[, [[, Arbitrator...|
|Archer, Edward Pa...|[, [[, Member:],,...|
|Archer, Edward Pa...|[, [[, Experience...|
|Archer, Edward Pa...|[, [[, Publicatio...|
|Belcher, A. Lee (...|[, [[, Arbitrator...|
|Belcher, A. Lee (...|[, [[, Member:],,...|
|Belcher, A. Lee (...|[, [[, Experience...|
|Bloodsworth, Davi...|[, [[, Arbitrator...|
|Bloodsworth, Davi...|[, [[, Member:],,...|
|Bloodsworth, Davi...|[, [[, Experience...|
|Bloodsworth, Davi...|[, [[, Public Sec...|
|Bloodsworth, Davi...|[, [[, Issue:],,,,]]|
|Bloodsworth, Davi...|[, [[, Industry:]...|
|Brent, Daniel F. ...|[, [[, Arbitrator...|
|Brent, Daniel F. ...|[, [[, Profession...|
|Brent, Daniel F. ...|[, [[, Arbitratio...|
|Brent, Daniel F. ...|[, [[, Permanent ...|
|Brent, Daniel F. ...|[, [[, Issues:],,...|
|Brent, Daniel F. ...|[, [[, Industries...|
|Chiesa, Mario (Mi...|[, [[, Arbitrator...|
+--------------------+--------------------+

Using the AWS Glue SQL context, I want to explore the paraarr column and extract data from it. I am new to AWS Glue and do not know how to do this.

I ran the code below to get this column.

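from pyspark.sql.functions import *
# Register the DynamicFrame's DataFrame as temp view x3
dyf1.toDF().createOrReplaceTempView("x3")
# Note: this query reads from view x2, not the x3 registered above
df1 = spark.sql("select p._VALUE, explode(para_array) as paraarr from x2")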