I can't save a table to any of several different sinks.
Here is what I tried:
# Attempt 1: collect to the driver and write a local CSV
dataset.toPandas().to_csv("local_path")

# Attempt 2: register a temp view and recreate the table in SQL
dataset.createOrReplaceTempView("tempTable")
spark.sql("DROP TABLE IF EXISTS impala_table")
spark.sql("CREATE TABLE IF NOT EXISTS impala_table AS "
          "SELECT * FROM tempTable")

# Attempt 3: save directly as a managed table
dataset.write.mode("overwrite").saveAsTable("impala_table")

# Attempt 4: write the DataFrame out as CSV
dataset.write.csv(file, header=True, mode="overwrite")
So my inference is that the data is not being written out in any form at all, but I don't know how to find out more about what is going wrong. The error logs, even when not identical, are very similar. The strangest thing I see is a reference to a module named "src" that is not found. This is the most repeated and relevant part:
/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o877.saveAsTable.
: org.apache.spark.SparkException: Job aborted.
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:224)
...
  File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/python/pyspark/serializers.py", line 566, in loads
    return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'src'
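For context on what that last frame means: `pickle` serializes functions and classes by reference (module name plus attribute name), so anything defined in a local package called `src` can only be unpickled on a machine where `src` is importable. Here is a minimal, self-contained sketch of that failure mode (the module and function names are illustrative, not taken from my job):

```python
import pickle
import sys
import types

# Build a throwaway module named "src" and define a function in it,
# mimicking a helper package that exists only on the Spark driver.
src = types.ModuleType("src")
exec("def helper(x):\n    return x + 1", src.__dict__)
sys.modules["src"] = src

# pickle stores only a reference: "module 'src', attribute 'helper'".
payload = pickle.dumps(src.helper)

# Remove the module, mimicking an executor where "src" is not installed.
del sys.modules["src"]

try:
    pickle.loads(payload)
except ModuleNotFoundError as err:
    print(err)  # -> No module named 'src'
```

If that is what is happening on the executors, I would guess that shipping the package to them, e.g. with `spark.sparkContext.addPyFile("src.zip")` or `spark-submit --py-files src.zip`, should make the module importable there, but I have not been able to confirm this.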
Thanks for taking a look.
Cheers.