If I start pyspark and then run this command:
import my_script; spark = my_script.Sparker(sc); spark.collapse('./data/')
Everything works fine. However, if I try to do the same thing from the command line with spark-submit, I get an error:
Command: /usr/local/spark/bin/spark-submit my_script.py collapse ./data/
  File "/usr/local/spark/python/pyspark/rdd.py", line 352, in func
    return f(iterator)
  File "/usr/local/spark/python/pyspark/rdd.py", line 1576, in combineLocally
    merger.mergeValues(iterator)
  File "/usr/local/spark/python/pyspark/shuffle.py", line 245, in mergeValues
    for k, v in iterator:
  File "/.../my_script.py", line 173, in _json_args_to_arr
    js = cls._json(line)
RuntimeError: uninitialized staticmethod object
My script:
...
if __name__ == "__main__":
    args = sys.argv[1:]
    if args[0] == 'collapse':
        directory = args[1]
        from pyspark import SparkContext
        sc = SparkContext(appName="Collapse")
        spark = Sparker(sc)
        spark.collapse(directory)
        sc.stop()
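Why is this happening? What is the difference between running pyspark and running spark-submit that would cause this divergence? And how can I make this work with spark-submit?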
EDIT: I tried running it from the bash shell with
pyspark my_script.py collapse ./data/
and I got the same error. The only time everything works is when I am in a Python shell and import the script.
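One difference between the two launch modes that I can observe, and that may be relevant here (I have not confirmed it is the cause): when the file is imported, classes defined in it report the module name as their `__module__`, but when the same file is executed as a script they report `__main__`. Picklers normally refer to a class by its module and name, so this could change how `Sparker` and its static/class methods are shipped to the workers. The sketch below reproduces only that difference, without Spark; the file name `my_script_demo.py` and the empty `Sparker` class are stand-ins made up for illustration.

```python
import importlib
import os
import runpy
import sys
import tempfile

# Hypothetical stand-in for my_script.py: just an empty class.
SOURCE = "class Sparker(object):\n    pass\n"


def module_names():
    """Return Sparker.__module__ when the file is run as a script vs. imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my_script_demo.py")
        with open(path, "w") as fh:
            fh.write(SOURCE)

        # Roughly what launching the file as the main script does:
        # the code runs with __name__ set to "__main__".
        as_script = runpy.run_path(path, run_name="__main__")["Sparker"].__module__

        # What `import my_script` in an interactive shell does.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            as_import = importlib.import_module("my_script_demo").Sparker.__module__
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("my_script_demo", None)

    return as_script, as_import


if __name__ == "__main__":
    print(module_names())
```

If this is what matters, I would expect that moving `Sparker` into its own module and importing it from a small driver script would behave like the interactive case, but that is a guess I have not tested.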