
I am learning Structured Streaming, and when I write out the streaming data I get this error:

    CountQuery: org.apache.spark.sql.streaming.StreamingQuery = org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@604770e3
    org.apache.spark.sql.streaming.StreamingQueryException: Query Count [id = 4ce8572a-24c9-4cde-97e4-051426cbb15e, runId = 59c60d53-73ee-43a4-8792-d5907a888de5] terminated with exception: Job aborted due to stage failure: Task 0 in stage 183.0 failed 4 times, most recent failure: Lost task 0.3 in stage 183.0 (TID 5072, 172.31.21.3, executor 1): org.apache.spark.util.TaskCompletionListenerException: Mkdirs failed to create /tmp/temporary-cf1c3598-8273-4cec-a54a-d6eca9d7d08f/state/0/0 (exists=false, cwd=file:path/app-20170712063557-0003/1)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
    at org.apache.spark.scheduler.Task.run(Task.scala:112)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

I have granted full permissions on that directory.


1 Answer


I think you are hitting this issue: https://issues.apache.org/jira/browse/SPARK-19909

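When you do not specify a checkpoint location, Spark creates a temporary directory. That directory will be on the default file system (in your case, it is probably HDFS).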

So there are two options:

  1. Grant full permissions on the directory on the correct file system.
  2. Always set checkpointLocation.
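
For the second option, here is a minimal sketch of how the checkpoint location might be set on the query. The socket source, the word-count aggregation, the console sink, and the path `hdfs:///user/spark/checkpoints/count` are all assumptions for illustration, since the original query was not posted; only the query name `Count` comes from the error message. The path should point at a directory that the driver and every executor can write to.

```scala
import org.apache.spark.sql.SparkSession

object CountQueryWithCheckpoint {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("CountQueryWithCheckpoint")
      .getOrCreate()
    import spark.implicits._

    // Assumed source: lines of text read from a local socket.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Assumed aggregation: a running word count, which keeps state
    // and therefore needs a writable checkpoint/state directory.
    val counts = lines.as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    val countQuery = counts.writeStream
      .queryName("Count")
      .outputMode("complete")
      .format("console")
      // Hypothetical path: replace with a location writable by all executors.
      .option("checkpointLocation", "hdfs:///user/spark/checkpoints/count")
      .start()

    countQuery.awaitTermination()
  }
}
```

With an explicit `checkpointLocation`, Spark should no longer fall back to a `/tmp/temporary-...` directory for the query's state. A session-wide default can also be supplied through the `spark.sql.streaming.checkpointLocation` configuration, if setting it per query is inconvenient.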
answered 2017-07-13T17:58:33.627