
I have installed Spark 1.4.1 on a Hadoop 2.7 cluster.

  1. I started the SparkR shell without errors:

    bin/sparkR --master yarn-client
    
  2. I ran the following R command without errors (the introductory example from spark.apache.org):

    df <- createDataFrame(sqlContext, faithful)
    
  3. When I run the command:

    head(select(df, df$eruptions))
    

At 15/09/02 10:08:29, I get the following error on the executor node:

"Rscript execution error: No such file or directory"

Any hints would be greatly appreciated. Spark jobs other than SparkR run fine on my YARN cluster, and R 3.2.1 is installed and works fine on the driver node. The full executor log follows; note that about ten seconds after the Rscript error, the task fails with a SocketTimeoutException in org.apache.spark.api.r.RRDD$.createRWorker, where the executor blocks in ServerSocket.accept waiting for an R worker process that apparently never started.
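
Since the error is raised on the executor rather than the driver, one plausible reading is that Rscript is missing, or not on the PATH seen by the YARN containers, on the worker nodes. A minimal sanity check, assuming password-less ssh to the workers (datanode1.hp.com is taken from the log below; the rest of the host list is illustrative and should be replaced with the actual node managers):

    # Verify that Rscript is installed and on the default PATH of each worker.
    # Replace the host list with your actual YARN node managers.
    for host in datanode1.hp.com; do
        echo "== $host =="
        ssh "$host" 'command -v Rscript && Rscript --version' \
            || echo "Rscript not found on $host"
    done

If Rscript resolves in an interactive login shell but not here, note that non-interactive shells (and, typically, YARN containers) do not source login profiles, so a PATH set only in ~/.bash_profile may not apply to the executor's environment.
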

15/09/02 10:04:06 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/09/02 10:04:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/02 10:04:10 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/09/02 10:04:10 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/09/02 10:04:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/09/02 10:04:11 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/02 10:04:12 INFO Remoting: Starting remoting
15/09/02 10:04:12 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@datanode1.hp.com:46167]
15/09/02 10:04:12 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 46167.
15/09/02 10:04:12 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/09/02 10:04:12 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/09/02 10:04:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/09/02 10:04:12 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/02 10:04:12 INFO Remoting: Starting remoting
15/09/02 10:04:13 INFO util.Utils: Successfully started service 'sparkExecutor' on port 47919.
15/09/02 10:04:13 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@datanode1.hp.com:47919]
15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data2/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-5e435e40-bd36-4746-9acd-8cf1619033ae
15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data3/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-28dfabe6-8e0d-4e49-bc95-27b3428c10a0
15/09/02 10:04:13 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@192.1.1.1:45596/user/CoarseGrainedScheduler
15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/09/02 10:04:13 INFO executor.Executor: Starting executor ID 2 on host datanode1.hp.com
15/09/02 10:04:14 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34166.
15/09/02 10:04:14 INFO netty.NettyBlockTransferService: Server created on 34166
15/09/02 10:04:14 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/09/02 10:04:14 INFO storage.BlockManagerMaster: Registered BlockManager
15/09/02 10:06:35 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/09/02 10:06:35 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(854) called with curMem=0, maxMem=560497950
15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 534.5 MB)
15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 159 ms
15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(1280) called with curMem=854, maxMem=560497950
15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 534.5 MB)
15/09/02 10:06:35 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 11589 bytes result sent to driver
15/09/02 10:08:28 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
15/09/02 10:08:28 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 1)
15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(4022) called with curMem=0, maxMem=560497950
15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.9 KB, free 534.5 MB)
15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 13 ms
15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(9536) called with curMem=4022, maxMem=560497950
15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 9.3 KB, free 534.5 MB)
15/09/02 10:08:29 INFO r.BufferedStreamThread: Rscript execution error: No such file or directory
15/09/02 10:08:39 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.net.SocketTimeoutException: Accept timed out
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
    at java.net.ServerSocket.implAccept(ServerSocket.java:530)
    at java.net.ServerSocket.accept(ServerSocket.java:498)
    at org.apache.spark.api.r.RRDD$.createRWorker(RRDD.scala:425)
    at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)