I have Hadoop 2.7.0 and Spark 2.0.0 on Ubuntu 14.04, with one master node and two slave nodes. All the daemons start up fine. When I launch spark-shell without YARN, the following runs correctly:
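(For context, I start the shell in standalone mode roughly like this; the master host name below is a placeholder for my actual master node:)

$SPARK_HOME/bin/spark-shell --master spark://HadoopMaster:7077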
scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24
scala> inputRDD.collect
res0: Array[String] = Array(The Project Gutenberg EBook of War and Peace, by Leo Tolstoy, "", This eBook is for the use of anyone anywhere at no cost and with almost, no restrictions whatsoever. You may copy it, give it away or re-use it, under the terms of the Project Gutenberg License included with this, eBook or online at www.gutenberg.org, "", "", Title: War and Peace, "", Author: Leo Tolstoy, "", Translators: Louise and Aylmer Maude, "", Posting Date: January 10, 2009 [EBook #2600], "", Last Updated: March 15, 2013, "", Language: English, "", Character set encoding: ASCII, "", *** START OF THIS PROJECT GUTENBERG EBOOK WAR AND PEACE ***, "", An Anonymous Volunteer, and David Widger, "", "", "", "", "", WAR AND PEACE, "", By Leo Tolstoy/Tolstoi, "", CONTENTS, "", BOOK ONE: 1805, "",...
scala>
But when I launch spark-shell with YARN, it throws the following error:
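(The YARN session is launched in the default client deploy mode:)

$SPARK_HOME/bin/spark-shell --master yarn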
scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24
scala> inputRDD.collect
[Stage 0:> (0 + 2) / 2]17/04/03 21:31:04 ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to HadoopSlave2/192.168.78.136:44749
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:554)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: HadoopSlave2/192.168.78.136:44749
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
Am I missing something in the configuration?
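For example, is it something like pinning the driver and block-manager ports in conf/spark-defaults.conf so the driver and executors can reach each other through any firewall? The host name and port values below are only an illustration of what I mean, not settings I know to be correct:

# conf/spark-defaults.conf (illustrative values)
# Placeholder for the machine running the driver/spark-shell:
spark.driver.host HadoopMaster
# Fix the driver and block-manager ports instead of letting Spark pick
# random ephemeral ones, so they can be opened in the firewall:
spark.driver.port 51000
spark.blockManager.port 51100
spark.port.maxRetries 32

Or could it be host name resolution or a firewall issue, given that the connection to HadoopSlave2/192.168.78.136:44749 is refused outright?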