I have Hadoop 2.7.0 and Spark 2.0.0 on Ubuntu 14.04, with one master node and two slave nodes. All the daemons start fine. When I launch spark-shell without Yarn, the following runs correctly:

scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24

scala> inputRDD.collect
res0: Array[String] = Array(The Project Gutenberg EBook of War and Peace, by Leo Tolstoy, "", This eBook is for the use of anyone anywhere at no cost and with almost, no restrictions whatsoever.  You may copy it, give it away or re-use it, under the terms of the Project Gutenberg License included with this, eBook or online at www.gutenberg.org, "", "", Title: War and Peace, "", Author: Leo Tolstoy, "", Translators: Louise and Aylmer Maude, "", Posting Date: January 10, 2009 [EBook #2600], "", Last Updated: March 15, 2013, "", Language: English, "", Character set encoding: ASCII, "", *** START OF THIS PROJECT GUTENBERG EBOOK WAR AND PEACE ***, "", An Anonymous Volunteer, and David Widger, "", "", "", "", "", WAR AND PEACE, "", By Leo Tolstoy/Tolstoi, "", CONTENTS, "", BOOK ONE: 1805, "",...
scala> 

But when I launch spark-shell with Yarn, it throws the following error:

scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24

scala> inputRDD.collect
[Stage 0:>                                                          (0 + 2) / 2]17/04/03 21:31:04 ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks 
java.io.IOException: Failed to connect to HadoopSlave2/192.168.78.136:44749
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
    at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
    at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
    at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:554)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: HadoopSlave2/192.168.78.136:44749
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more

Am I missing anything in the configuration?

1 Answer

Spark uses random ports for internal communication between the driver and the executors, and your firewall may be blocking them. Try opening those ports between the cluster nodes.
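
For example, on Ubuntu you could allow traffic between the nodes with ufw. A minimal sketch, assuming your nodes sit on the 192.168.78.0/24 subnet shown in the error (adjust the subnet and port range to your setup, or use your own firewall tool):

# run on every node: allow TCP traffic from the other cluster nodes
sudo ufw allow proto tcp from 192.168.78.0/24 to any port 1024:65535
sudo ufw reload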

If your firewall rules are strict even inside the cluster, you can use fixed ports instead:

import org.apache.spark.SparkConf

val conf = new SparkConf()
    .setMaster(master)  // placeholder: your master URL, e.g. "yarn"
    .setAppName("namexxx")
    .set("spark.driver.port", "51810")
    .set("spark.blockManager.port", "51814")
    // the four properties below existed in Spark 1.x but were removed in
    // Spark 2.0, so they have no effect on a Spark 2.0.0 cluster:
    .set("spark.fileserver.port", "51811")
    .set("spark.broadcast.port", "51812")
    .set("spark.replClassServer.port", "51813")
    .set("spark.executor.port", "51815")

For reference, see https://stackoverflow.com/a/30036642/3080158

answered 2017-04-03 at 12:24