I tried to build a Heron cluster with Apache Mesos, Apache Aurora, ZooKeeper, and HDFS. However, when I submit WordCountTopology after setting everything up, the command output stops at "Creating job WordCountTopology":

yitian@ubuntu:~/.heron/conf/aurora$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-13 06:58:30 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-13 06:58:30 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized.  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:34 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: The destination directory does not exist. Creating it now at URI '/home/yitian/heron/topologies/aurora'  
[2018-02-13 06:58:37 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmpvYzRv7/topology.tar.gz' to target HDFS at '/home/yitian/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8268125700662472072.tar.gz'  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/executionstate/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
INFO] Creating job WordCountTopology

The Heron Tracker shows:

status  "success"
executiontime   0.00007081031799316406
message ""
version "0.17.1"
result  {}

The Heron UI shows nothing (screenshot omitted).

The Aurora Scheduler is running (screenshot omitted).

In addition, the cluster has two hosts:

  1. The master, named heron01, runs the Mesos Master, ZooKeeper, and the Aurora Scheduler.
  2. The slave, named heron02, runs the Mesos slave, the Aurora Observer, and the Executor.

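I can open the Observer (heron02:1338) and the Executor (heron02:5051) in a browser. I do not know where I made a mistake. The cluster configuration is too complex to show in full here, so please see my website for the details. I am sorry that my website is in Chinese, but I believe you can still follow the contents of the configuration files there. The blog is here. Thank you very much for your help.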

1 Answer

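This problem was caused by insufficient cluster resources. When the Aurora Scheduler schedules instances onto the worker nodes of the Heron cluster, an instance stays in the pending state if no worker node has enough resources to allocate to it, and it waits there until a worker node with sufficient resources becomes available. The problem was solved by increasing the RAM of the worker nodes in the Heron cluster.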
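
To illustrate why the job never got past "Creating job", here is a minimal sketch of the fit check described above. It is a simplified model, not Aurora's or Mesos's actual scheduling code, and every number in it (the CPU, RAM, and disk figures for the request and for heron02) is a hypothetical value chosen for illustration, not taken from my cluster.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resources:
    """Resources requested by one instance, or free on one worker node."""
    cpu: float
    ram_mb: int
    disk_mb: int

    def fits_in(self, free: "Resources") -> bool:
        # An instance fits only if every dimension is satisfied at once.
        return (self.cpu <= free.cpu
                and self.ram_mb <= free.ram_mb
                and self.disk_mb <= free.disk_mb)


def place(request: Resources, workers: Dict[str, Resources]) -> Optional[str]:
    """Return the first worker that can host the request, or None (pending)."""
    for name, free in workers.items():
        if request.fits_in(free):
            return name
    return None


# Hypothetical request for one container of the topology.
request = Resources(cpu=2.0, ram_mb=3072, disk_mb=12288)

# Hypothetical free resources on the single worker, before and after adding RAM.
before = {"heron02": Resources(cpu=4.0, ram_mb=2048, disk_mb=20480)}
after = {"heron02": Resources(cpu=4.0, ram_mb=4096, disk_mb=20480)}

print(place(request, before))  # no worker fits, so the instance stays pending
print(place(request, after))   # heron02 now has enough RAM
```

In this model the instance remains pending as long as a single dimension (here RAM) falls short, even though CPU and disk are sufficient, which matches what I saw until I added RAM to the worker.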

answered 2018-09-10T08:47:37.543