
I am trying to merge 5 dataframes in my code (a simple merge, no joins). The output dataframe contains around 95k records. The cluster gets stuck: it neither runs nor fails. It is a 4.4xlarge cluster with 40 nodes. Spark config: --num-executors 120 --executor-cores 5 --executor-memory 38g --driver-memory 35g --shuffle.partitions=1400
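For context, here is a minimal sketch of what the merge described above might look like, assuming Spark with Scala; the DataFrame names and input/output paths (df1 through df5, /data/...) are placeholders and not taken from the actual job:

import org.apache.spark.sql.{DataFrame, SparkSession}

object MergeFiveFrames {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("merge-five-dataframes")
      .getOrCreate()

    // Placeholder inputs; the real sources and format are not given in the post.
    val dfs: Seq[DataFrame] =
      Seq("df1", "df2", "df3", "df4", "df5")
        .map(name => spark.read.parquet(s"/data/$name"))

    // Simple merge (union) of all five frames, no join involved.
    val merged = dfs.reduce(_ union _)

    merged.write.mode("overwrite").parquet("/data/merged")
  }
}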

When I run spark-submit manually, it throws the error AsyncEventQueue: Dropping event from queue appStatus. This likely means one of the listeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
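That dropped-event warning relates to the Spark listener-bus queue. As a minimal sketch, assuming Spark 2.3 or later, the queue size can be raised through spark.scheduler.listenerbus.eventqueue.capacity (the value 30000 below is only illustrative, not from the post):

val spark = org.apache.spark.sql.SparkSession.builder()
  .appName("merge-five-dataframes")
  // Larger listener-bus event queue; the default capacity is 10000.
  .config("spark.scheduler.listenerbus.eventqueue.capacity", "30000")
  .getOrCreate()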

I am also getting a lot of IndexOutOfBounds exceptions. The important point to note is that there is no fixed dataframe at which this error/exception starts occurring. Can someone help with the possible causes?

The exception is pasted below:

javax.servlet.ServletException: java.lang.IndexOutOfBoundsException: 4
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
        at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
        at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
        at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166)
        at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
        at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
        at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.spark_project.jetty.server.Server.handle(Server.java:539)
        at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333)
        at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
        at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
        at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)
