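We are using a Spark standalone cluster with 8 cores and 32 GB of RAM, on a 3-node cluster where every node has the same configuration.
Sometimes a streaming batch completes in under 1 second. Sometimes it takes more than 10 seconds, and when that happens the log below appears in the console.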
2016-03-29 11:35:25,044 INFO TaskSchedulerImpl:59 - Removed TaskSet 18.0, whose tasks have all completed, from pool
2016-03-29 11:35:25,044 INFO DAGScheduler:59 - Job 18 finished: foreachRDD at EventProcessor.java:87, took 1.128755 s
2016-03-29 11:35:31,471 INFO JobScheduler:59 - Added jobs for time 1459231530000 ms
2016-03-29 11:35:35,004 INFO JobScheduler:59 - Added jobs for time 1459231535000 ms
2016-03-29 11:35:40,004 INFO JobScheduler:59 - Added jobs for time 1459231540000 ms
2016-03-29 11:35:45,136 INFO JobScheduler:59 - Added jobs for time 1459231545000 ms
2016-03-29 11:35:50,011 INFO JobScheduler:59 - Added jobs for time 1459231550000 ms
2016-03-29 11:35:55,004 INFO JobScheduler:59 - Added jobs for time 1459231555000 ms
2016-03-29 11:36:00,014 INFO JobScheduler:59 - Added jobs for time 1459231560000 ms
2016-03-29 11:36:05,003 INFO JobScheduler:59 - Added jobs for time 1459231565000 ms
2016-03-29 11:36:10,087 INFO JobScheduler:59 - Added jobs for time 1459231570000 ms
2016-03-29 11:36:15,004 INFO JobScheduler:59 - Added jobs for time 1459231575000 ms
2016-03-29 11:36:20,004 INFO JobScheduler:59 - Added jobs for time 1459231580000 ms
2016-03-29 11:36:25,139 INFO JobScheduler:59 - Added jobs for time 1459231585000 ms
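Reading the log above, the batch times are 5000 ms apart, which suggests a 5-second batch interval, and twelve batches are added between 11:35:31 and 11:36:25 with no "Job ... finished" line in between. A minimal sketch for checking this on a longer log follows. It uses only the Java standard library; the class name `BatchLogCheck` and its method names are hypothetical and not part of Spark, and it assumes the log lines keep the exact "Added jobs for time ... ms" format shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: summarizes "Added jobs" lines from a driver log.
public class BatchLogCheck {
    // Captures the batch time (epoch milliseconds) from a JobScheduler line.
    private static final Pattern ADDED =
        Pattern.compile("Added jobs for time (\\d+) ms");

    /** Extracts the batch times from every line that matches the pattern. */
    public static List<Long> batchTimes(List<String> lines) {
        List<Long> times = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ADDED.matcher(line);
            if (m.find()) {
                times.add(Long.parseLong(m.group(1)));
            }
        }
        return times;
    }

    /** Returns the gaps between consecutive batch times, in milliseconds. */
    public static List<Long> intervals(List<Long> times) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < times.size(); i++) {
            gaps.add(times.get(i) - times.get(i - 1));
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Four lines copied from the log above; a real run would read the whole file.
        List<String> sample = Arrays.asList(
            "2016-03-29 11:35:25,044 INFO DAGScheduler:59 - Job 18 finished: foreachRDD at EventProcessor.java:87, took 1.128755 s",
            "2016-03-29 11:35:31,471 INFO JobScheduler:59 - Added jobs for time 1459231530000 ms",
            "2016-03-29 11:35:35,004 INFO JobScheduler:59 - Added jobs for time 1459231535000 ms",
            "2016-03-29 11:35:40,004 INFO JobScheduler:59 - Added jobs for time 1459231540000 ms");

        List<Long> times = batchTimes(sample);
        System.out.println("batches added: " + times.size());
        System.out.println("intervals (ms): " + intervals(times));
    }
}
```

If the number of batches added keeps growing between two "Job ... finished" lines, that would be consistent with batches queuing up behind a slow or blocked job, though the log excerpt alone does not show the cause.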
Can you help me figure out how to solve this issue?