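I am trying to run the KafkaWordCount example in Spark Streaming, using Spark 2.1.1 in standalone cluster mode. The Kafka version on the server I am trying to integrate with is 2.11-0.10.0.1. According to https://spark.apache.org/docs/latest/streaming-kafka-integration.html there are two separate packages, one for 0.8.2.1 or higher and another for 0.10.0 or higher.
I added the following jars to the jars folder in the Spark home: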
kafka_2.11-0.10.0.1.jar
spark-streaming-kafka-0-10-assembly_2.11-2.1.1.jar
spark-streaming-kafka-0-10_2.11-2.1.1.jar
Then I ran this command:
/usr/local/spark/bin/spark-submit --num-executors 1 --executor-memory 20G --total-executor-cores 4 --class org.apache.spark.examples.streaming.KafkaWordCount /usr/local/spark/examples/jars/spark-examples_2.11-2.1.1.jar 10.0.16.96:2181 group_test topic 6
It fails with: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$
Are there any other jars I am missing?
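To narrow this down, one could check whether any jar on the classpath actually contains the class named in the error. The following is only a minimal sketch: the helper name `find_class_in_jars` is hypothetical, and the directory `/usr/local/spark/jars` is an assumption based on the spark-submit path above.

```python
import glob
import zipfile


def find_class_in_jars(jar_paths, class_name):
    """Return the jar paths that contain the given class.

    class_name is dot-separated, e.g.
    'org.apache.spark.streaming.kafka.KafkaUtils$'.
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for path in jar_paths:
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    hits.append(path)
        except (zipfile.BadZipFile, OSError):
            # Skip files that are not readable jar/zip archives.
            continue
    return hits


if __name__ == "__main__":
    # Assumed location of the jars folder; adjust to the actual Spark home.
    jars = sorted(glob.glob("/usr/local/spark/jars/*.jar"))
    missing = "org.apache.spark.streaming.kafka.KafkaUtils$"
    for jar_path in find_class_in_jars(jars, missing):
        print(jar_path)
```

If nothing is printed, none of the scanned jars provides that class. It may also be worth running the same check with `org.apache.spark.streaming.kafka010.KafkaUtils$`, since the 0-10 integration artifacts place their classes under the `kafka010` package rather than `kafka`.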
Logs:
/usr/local/spark/bin/spark-submit --num-executors 1 --executor-memory 20G --total-executor-cores 4 --class org.apache.spark.examples.streaming.KafkaWordCount /usr/local/spark/examples/jars/spark-examples_2.11-2.1.1.jar 10.0.16.96:2181 group_test streams 6
Warning: Ignoring non-spark config property: fs.s3.awsAccessKeyId=AKIAIETFDAABYC23XVSQ
Warning: Ignoring non-spark config property: fs.s3.awsSecretAccessKey=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
17/07/11 08:04:31 INFO spark.SparkContext: Running Spark version 2.1.1
17/07/11 08:04:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/11 08:04:31 INFO spark.SecurityManager: Changing view acls to: mahendra
17/07/11 08:04:31 INFO spark.SecurityManager: Changing modify acls to: mahendra
17/07/11 08:04:31 INFO spark.SecurityManager: Changing view acls groups to:
17/07/11 08:04:31 INFO spark.SecurityManager: Changing modify acls groups to:
17/07/11 08:04:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mahendra); groups with view permissions: Set(); users with modify permissions: Set(mahendra); groups with modify permissions: Set()
17/07/11 08:04:32 INFO util.Utils: Successfully started service 'sparkDriver' on port 38173.
17/07/11 08:04:32 INFO spark.SparkEnv: Registering MapOutputTracker
17/07/11 08:04:32 INFO spark.SparkEnv: Registering BlockManagerMaster
17/07/11 08:04:32 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/07/11 08:04:32 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/07/11 08:04:32 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-241eda29-1cb3-4364-859c-79ba86689fbf
17/07/11 08:04:32 INFO memory.MemoryStore: MemoryStore started with capacity 5.2 GB
17/07/11 08:04:32 INFO spark.SparkEnv: Registering OutputCommitCoordinator
17/07/11 08:04:32 INFO util.log: Logging initialized @1581ms
17/07/11 08:04:32 INFO server.Server: jetty-9.2.z-SNAPSHOT
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@a7e2d9d{/jobs,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@754777cd{/jobs/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2b52c0d6{/jobs/job,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@372ea2bc{/jobs/job/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4cc76301{/stages,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f08c4b{/stages/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f19b8b3{/stages/stage,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7de0c6ae{/stages/stage/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@a486d78{/stages/pool,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@cdc3aae{/stages/pool/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ef2d7a6{/storage,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5dcbb60{/storage/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4c36250e{/storage/rdd,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@21526f6c{/storage/rdd/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@49f5c307{/environment,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@299266e2{/environment/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5471388b{/executors,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66ea1466{/executors/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1601e47{/executors/threadDump,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3bffddff{/executors/threadDump/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66971f6b{/static,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@50687efb{/,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@517bd097{/api,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@142eef62{/jobs/job/kill,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a9cc6cb{/stages/stage/kill,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO server.ServerConnector: Started Spark@6de54b40{HTTP/1.1}{0.0.0.0:4040}
17/07/11 08:04:32 INFO server.Server: Started @1696ms
17/07/11 08:04:32 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
17/07/11 08:04:32 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.16.15:4040
17/07/11 08:04:32 INFO spark.SparkContext: Added JAR file:/usr/local/spark/examples/jars/spark-examples_2.11-2.1.1.jar at spark://10.0.16.15:38173/jars/spark-examples_2.11-2.1.1.jar with timestamp 1499760272476
17/07/11 08:04:32 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-10-0-16-15.ap-southeast-1.compute.internal:7077...
17/07/11 08:04:32 INFO client.TransportClientFactory: Successfully created connection to ip-10-0-16-15.ap-southeast-1.compute.internal/10.0.16.15:7077 after 27 ms (0 ms spent in bootstraps)
17/07/11 08:04:32 INFO cluster.StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20170711080432-0038
17/07/11 08:04:32 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20170711080432-0038/0 on worker-20170707101056-10.0.16.51-40051 (10.0.16.51:40051) with 4 cores
17/07/11 08:04:32 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20170711080432-0038/0 on hostPort 10.0.16.51:40051 with 4 cores, 20.0 GB RAM
17/07/11 08:04:32 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35723.
17/07/11 08:04:32 INFO netty.NettyBlockTransferService: Server created on 10.0.16.15:35723
17/07/11 08:04:32 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/07/11 08:04:32 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.16.15, 35723, None)
17/07/11 08:04:32 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20170711080432-0038/0 is now RUNNING
17/07/11 08:04:32 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.0.16.15:35723 with 5.2 GB RAM, BlockManagerId(driver, 10.0.16.15, 35723, None)
17/07/11 08:04:32 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.16.15, 35723, None)
17/07/11 08:04:32 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.16.15, 35723, None)
17/07/11 08:04:32 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@34448e6c{/metrics/json,null,AVAILABLE,@Spark}
17/07/11 08:04:32 INFO cluster.StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/07/11 08:04:33 WARN fs.FileSystem: Cannot load filesystem
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2631)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2650)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:357)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:238)
at org.apache.spark.examples.streaming.KafkaWordCount$.main(KafkaWordCount.scala:54)
at org.apache.spark.examples.streaming.KafkaWordCount.main(KafkaWordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageStatistics
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.newInstance(Class.java:412)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
... 24 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StorageStatistics
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 29 more
17/07/11 08:04:33 WARN spark.SparkContext: Spark is not running in local mode, therefore the checkpoint directory must not be on the local filesystem. Directory 'file:/home/mahendra/checkpoint' appears to be on the local filesystem.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$
at org.apache.spark.examples.streaming.KafkaWordCount$.main(KafkaWordCount.scala:57)
at org.apache.spark.examples.streaming.KafkaWordCount.main(KafkaWordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils$
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more
17/07/11 08:04:33 INFO spark.SparkContext: Invoking stop() from shutdown hook
17/07/11 08:04:33 INFO server.ServerConnector: Stopped Spark@6de54b40{HTTP/1.1}{0.0.0.0:4040}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4a9cc6cb{/stages/stage/kill,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@142eef62{/jobs/job/kill,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@517bd097{/api,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@50687efb{/,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@66971f6b{/static,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3bffddff{/executors/threadDump/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1601e47{/executors/threadDump,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@66ea1466{/executors/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5471388b{/executors,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@299266e2{/environment/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@49f5c307{/environment,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@21526f6c{/storage/rdd/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4c36250e{/storage/rdd,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5dcbb60{/storage/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ef2d7a6{/storage,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@cdc3aae{/stages/pool/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@a486d78{/stages/pool,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7de0c6ae{/stages/stage/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3f19b8b3{/stages/stage,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2f08c4b{/stages/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4cc76301{/stages,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@372ea2bc{/jobs/job/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2b52c0d6{/jobs/job,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@754777cd{/jobs/json,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@a7e2d9d{/jobs,null,UNAVAILABLE,@Spark}
17/07/11 08:04:33 INFO ui.SparkUI: Stopped Spark web UI at http://10.0.16.15:4040
17/07/11 08:04:33 INFO cluster.StandaloneSchedulerBackend: Shutting down all executors
17/07/11 08:04:33 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
17/07/11 08:04:33 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/07/11 08:04:33 INFO memory.MemoryStore: MemoryStore cleared
17/07/11 08:04:33 INFO storage.BlockManager: BlockManager stopped
17/07/11 08:04:33 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
17/07/11 08:04:33 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/07/11 08:04:33 INFO spark.SparkContext: Successfully stopped SparkContext
17/07/11 08:04:33 INFO util.ShutdownHookManager: Shutdown hook called
17/07/11 08:04:33 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-a7875c5c-cdfc-486e-bf7d-7fe0a7cff228
Thanks!