I want to connect Kafka + Cassandra to Spark 1.5.1.

The library versions:

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-streaming_2.10" % "1.5.1",
  "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.5.1",
  "com.datastax.spark" % "spark-cassandra-connector_2.10" % "1.5.0-M2"
)

Initialization and use of the app:

   val sparkConf = new SparkConf(true)
      .setMaster("local[2]")
      .setAppName("KafkaStreamToCassandraApp")
      .set("spark.executor.memory", "1g")
      .set("spark.cores.max", "1")
      .set("spark.cassandra.connection.host", "127.0.0.1")

The schema is created in Cassandra like this:

  CassandraConnector(sparkConf).withSessionDo { session =>
      session.execute(s"DROP KEYSPACE IF EXISTS kafka_streaming")
      session.execute(s"CREATE KEYSPACE IF NOT EXISTS kafka_streaming WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 }")
      session.execute(s"CREATE TABLE IF NOT EXISTS kafka_streaming.wordcount (word TEXT PRIMARY KEY, count COUNTER)")
      session.execute(s"TRUNCATE kafka_streaming.wordcount")
    }

Also, when building the jar, I define some merge strategies:

assemblyMergeStrategy in assembly := {
  case PathList("com", "esotericsoftware", xs@_*) => MergeStrategy.last
  case PathList("com", "google", xs@_*) => MergeStrategy.first
  case PathList("org", "apache", xs@_*) => MergeStrategy.last
  case PathList("io", "netty", xs@_*) => MergeStrategy.last
  case PathList("com", "codahale", xs@_*) => MergeStrategy.last
  case PathList("META-INF", "io.netty.versions.properties") => MergeStrategy.first
  // Everything else falls back to the plugin's default strategy
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

I think the problem is related to

  case PathList("com", "google", xs@_*) => MergeStrategy.first

in combination with the use of MergeStrategy.last for the other paths.

Any ideas?

This is the exception I get:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.reflect.TypeToken.isPrimitive()Z
        at com.datastax.driver.core.TypeCodec.<init>(TypeCodec.java:142)
        at com.datastax.driver.core.TypeCodec.<init>(TypeCodec.java:136)
        at com.datastax.driver.core.TypeCodec$BlobCodec.<init>(TypeCodec.java:609)
        at com.datastax.driver.core.TypeCodec$BlobCodec.<clinit>(TypeCodec.java:606)
        at com.datastax.driver.core.CodecRegistry.<clinit>(CodecRegistry.java:147)
        at com.datastax.driver.core.Configuration$Builder.build(Configuration.java:259)
        at com.datastax.driver.core.Cluster$Builder.getConfiguration(Cluster.java:1135)
        at com.datastax.driver.core.Cluster.<init>(Cluster.java:111)
        at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:178)
        at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:1152)
        at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createCluster(CassandraConnectionFactory.scala:85)
        at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:155)
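
A NoSuchMethodError like this generally means the class found at runtime comes from a different library version than the one the caller was compiled against. To see which jar a class such as TypeToken is actually loaded from, something along these lines might help. This is only a sketch: `WhichJar` and `locationOf` are hypothetical names introduced here, and it needs to run on the same classpath as the app to be meaningful.

```scala
// Hypothetical helper (not part of the app above): reports where a class was loaded from.
object WhichJar {
  // Returns the jar or directory a class was loaded from.
  // None if the class is not on the classpath, or if it has no code source
  // (as is typical for JDK classes from the bootstrap loader).
  def locationOf(className: String): Option[String] =
    try {
      // initialize = false: only locate the class, do not run its static initializers
      val cls = Class.forName(className, false, getClass.getClassLoader)
      for {
        domain <- Option(cls.getProtectionDomain)
        source <- Option(domain.getCodeSource)
        url    <- Option(source.getLocation)
      } yield url.toString
    } catch {
      case _: ClassNotFoundException => None
    }

  def main(args: Array[String]): Unit = {
    val name = "com.google.common.reflect.TypeToken"
    val where = locationOf(name).getOrElse("<not found or no code source>")
    println(name + " loaded from: " + where)
  }
}
```

If the printed location is not the Guava jar the Cassandra driver expects, the merge strategy (or the runtime classpath) is the likely place to look.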

1 Answer

Based on the errors

 [error] /home/user/.ivy2/cache/org.apache.spark/spark-network-common_2.10/jars/spark-network-common_2.10-1.5.0.jar:com/google/common/base/Optional.class
 [error] /home/user/.ivy2/cache/com.google.guava/guava/bundles/guava-16.0.1.jar:com/google/common/base/Optional.class

it looks like the last one is the newer version, so maybe you can add:

case PathList("com", "google", "common", "base", xs@_*) => MergeStrategy.last
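
One thing to watch: the cases in the merge strategy are tried top to bottom and the first match wins, so this more specific case has to appear before the general `PathList("com", "google", xs@_*)` case or it will never be reached. A rough sketch of how the block from the question might then look, assuming the sbt-assembly plugin with the `assemblyMergeStrategy` key; the final fallback case is my assumption and not taken from the question:

```scala
// build.sbt fragment (sketch); assumes the sbt-assembly plugin is enabled
assemblyMergeStrategy in assembly := {
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  // More specific Guava path first, so it is matched before the general com.google case
  case PathList("com", "google", "common", "base", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("io", "netty", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("META-INF", "io.netty.versions.properties") => MergeStrategy.first
  // Assumed fallback: defer to the plugin's default strategy for everything else
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```
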
Answered 2015-11-07T20:44:32.033