我正在使用 Kinesis 设置 Spark Streaming 项目,当我尝试连接到我的 Kinesis 流时,我从 Spark 收到以下错误:
ERROR ShardSyncTask: Caught exception while sync'ing Kinesis shards and leases
com.amazonaws.services.kinesis.clientlibrary.exceptions.internal.KinesisClientLibIOException: Parent shard shardId-000000000000 exists but not the child shard shardId-000000000002
当我将测试数据发布到此流或使用基本 Amazon 库从流中读取数据时,我没有收到任何错误,这仅在我尝试连接 Spark 时发生。
以下是我用于测试的代码:
val conf = new SparkConf().setMaster("local[2]").setAppName("KinesisCounter")
val ssc = new StreamingContext(conf, Seconds(1))
val rawStream = KinesisUtils.createStream(ssc, "dev-test", "kinesis.us-east-1.amazonaws.com", Duration(1000), InitialPositionInStream.TRIM_HORIZON, StorageLevel.MEMORY_ONLY)
rawStream.map(msg => new String(msg)).count.print