
I am using the beta release of the MongoDB Kafka connector to publish from MongoDB to Kafka topics.

Messages are produced in Kafka, but their keys are null when they should be the document ID.

Here is my connect-standalone configuration:

bootstrap.servers=xxx:9092

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter you want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false

# The internal converter used for offsets and config data is configurable and must be specified, but most users will
# always want to use the built-in default. Offset and config data is never visible outside of Kafka Connect in this format.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

MongoDB source properties:

name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
tasks.max=1

# Connection and source configuration
connection.uri=mongodb+srv://xxx
database=mydb
collection=mycollection

topic.prefix=someprefix
poll.max.batch.size=1000
poll.await.time.ms=5000

# Change stream options
pipeline=[]
batch.size=0
change.stream.full.document=updateLookup
collation=

Below is an example of a message's string value:

"{\"_id\": {\"_data\": \"xxx\"}, \"operationType\": \"replace\", \"clusterTime\": {\"$timestamp\": {\"t\": 1564140389, \"i\": 1}}, \"fullDocument\": {\"_id\": \"5\", \"name\": \"Some Client\", \"clientId\": \"someclient\", \"clientSecret\": \"1234\", \"whiteListedIps\": [], \"enabled\": true, \"_class\": \"myproject.Client\"}, \"ns\": {\"db\": \"mydb\", \"coll\": \"mycollection\"}, \"documentKey\": {\"_id\": \"5\"}}"

I tried using a transform to extract the ID from the value, specifically from the documentKey field:

transforms=InsertKey
transforms.InsertKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.InsertKey.fields=documentKey

But it throws an exception:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
    at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
    at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
    at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)

Any ideas on how to produce the key with the document ID?


2 Answers


According to the exception that is thrown:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
    at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
    at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
    at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)

Unfortunately, the MongoDB connector you are using does not create a proper schema.

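The connector above creates each record with both the key schema and the value schema set to String. Check the line in the connector source where the record is created. ValueToKey requires a Struct value, so it fails on a plain String, and that is why you cannot apply the transform to these records.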
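Since the value reaches Kafka as a plain string, one possible workaround is to parse it downstream and read the ID there. The following is only a minimal sketch, not part of the connector: the function name `extract_document_id` is hypothetical, and it assumes the value looks like the sample in the question, that is, a JSON string that itself contains the change event as JSON.

```python
import json


def extract_document_id(raw_value: str) -> str:
    """Return documentKey._id from a change-event value serialized as a string.

    Hypothetical helper for illustration; assumes the layout of the sample
    message in the question.
    """
    event = json.loads(raw_value)
    # With a String value schema, JsonConverter writes the event as a JSON
    # string literal, so a second decode is needed to reach the event object.
    if isinstance(event, str):
        event = json.loads(event)
    return event["documentKey"]["_id"]


# Abbreviated stand-in for the sample message from the question,
# encoded twice to mimic the string-wrapped payload.
sample = json.dumps(json.dumps({
    "operationType": "replace",
    "fullDocument": {"_id": "5", "name": "Some Client"},
    "ns": {"db": "mydb", "coll": "mycollection"},
    "documentKey": {"_id": "5"},
}))

print(extract_document_id(sample))
```

This does not change the key of the record in Kafka; it only shows where the ID sits inside the string value.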

answered 2019-08-02T10:20:00.587

This should be supported as of version 1.3.0: https://jira.mongodb.org/browse/KAFKA-40

answered 2020-07-13T10:24:21.440