我正在评估 Confluent Kafka S2 Source Connector 并遇到以下堆栈跟踪的问题:
[2020-12-22 15:27:41,636] ERROR WorkerConnector{id=s3-source-connector} Error while starting connector (org.apache.kafka.connect.runtime.WorkerConnect
or)
org.apache.kafka.connect.errors.ConnectException: Failed to get list of folders from S3 bucket - kafka-connect for key path - topics/ and delimiter - /
at io.confluent.connect.s3.source.S3Storage.listFolders(S3Storage.java:286)
at io.confluent.connect.s3.source.S3Storage.getPartitions(S3Storage.java:98)
at io.confluent.connect.storage.partitioner.TimeBasedPartitioner.getPartitions(TimeBasedPartitioner.java:50)
at io.confluent.connect.cloud.storage.source.StorageSourceConnector.doStart(StorageSourceConnector.java:77)
at io.confluent.connect.cloud.storage.source.StorageSourceConnector.start(StorageSourceConnector.java:69)
at org.apache.kafka.connect.runtime.WorkerConnector.doStart(WorkerConnector.java:111)
at org.apache.kafka.connect.runtime.WorkerConnector.start(WorkerConnector.java:136)
at org.apache.kafka.connect.runtime.WorkerConnector.transitionTo(WorkerConnector.java:196)
at org.apache.kafka.connect.runtime.Worker.startConnector(Worker.java:242)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startConnector(DistributedHerder.java:908)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1300(DistributedHerder.java:110)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:924)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$15.call(DistributedHerder.java:920)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader
at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:123)
at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:127)
at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:117)
at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:69)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1714)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleSuccessResponse(AmazonHttpClient.java:1434)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1356)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5052)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4998)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4992)
at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:938)
at io.confluent.connect.s3.source.S3Storage.listFolders(S3Storage.java:283)
... 16 more
Caused by: org.xml.sax.SAXException: SAX2 driver class org.apache.xerces.parsers.SAXParser not found
java.lang.ClassNotFoundException: org.apache.xerces.parsers.SAXParser
at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:230)
at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:191)
at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:120)
... 37 more
Caused by: java.lang.ClassNotFoundException: org.apache.xerces.parsers.SAXParser
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.xml.sax.helpers.NewInstance.newInstance(NewInstance.java:82)
at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:228)
... 39 more
连接器配置:
{
"name": "source-connector",
"config": {
"connector.class":"io.confluent.connect.s3.source.S3SourceConnector",
"s3.bucket.name":"bucket-test",
"s3.region":"us-west-2",
"tasks.max":"1",
"topics":"migration-topic",
"topics.dir":"topics/events",
"format.class":"io.confluent.connect.s3.format.json.JsonFormat",
"behavior.on.error": "log",
"partitioner.class":"io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
"path.format":"'date'=YYYY-MM-dd/'hour'=HH",
"key.converter":"com.pandadoc.kafka.connect.msgpack.converter.MessagePackConverter",
"key.converter.schemas.enable":"false",
"value.converter":"com.pandadoc.kafka.connect.msgpack.converter.MessagePackConverter",
"value.converter.schemas.enable":"false",
"errors.tolerance": "all",
"errors.deadletterqueue.topic.name": "kafka-connect-dead-letter-queue",
"errors.deadletterqueue.context.headers.enable": true,
"confluent.license":"",
"confluent.topic.bootstrap.servers":"localhost:9092",
"confluent.topic.replication.factor":"3"
}
}
版本:
[2020-12-22 15:27:41,640] INFO Kafka version: 2.2.2-cp3 (org.apache.kafka.common.utils.AppInfoParser)
[2020-12-22 15:27:41,640] INFO Kafka commitId: 602b2e2e105b4d34 (org.apache.kafka.common.utils.AppInfoParser)
它可能是一个 JDK 错误:https ://bugs.openjdk.java.net/browse/JDK-8015099 。它已在 JDK 9+ 中修复。
Confluent docker 镜像confluentinc/cp-kafka-connect:5.2.4
使用 JDK8:
openjdk version "1.8.0_172"
OpenJDK Runtime Environment (Zulu 8.30.0.1-linux64) (build 1.8.0_172-b01)
OpenJDK 64-Bit Server VM (Zulu 8.30.0.1-linux64) (build 25.172-b01, mixed mode)
关于可能出错的任何其他想法?