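As a beginner, I am exploring Apache Kafka and the confluent-kafka-python client. When I sent simple messages from the producer, the consumer was able to consume them successfully. I then thought I would try sending an image as the payload. So I went ahead with a 1MB (PNG) image, and my producer was unable to produce the message. The error I got is: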

  p.produce('mytopic', callback=delivery_report, key='hello', value=str_value)
cimpl.KafkaException: KafkaError{code=MSG_SIZE_TOO_LARGE,val=10,str="Unable to produce message: Broker: Message size too large"}

After some googling I found "Kafka - Broker: Message size too large" and "How can I send large messages with Kafka (over 15MB)?", so I modified my server.props (broker side) as shown below:

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
max.message.bytes=1048576 
message.max.bytes=1048576 
replica.fetch.max.bytes=1048576

But I still could not fix the issue.

producer.py

from confluent_kafka import Producer
import base64
import time

# some_data_source = ['hey', 'hi']

with open("1mb.png", "rb") as imageFile:
    str_value = base64.b64encode(imageFile.read())

p = Producer({'bootstrap.servers': 'localhost:9092', 'compression.type': 'snappy'})

def delivery_report(err, msg):
    """ Called once for each message produced to indicate delivery result.
        Triggered by poll() or flush(). """
    if err is not None:
        print('Message delivery failed: {}'.format(err))
    else:
        print('Message delivered to {} [{}]'.format(msg.topic(), msg.partition()))

for _ in range(2):
    # Trigger any available delivery report callbacks from previous produce() calls
    p.poll(0)

    # Asynchronously produce a message, the delivery report callback
    # will be triggered from poll() above, or flush() below, when the message has
    # been successfully delivered or failed permanently.
    p.produce('mytopic', callback=delivery_report, key='hello', value=str_value)

# Wait for any outstanding messages to be delivered and delivery report
# callbacks to be triggered.
p.flush()

consumer.py

from confluent_kafka import Consumer


c = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'mygroup',
    'auto.offset.reset': 'earliest'
})

c.subscribe(['mytopic'])

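try:
    while True:
        msg = c.poll(1.0)

        if msg is None:
            continue
        if msg.error():
            print("Consumer error: {}".format(msg.error()))
            continue

        print('Received message: {}'.format(msg.value().decode('utf-8')))
except KeyboardInterrupt:
    # Ctrl+C stops the loop
    pass
finally:
    # Always close the consumer so it leaves the group cleanly
    c.close()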

Do I need to add any parameters, or am I missing something in the configuration? Any help would be appreciated.

Thanks

1 Answer

It looks like you didn't actually change the broker defaults much; the limit is still around 1MB.
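
To see why a 1MB image trips a limit of about 1MB, it helps to measure the payload that is actually sent. The producer in the question base64-encodes the file, which grows it by roughly a third. The sketch below uses a zero-filled 1 MiB buffer as a stand-in for 1mb.png (an assumption; the real file size may differ) and compares the encoded size with 1000000 bytes, the librdkafka default for the client-side message.max.bytes property.

```python
import base64

# Stand-in for the image in the question: 1 MiB of zero bytes (assumption).
raw = bytes(1024 * 1024)
encoded = base64.b64encode(raw)

# librdkafka's default for the client-side message.max.bytes property.
CLIENT_DEFAULT_MAX_BYTES = 1_000_000

print(len(raw))      # 1048576
print(len(encoded))  # 1398104
print(len(encoded) > CLIENT_DEFAULT_MAX_BYTES)  # True
```

So a limit of 1048576 bytes is too small for this payload on both the client and the broker. As far as I know the check in produce() is applied before compression, so 'compression.type': 'snappy' would not get the message under the client limit.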

For your client-side error, you need to add message.max.bytes to the producer config.
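
A minimal sketch of that producer change, reusing the settings from the question. The 5 MiB value is an arbitrary assumption, and build_producer_config and make_producer are hypothetical helper names used only for illustration. The broker limits (message.max.bytes and replica.fetch.max.bytes in the broker config, or max.message.bytes on the topic) would need to be at least as large for the message to be accepted.

```python
def build_producer_config(max_bytes=5 * 1024 * 1024):
    """Return a producer config that allows messages up to max_bytes (assumed 5 MiB)."""
    return {
        'bootstrap.servers': 'localhost:9092',
        'compression.type': 'snappy',
        # Client-side limit checked by produce(); librdkafka defaults to 1000000.
        'message.max.bytes': max_bytes,
    }


def make_producer(max_bytes=5 * 1024 * 1024):
    # Imported lazily so the config helper can be inspected without the client installed.
    from confluent_kafka import Producer
    return Producer(build_producer_config(max_bytes))
```

In producer.py this would replace the Producer({...}) line with p = make_producer().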

If you need any other client properties, such as the consumer max fetch bytes, they are documented here:

https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
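
For the consumer side, a hedged sketch of what such a property looks like in the config dict. fetch.message.max.bytes is the librdkafka name for the per-partition fetch size listed on that page; the 5 MiB value and the build_consumer_config helper name are assumptions for illustration. The same page describes the client growing its fetch size when it meets a larger message, so this setting may be a tuning step rather than a strict requirement.

```python
def build_consumer_config(max_bytes=5 * 1024 * 1024):
    """Return the consumer config from the question plus a larger fetch size."""
    return {
        'bootstrap.servers': 'localhost:9092',
        'group.id': 'mygroup',
        'auto.offset.reset': 'earliest',
        # Initial maximum bytes per topic+partition requested from the broker.
        'fetch.message.max.bytes': max_bytes,
    }
```

The resulting dict can be passed to Consumer(...) in place of the literal used in consumer.py.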


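Overall, it is recommended to upload your images to a centralized file store and then send the URI location through Kafka as a plain string. This will increase throughput and reduce the storage needs of the brokers, especially if you are sending or replicating the same image data across multiple topics.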
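
A rough sketch of that pattern, assuming the image has already been uploaded by some other means. The URI in the usage note, the JSON field names, and the helper names build_image_reference and publish_reference are all hypothetical; the only client call used is produce(), as in the question's producer.py.

```python
import hashlib
import json


def build_image_reference(uri, image_bytes):
    """Build a small JSON payload that points at an image stored elsewhere."""
    return json.dumps({
        'uri': uri,
        'size': len(image_bytes),
        # Checksum so the consumer can verify what it downloads.
        'sha256': hashlib.sha256(image_bytes).hexdigest(),
    }).encode('utf-8')


def publish_reference(producer, topic, key, uri, image_bytes):
    """Send the reference instead of the image bytes.

    `producer` is expected to be a confluent_kafka Producer (or anything
    with a compatible produce() method).
    """
    payload = build_image_reference(uri, image_bytes)
    producer.produce(topic, key=key, value=payload)
    return payload
```

With the producer from the question this could be called as publish_reference(p, 'mytopic', 'hello', 's3://my-bucket/images/1mb.png', image_bytes), followed by p.flush(). The message stays well under any default size limit regardless of how large the image is.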

answered 2021-04-10T14:25:39.897