
Problem: I have a number of file uploads arriving over HTTP in parallel (the uploads receiver). I store them temporarily on a local disk. Another process (the uploads submitter) gets notified about new uploads and does the specific processing (parsing, extracting metadata, uploading to S3, etc.). Once processing of an upload is done, I want the submitter to notify the uploads receiver, so that the receiver can reply to the remote uploader with a status (whether the submission succeeded or failed). Using the ZeroMQ PUB/SUB pattern, which would be better:

  • Subscribe all upload receiver threads to a single topic. Each receiver thread would then have to filter messages by upload id (or something similar) to find the notification that belongs to it.
  • Subscribe each receiver thread to a new topic that represents one particular upload. This seems more reasonable, assuming topics are cheap in ZeroMQ, i.e. they do not need many resources to keep around and they can be auto-expired. I expect new uploads to arrive at dozens of files per second, and processing a single upload may take up to several seconds, so in theory I can have up to a thousand topics active at the same moment. Also, I may not always be able to unsubscribe, due to various failure modes.

1 Answer


Preliminary notice:
On using different ZeroMQ versions:

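While newer versions may use PUB-side topic filtering, earlier ZeroMQ versions did use a SUB-side approach. That means all the (network) message traffic goes to all SUB-s, an acceptable penalty for distributing the filtering workload that would otherwise have to be handled on the PUB side with as low a latency as possible.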

This matters in cases where version homogeneity cannot be enforced across all the parties of an open distributed system.

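Your design architecture seems to be co-located on <localhost>, so the performance impact remains non-distributed (concentrated) and may call for no more than some limited latency / priority tuning, should overall bottlenecks appear as this use case scales.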


On scalability: the limits are still well beyond your use case:

As Martin Sustrik (co-father of ZeroMQ) explains in detail, ZeroMQ was designed for an expected scale of up to some tens of thousands of subscriptions:

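(cit.:) "Efficient Subscription Matching
In ZeroMQ, simple tries are used to store and match PUB/SUB subscriptions. The subscription mechanism was intended for up to 10,000 subscriptions, where a simple trie works well. However, there are users who use as many as 150,000,000 subscriptions. In such cases there is a need for a more efficient data structure."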

More details on the design and on scaling can be found in this post by Martin.
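To make the trie-based matching from the quote a bit more tangible, here is a minimal sketch in plain Python. It is only a toy model and not libzmq's actual implementation; the class name `SubscriptionTrie` and the `upload.<id>` topic naming are assumptions made up for illustration. It relies on one fact about ZeroMQ: a subscription is a byte prefix, and a message matches if it starts with any subscribed prefix.

```python
class SubscriptionTrie:
    """Toy byte-wise prefix trie (hypothetical, not the libzmq data structure)."""

    def __init__(self):
        self.children = {}   # next byte -> child node
        self.refcount = 0    # number of subscriptions ending exactly here

    def subscribe(self, prefix: bytes) -> None:
        node = self
        for byte in prefix:
            node = node.children.setdefault(byte, SubscriptionTrie())
        node.refcount += 1

    def unsubscribe(self, prefix: bytes) -> bool:
        # Walk down the trie, remembering the path so empty nodes can be pruned.
        path = [self]
        node = self
        for byte in prefix:
            node = node.children.get(byte)
            if node is None:
                return False
            path.append(node)
        if node.refcount == 0:
            return False
        node.refcount -= 1
        # Prune nodes that no longer carry a subscription or any children.
        for depth in range(len(prefix), 0, -1):
            child = path[depth]
            if child.refcount or child.children:
                break
            del path[depth - 1].children[prefix[depth - 1]]
        return True

    def matches(self, message: bytes) -> bool:
        # A message matches as soon as any subscribed prefix is reached.
        node = self
        if node.refcount:
            return True      # an empty prefix subscribes to everything
        for byte in message:
            node = node.children.get(byte)
            if node is None:
                return False
            if node.refcount:
                return True
        return False


trie = SubscriptionTrie()
trie.subscribe(b"upload.42")
print(trie.matches(b"upload.42 ok"))    # starts with the subscribed prefix
print(trie.matches(b"upload.43 ok"))    # does not
```

Note that matching is by prefix, so a subscription to `upload.42` would also match `upload.421`. With per-upload topics it is therefore safer to use fixed-width ids or to end the topic with a delimiter. The cost of a lookup depends on the length of the topic rather than on the number of subscriptions, which is why roughly a thousand live topics sits comfortably inside the intended range.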


The best next step?

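A fair approach would be to mock up each of the approaches in question and benchmark them in vitro, scaled to { 1.0x, 1.5x, 2.0x, 5.0x } of the expected static scale, so as to obtain quantitatively supported data about the real overheads, performance and latency associated with each of the alternative strategies under review.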
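A scaled benchmark of that kind could be driven by a small harness like the one below. This is only a sketch under assumptions: `benchmark` and `simulate_notifications` are hypothetical names, the base scale of 1000 is taken from the "up to a thousand topics" estimate in the question, and the workload shown is a placeholder that does not touch ZeroMQ at all. Each real mock-up (single topic with receiver-side filtering, one topic per upload) would be plugged in as the `workload` callable.

```python
import time


def simulate_notifications(active_uploads: int) -> int:
    """Placeholder workload (an assumption): replace with a real PUB/SUB mock-up."""
    subscriptions = {f"upload.{i}".encode() for i in range(active_uploads)}
    delivered = 0
    for i in range(active_uploads):
        topic = f"upload.{i}".encode()
        if topic in subscriptions:
            delivered += 1
    return delivered


def benchmark(workload, base_scale, factors=(1.0, 1.5, 2.0, 5.0), repeats=5):
    """Run workload(n) at each scale factor; keep the best wall-clock time."""
    results = {}
    for factor in factors:
        n = int(base_scale * factor)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload(n)
            best = min(best, time.perf_counter() - start)
        results[factor] = (n, best)
    return results


if __name__ == "__main__":
    for factor, (n, seconds) in benchmark(simulate_notifications, 1000).items():
        print(f"{factor:.1f}x  n={n:5d}  best={seconds * 1e3:.3f} ms")
```

Taking the best of several repeats reduces noise from the scheduler; for latency-sensitive decisions it would also make sense to record the full distribution per notification rather than a single total.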

Anyway, Vovan, enjoy the world of smart signalling / messaging in distributed processing.

answered 2016-09-12T08:46:33.260