
I have n (typically n < 10, but it should scale) processes running on different machines and communicating over AMQP using RabbitMQ. The processes are typically long-running and may be implemented in any language (though most are Java or Python).

Each process requires a number of inputs (numbers or strings) and produces a number of outputs (also just numbers or strings). A process is executed asynchronously: a message is sent to its input queue, and a callback is triggered when the result arrives on its output queue.
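To make the request/callback pattern concrete, here is a minimal, broker-agnostic sketch of how replies could be matched to callbacks. Everything in it is an assumption for illustration: the `PendingCalls` helper, the JSON message layout with `id`, `inputs` and `outputs` fields, and the queue name `adder.in` are all hypothetical, and the `publish` function stands in for whatever call the AMQP client library provides.

```python
import json
import uuid


class PendingCalls:
    """Tracks requests that are still waiting for a reply (hypothetical helper)."""

    def __init__(self, publish):
        # publish(queue_name, body) stands in for the AMQP client's publish call.
        self._publish = publish
        self._callbacks = {}

    def call(self, input_queue, inputs, callback):
        """Send the inputs to a process and remember which callback to trigger."""
        correlation_id = str(uuid.uuid4())
        self._callbacks[correlation_id] = callback
        body = json.dumps({"id": correlation_id, "inputs": inputs})
        self._publish(input_queue, body)
        return correlation_id

    def on_output(self, body):
        """Invoke this from the output-queue consumer for every message."""
        message = json.loads(body)
        callback = self._callbacks.pop(message["id"], None)
        if callback is not None:  # ignore replies nobody is waiting for
            callback(message["outputs"])


# In-memory stand-in for the broker, so the sketch runs without RabbitMQ.
sent = []
calls = PendingCalls(lambda queue_name, body: sent.append((queue_name, body)))
results = []
request_id = calls.call("adder.in", {"a": 1, "b": 2}, results.append)

# Simulate the remote process answering on the output queue.
calls.on_output(json.dumps({"id": request_id, "outputs": {"sum": 3}}))
```

Keying the callbacks by a per-request id keeps the dispatcher independent of the order in which long-running processes happen to answer.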

Ideally, the user specifies some inputs and the desired outputs, and the system should:

  • detect which processes are needed and generate the dependency graph
  • topologically sort the graph and execute it; node transitions will need to be event-driven
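The two steps above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the `PROCESSES` registry and its process and value names are made up, each value is assumed to have exactly one producer, and `graphlib` requires Python 3.9 or later. `build_graph` walks backwards from the desired outputs to find the processes that are needed, and `TopologicalSorter` orders them.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: process name -> (required inputs, produced outputs).
PROCESSES = {
    "A": ({"x"}, {"y"}),
    "B": ({"y"}, {"z"}),
    "C": ({"x"}, {"w"}),
    "D": ({"z", "w"}, {"result"}),
    "E": ({"q"}, {"unused"}),  # never needed for "result"
}


def build_graph(processes, available, wanted):
    """Return {process: set of processes it depends on} for the wanted values."""
    available = set(available)
    # Assumes every value is produced by exactly one process.
    producer = {out: name for name, (_, outs) in processes.items() for out in outs}

    def producer_of(value):
        if value not in producer:
            raise ValueError(f"no process produces {value!r}")
        return producer[value]

    graph = {}
    pending = [value for value in wanted if value not in available]
    while pending:
        name = producer_of(pending.pop())
        if name in graph:
            continue  # this process has already been expanded
        graph[name] = set()
        inputs, _ = processes[name]
        for value in inputs - available:
            graph[name].add(producer_of(value))
            pending.append(value)
    return graph


graph = build_graph(PROCESSES, available={"x"}, wanted={"result"})
# static_order() raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

With these inputs, `E` is left out of the graph, `A` comes before `B`, and `D` comes last; the relative position of `C` is not fixed by the dependencies.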

A node should fire if its input is ready, allowing parallelism per branch. I can assume no cycles for now, but eventually there will be cycles (e.g., two processes may need to iterate until the output no longer changes).
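The firing rule can be sketched with `graphlib.TopologicalSorter` (Python 3.9+), whose `get_ready()` and `done()` methods hand out nodes as soon as all of their predecessors have finished. This is a minimal sketch for the acyclic case only: the example graph, the `run_graph` function, and the `fake_start` hook are hypothetical, with a thread standing in for publishing to a process's input queue and receiving its output callback.

```python
import queue
import threading
from graphlib import TopologicalSorter


def run_graph(graph, start_process):
    """Fire each node as soon as all of its predecessors have finished.

    graph: {node: set of predecessor nodes}
    start_process(node, on_done): starts the node asynchronously and calls
    on_done(node) once its output has arrived.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if there is a cycle
    finished = queue.Queue()  # completion events posted by the callbacks
    completion_order = []
    while sorter.is_active():
        for node in sorter.get_ready():
            start_process(node, finished.put)  # independent branches overlap
        node = finished.get()  # block until some node reports completion
        sorter.done(node)  # may make its successors ready
        completion_order.append(node)
    return completion_order


def fake_start(node, on_done):
    # Stand-in for sending a message and waiting for the output callback.
    threading.Thread(target=on_done, args=(node,)).start()


# Hypothetical graph: D needs B and C, B needs A.
example_graph = {"D": {"B", "C"}, "B": {"A"}, "C": set(), "A": set()}
completion_order = run_graph(example_graph, fake_start)
```

Here `A` and `C` are started together, `B` starts once `A` reports back, and `D` starts only after both `B` and `C` are done. Cycles would need a different driver, for example one that re-runs a group of nodes until their outputs stop changing.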

This should be a known problem from (data)flow programming (discussed here before), and I want to avoid reinventing the wheel. I would prefer a Python solution, and a search leads to Trellis and Pypes. Trellis is no longer developed but seems to support cycles, while Pypes does not. I am also not sure how actively Pypes is developed.

Further searches reveal a whole list of event-based programming frameworks, none of which I am particularly knowledgeable about. There are of course workflow environments like Taverna and KNIME, but those seem like overkill.

Does anybody have any experience tackling this type of problem or with the libraries mentioned?

Edit: Other libraries I found are:


2 Answers


python.org has a Wiki page on "Flow Based Programming" -- http://wiki.python.org/moin/FlowBasedProgramming

Answered 2012-10-18T00:46:45.730

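The bottom line is that if you can reinvent the wheel in a small number of lines of code (a few hundred) that you fully understand and can document, then do it.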

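This is an area where the abstractions used are not that hard to implement, given some basic underlying tools. RabbitMQ is one such tool. Node.js is another. There are a lot of libraries around that implement useful ways of managing data flow, workflow, finite state machines, and so on, but they overlap a great deal and tend to be incomplete. Most likely the original developer built just enough to solve their original problem, and since this type of programming isn't that popular, there hasn't been enough critical mass to continue development.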

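That said, there is a lot to be said for ranking all the possible solutions by popularity, picking the most popular one, and working to make it work (while sharing your work, of course).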

Answered 2011-04-26T07:10:24.643