
I was looking into Ruby's parallel/asynchronous processing capabilities and read many articles and blog posts. I looked through EventMachine, Fibers, Revactor, Reia, etc. Unfortunately, I wasn't able to find a simple, effective (and non-IO-blocking) solution for this very simple use case:

File.open('somelogfile.txt') do |file|
  while line = file.gets      # (R) Read from IO
    line = process_line(line) # (P) Process the line
    write_to_db(line)         # (W) Write the output to some IO (DB or file)
  end
end

As you can see, my little script performs three operations: read (R), process (P) and write (W). Let's assume, for simplicity, that each operation takes exactly 1 unit of time (e.g. 10ms). For 5 lines, the current code would therefore do something like this:

Time:       123456789012345 (15 units in total)
Operations: RPWRPWRPWRPWRPW

But, I would like it to do something like this:

Time:       1234567 (7 units in total)
Operations: RRRRR
             PPPPP
              WWWWW
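The two timelines above follow from simple arithmetic, which can be sketched as below. The method names are hypothetical, and the sketch keeps the question's simplifying assumption that every stage takes exactly one unit of time per line.

```ruby
# Sequential: every line passes through all stages before the next line starts.
def sequential_units(lines, stages = 3)
  lines * stages
end

# Pipelined: once the pipeline is full, one line finishes per unit of time,
# so the total is the line count plus the (stages - 1) units needed to fill it.
def pipelined_units(lines, stages = 3)
  return 0 if lines.zero?
  lines + stages - 1
end

puts sequential_units(5) # 15, matching the first timeline
puts pipelined_units(5)  # 7, matching the second timeline
```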

Obviously, I could run three processes (reader, processor and writer), pass the lines read by the reader into a processor queue, and then pass the processed lines into a writer queue (all coordinated via e.g. RabbitMQ). But the use case is so simple that this just doesn't feel right.

Any clues on how this could be done (without switching from Ruby to Erlang, Clojure or Scala)?


2 Answers


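If you need it to be truly parallel (from a single process), I believe you will have to use JRuby to get true native threads and no GIL.

You could use something like DRb to distribute the processing across multiple processes/cores, but that is a bit much for your use case. Instead, you could try having multiple processes communicate using pipes: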
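For the single-process route, a minimal sketch of the question's reader/processor/writer stages using stdlib threads and queues might look like this. The names `run_pipeline` and `DONE`, the `capacity` parameter and the body of `process_line` are assumptions made for illustration. On MRI the GIL still prevents the Ruby code in `process_line` from running in parallel with other Ruby code, although a thread blocked on IO lets the others run; under JRuby the stages can use separate cores.

```ruby
require "stringio"

# Hypothetical stand-in for the question's process_line.
def process_line(line)
  line.strip.upcase
end

DONE = Object.new # sentinel marking the end of the stream

# Runs read, process and write as three threads connected by bounded queues.
def run_pipeline(input, output, capacity: 100)
  raw   = SizedQueue.new(capacity) # reader -> processor
  ready = SizedQueue.new(capacity) # processor -> writer

  reader = Thread.new do
    input.each_line { |line| raw << line }
    raw << DONE
  end

  processor = Thread.new do
    until (line = raw.pop).equal?(DONE)
      ready << process_line(line)
    end
    ready << DONE
  end

  writer = Thread.new do
    until (line = ready.pop).equal?(DONE)
      output.puts(line) # stands in for write_to_db
    end
  end

  [reader, processor, writer].each(&:join)
end

# Usage with in-memory IO objects instead of a log file and a database.
out = StringIO.new
run_pipeline(StringIO.new("first\nsecond\n"), out)
puts out.string
```

The bounded queues keep a fast reader from loading the whole file into memory while a slower stage catches up.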

$ cat somelogfile.txt | ruby ./proc-process | ruby ./proc-store

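In this case each piece is its own process that can run in parallel but communicates using STDIN / STDOUT. This is probably the easiest (and fastest) approach to your problem.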

# proc-process
while line = $stdin.gets do
  # do cpu intensive stuff here
  $stdout.puts "data to be stored in DB"
  $stdout.flush # this is important
end

# proc-store
while line = $stdin.gets do
  write_to_db(line)
end
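The same three-process layout can also be started from a single Ruby script with `fork` and `IO.pipe` instead of a shell pipeline. This is only a sketch under assumptions: it needs a platform that supports `fork` (so not Windows), `run_forked_pipeline` and the body of `process_line` are hypothetical, and writing to a file stands in for `write_to_db`.

```ruby
require "tmpdir"

# Hypothetical stand-in for the CPU-intensive step.
def process_line(line)
  line.strip.upcase
end

# The parent acts as the reader; two forked children process and store.
def run_forked_pipeline(lines, output_path)
  raw_r, raw_w = IO.pipe # reader -> processor
  out_r, out_w = IO.pipe # processor -> store

  processor = fork do
    raw_w.close
    out_r.close
    while (line = raw_r.gets)
      out_w.puts process_line(line)
    end
    out_w.close # signals end of stream to the store process
    exit!(0)    # skip at_exit handlers inherited from the parent
  end

  store = fork do
    [raw_r, raw_w, out_w].each(&:close)
    File.open(output_path, "w") do |db|
      while (line = out_r.gets)
        db.puts line
      end
    end
    exit!(0)
  end

  # The parent must close every pipe end it does not use,
  # otherwise the children never see end-of-file.
  [raw_r, out_r, out_w].each(&:close)
  lines.each { |line| raw_w.puts line }
  raw_w.close
  [processor, store].each { |pid| Process.wait(pid) }
end

path = File.join(Dir.tmpdir, "pipeline-demo-#{Process.pid}.txt")
run_forked_pipeline(%w[first second], path)
puts File.read(path)
File.delete(path)
```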
answered 2010-10-25T17:41:47.390

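Check out peach ( http://peach.rubyforge.org/ ). Doing a parallel "each" couldn't be simpler. However, as the documentation states, you need to run under JRuby to make use of the JVM's native threads.

For details on the multithreading capabilities of the various Ruby interpreters, see Jorg Mittag's answer to this SO question.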
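To illustrate what a parallel "each" does, here is a stdlib-only sketch. It is not peach's implementation or API; `parallel_each` and its `threads` parameter are hypothetical names chosen for this example, and on MRI the workers only overlap while one of them is blocked on IO.

```ruby
# Calls the block once per item, spreading the items over a few worker threads.
def parallel_each(items, threads: 4)
  queue = Queue.new
  items.each { |item| queue << item }

  workers = Array.new(threads) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, end this worker
        end
        yield item
      end
    end
  end

  workers.each(&:join)
  items
end

# Usage: the order in which lines are handled is not guaranteed.
handled = Queue.new
parallel_each(["line 1", "line 2", "line 3"], threads: 2) do |line|
  handled << line.upcase
end
puts handled.size
```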

answered 2010-10-25T13:06:04.833