I'm having a problem running resque workers from God.
This is my God config:
num_workers = 9
queue = '*'
current_path = "/u/apps/narg/current"
God.pid_file_directory = "/u/apps/narg/current/tmp/pids"

num_workers.times do |num|
  God.watch do |w|
    w.name = "resque-#{num}"
    w.group = "resque_all"
    w.interval = 30.seconds
    w.env = {"QUEUE"=>queue, "RAILS_ENV"=>"production",
             'PIDFILE' => "#{current_path}/tmp/pids/#{w.name}.pid" }
    w.start = "cd #{current_path} ; bundle exec rake environment resque:work"
    w.log = "#{current_path}/log/god-#{w.name}.log"
    w.pid_file = "#{current_path}/tmp/pids/#{w.name}.pid"
    w.uid = 'root'
    w.gid = 'root'
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end
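When I start God while the workers are already running, everything looks fine and it runs through all the transitions to :up.
But when the workers are not running, God stalls right after the start command. The workers are actually started and the pid files are correct; God just doesn't notice: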
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 move 'unmonitored' to 'init'
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 moved 'unmonitored' to 'init'
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 [trigger] process is not running (ProcessRunning)
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 move 'init' to 'start'
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 before_start: deleted pid
** [out :: narg-wrk02] I [2012-08-23 11:40:48] INFO: resque-7 start: cd /u/apps/narg/current ; bundle exec rake environment resque:work
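As mentioned, the workers are started and running fine. The pid files also contain the correct pids.
If I now kill God and restart it, it recognizes the running workers just fine and transitions them to :up.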
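For anyone who wants to reproduce the "pid file is correct" check, here is a minimal sketch of how it can be verified by hand. The helper name `pid_file_alive?` is hypothetical (it is not part of God or Resque); it only reads the pid file and probes the process with signal 0.

```ruby
# Hypothetical helper (not a God or Resque API): returns true if the pid
# stored in pid_file refers to a process that currently exists.
def pid_file_alive?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  # Signal 0 sends nothing; it only checks that the process exists
  # and that we are allowed to signal it.
  Process.kill(0, pid)
  true
rescue Errno::EPERM
  # The process exists but belongs to another user.
  true
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  # Missing pid file, no such process, or unparsable file content.
  false
end
```

With the paths from the config above, the check for one worker would be `pid_file_alive?("/u/apps/narg/current/tmp/pids/resque-7.pid")`.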
Any ideas or pointers?