1

[我在github上发布的交叉发布问题以扩大受众范围]

delay在事务中使用该方法时,我们看到了一种奇怪的行为。它可以很容易地在带有User模型和以下代码的 rails 控制台中复制。

有谁知道delayed_job 是否在内部做了一些改变数据库操作顺序的事情,如下所示?

重现步骤

# User model has first_name and last_name attributes
# users table has at least one test record

# The code below is for use in a rails console. Just copy, paste and run.

# a simple operation to queue up as a delayed job
def change_user_name(user_id)
   u = User.find(user_id)
   u.first_name = "Delaid"
   u.save
end

User.transaction do
  u2 = User.first
  u2.last_name = "Willie"
  u2.save
  delay.change_user_name(u2.id)
  u2.last_name = "Wonka"
  u2.save
end

预期行为

当它运行时,我们期望插入到delayed_jobs表中的操作按照所示的顺序发生,并且在我们指定的事务中。我们在 rails 控制台日志中看到的内容证实了这一点,如下所示。

2015-02-03 09:17:55 -0600 (9606) : DEBUG :  (0.3ms)  BEGIN
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  User Load (0.7ms)  SELECT `users`.* FROM `users` WHERE (user_type_id not in ('6')) LIMIT 1
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  SQL (0.4ms)  INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"van Tilburg\",\"Willie\"]}', NULL, '2015-02-03 15:17:55', 0)
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  (0.4ms)  UPDATE `users` SET `last_name` = 'Willie', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  (0.2ms)  BEGIN
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  SQL (0.6ms)  INSERT INTO `delayed_jobs` (`attempts`, `created_at`, `failed_at`, `handler`, `last_error`, `locked_at`, `locked_by`, `priority`, `queue`, `run_at`, `updated_at`) VALUES (0, '2015-02-03 15:17:55', NULL, '--- !ruby/object:Delayed::PerformableMethod\nobject: !ruby/object\n prompt:\n :PROMPT_I: ! \'1.9.3p125 :%03n > \'\n :PROMPT_S: ! \'1.9.3p125 :%03n%l> \'\n :PROMPT_C: ! \'1.9.3p125 :%03n > \'\n :PROMPT_N: ! \'1.9.3p125 :%03n?> \'\n :RETURN: ! \" => %s \\n\"\n :AUTO_INDENT: true\nmethod_name: :change_user_name\nargs:\n- 50\n', NULL, NULL, NULL, 0, NULL, '2015-02-03 15:17:55', '2015-02-03 15:17:55')
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  (9.4ms)  COMMIT
2015-02-03 09:17:55 -0600 (9606) : DEBUG :  SQL (0.4ms)  INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"Willie\",\"Wonka\"]}', NULL, '2015-02-03 15:17:55', 0)
2015-02-03 09:17:55 -0600 (9606) : DEBUG :   (0.4ms)  UPDATE `users` SET `last_name` = 'Wonka', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
2015-02-03 09:17:55 -0600 (9606) : DEBUG :   (3.5ms)  COMMIT
 => true

观察到的(不正确的)行为

但是,mysql二进制日志显示了一些不同的内容,如下所示。请注意在其他操作之前INSERT进入delayed_jobs表是如何发生的,以及它是如何放置在自己的事务中的。

BEGIN
/*!*/;
# at 2278762
#150203  9:17:55 server id 1  end_log_pos 2278790       Intvar
SET INSERT_ID=24398/*!*/;
# at 2278790
#150203  9:17:55 server id 1  end_log_pos 2279486       Query   thread_id=290135        exec_time=0     error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `delayed_jobs` (`attempts`, `created_at`, `failed_at`, `handler`, `last_error`, `locked_at`, `locked_by`, `priority`, `queue`, `run_at`, `updated_at`) VALUES (0, '2015-02-03 15:17:55', NULL, '--- !ruby/object:Delayed::PerformableMethod\nobject: !ruby/object\n  prompt:\n    :PROMPT_I: ! \'1.9.3p125 :%03n > \'\n    :PROMPT_S: ! \'1.9.3p125 :%03n%l> \'\n    :PROMPT_C: ! \'1.9.3p125 :%03n > \'\n    :PROMPT_N: ! \'1.9.3p125 :%03n?> \'\n    :RETURN: ! \" => %s \\n\"\n    :AUTO_INDENT: true\nmethod_name: :change_user_name\nargs:\n- 50\n', NULL, NULL, NULL, 0, NULL, '2015-02-03 15:17:55', '2015-02-03 15:17:55')
/*!*/;
# at 2279486
#150203  9:17:55 server id 1  end_log_pos 2279513       Xid = 129277545
COMMIT/*!*/;
BEGIN
/*!*/;
# at 2279592
#150203  9:17:55 server id 1  end_log_pos 2279620       Intvar
SET INSERT_ID=626645/*!*/;
# at 2279620
#150203  9:17:55 server id 1  end_log_pos 2279929       Query   thread_id=290137        exec_time=0     error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via  on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"van Tilburg\",\"Willie\"]}', NULL, '2015-02-03 15:17:55', 0)
/*!*/;
# at 2279929
#150203  9:17:55 server id 1  end_log_pos 2280106       Query   thread_id=290137        exec_time=0     error_code=0
SET TIMESTAMP=1422976675/*!*/;
UPDATE `users` SET `last_name` = 'Willie', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
/*!*/;
# at 2280106
#150203  9:17:55 server id 1  end_log_pos 2280134       Intvar
SET INSERT_ID=626646/*!*/;
# at 2280134
#150203  9:17:55 server id 1  end_log_pos 2280437       Query   thread_id=290137        exec_time=0     error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via  on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"Willie\",\"Wonka\"]}', NULL, '2015-02-03 15:17:55', 0)
/*!*/;
# at 2280437
#150203  9:17:55 server id 1  end_log_pos 2280613       Query   thread_id=290137        exec_time=0     error_code=0
SET TIMESTAMP=1422976675/*!*/;
UPDATE `users` SET `last_name` = 'Wonka', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
/*!*/;
# at 2280613
#150203  9:17:55 server id 1  end_log_pos 2280640       Xid = 129277538
COMMIT/*!*/;

额外细节

我们还测试了没有外部事务的场景(如下所示),并且看到了相同的行为。意思是,INSERTindelayed_jobs发生在其他代码之前并且无序。

# This code without an outer transaction results in the same buggy behavior
u2 = User.first
u2.last_name = "Willie"
u2.save
delay.change_user_name(u2.id)
u2.last_name = "Wonka"
u2.save

此外,我们是否在上述代码中的各种 AR 操作中使用save或都没有关系。save!这个问题无论如何都会发生。

版本信息

MySQL版本

Variable_name   Value
innodb_version  5.5.31
protocol_version    10
slave_type_conversions  
version 5.5.31-0ubuntu0.12.04.1-log
version_comment (Ubuntu)
version_compile_machine x86_64
version_compile_os  debian-linux-gnu

导轨和宝石

rails-3.2.17
delayed_job-4.0.4
delayed_job_active_record-4.0.2
4

1 回答 1

0

在delayed_job 和active_record 代码中进行了大量printf 调试后,我发现了问题所在。我们一直在使用一个名为的 gemdb_charmer来实现数据库的主 (rw) 和副本 (ro) 实例之间的连接切换,并且这个 gem 在幕后实现了连接切换逻辑。这会干扰 ActiveRecord 中的事务上下文逻辑。

以下猴子补丁修复了该问题。

我弄清楚了这个问题。我们正在使用 [now defunct] db_charmer gem 在主数据库和辅助数据库之间进行连接切换,这对延迟的作业模型类没有影响。这导致我们对延迟作业与其他调用使用不同的连接实例,并且将事务分开处理。

以下猴子补丁修复了该问题。

class Delayed::Backend::ActiveRecord::Job
  # use the same as what you use for all your other AR model classes
  db_magic :connection => :primary, :slave => :replica, :force_slave_reads => false
end

db_charmergem 不再受支持,我需要找出一个替代方案 :-( 同时,这个补丁应该可以工作。

于 2015-02-12T09:41:24.073 回答