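I'm at the end of my tether :) I have a prototype (too big, and with too many dependencies, to share) that uses redis for several reasons. One of them is storing a serialised value and controlling updates to that value with a guard lock held under a separate key, using LockTake/LockRelease.
The app as a whole looks a bit like this (NB: this snippet does not reproduce my problem!):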
using Nito.AsyncEx;
using StackExchange.Redis;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace RedisAzureLockingTest
{
    class Program
    {
        static void Main(string[] args)
        {
            AsyncContext.Run(async () =>
            {
                var cm = await ConnectionMultiplexer.ConnectAsync("blah:6379,ssl=false,password=blah,defaultDatabase=1,syncTimeout=5000");
                var db = cm.GetDatabase();

                // store key
                RedisKey key = "thisisourtest";
                await db.StringSetAsync(key, "initial value", flags: CommandFlags.DemandMaster); // SET

                // acquire lock
                RedisKey lockkey = "thisisourtest.lock";
                string locktoken = Guid.NewGuid().ToString();
                bool success = await db.LockTakeAsync(lockkey, locktoken, TimeSpan.FromDays(1), CommandFlags.DemandMaster);
                if (!success) throw new InvalidOperationException("Sure ok - lock couldnt be taken");

                try
                {
                    // do some stuff whilst the lock is taken
                    var oldval = await db.StringGetAsync(key, CommandFlags.DemandMaster);
                    if (oldval.IsNullOrEmpty) throw new InvalidOperationException("Key doesnt exist");

                    // persist an update
                    var newval = Guid.NewGuid().ToString();
                    await db.StringSetAsync(key, newval, flags: CommandFlags.DemandMaster); // SET
                }
                finally
                {
                    // release lock
                    if (!await db.LockReleaseAsync(lockkey, locktoken, CommandFlags.DemandMaster))
                        throw new InvalidOperationException("Should never occur - we couldnt release our own lock is now locked forever!");

                    // double check that the lock has been released
                    var locktok2 = await db.LockQueryAsync(lockkey, CommandFlags.DemandMaster);
                    if (locktok2.HasValue) throw new InvalidOperationException("Should never occur - we couldnt release our own lock is now locked forever! Even worse- lock release lied about releasing itself");
                }

                Console.WriteLine("WORKED");
            });
            Console.ReadLine();
        }
    }
}
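I have been testing locally against a simple single redis instance and never had a problem. Now I am trying it in another environment, using an Azure C0 Basic instance. There I can reproduce the problem semi-reliably (I have since managed to point my local copy of the codebase at the Azure instance), but I have no idea what might be going wrong or how to debug it further.
The behaviour I observe is:
- LockTakeAsync works fine, and my "do some stuff" bit executes fine.
- LockReleaseAsync appears to succeed (it returns TRUE), but the lock key is not removed from redis (confirmed with the redis-cli command-line tool).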
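I have tried:
- Using the trace on ConnectionMultiplexer: nothing untoward shows up.
- Cutting my app down to a simple test case (see above): this does not reproduce the problem, so it must be something external.
- Switching to the non-async versions of the LockTake/LockRelease calls: the problem persists.
- Adding logging around the LockReleaseAsync call (including SET calls of a "DEBUG" key to redis so that they show up in the trace, see below) to confirm the exact sequence of events.
- Adding the LockQueryAsync call in the snippet above to confirm that the lock is still held!
- Specifying that all of my commands should execute on the master.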
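For anyone wanting to repeat the tracing and logging steps from the list above, here is a minimal sketch of what they can look like. It is not the code from my prototype: the class and method names (RedisTracing, ConnectWithTracingAsync, ReleaseWithMarkersAsync) are made up for illustration, and it assumes the StackExchange.Redis package and a log writer that is safe to use from several threads (Console.Out is).

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using StackExchange.Redis;

static class RedisTracing
{
    // Hypothetical helper: connect with a connection log and surface error events.
    public static async Task<ConnectionMultiplexer> ConnectWithTracingAsync(string configuration, TextWriter log)
    {
        // The optional TextWriter receives the multiplexer's connection log.
        var cm = await ConnectionMultiplexer.ConnectAsync(configuration, log);

        // These events are raised on background threads.
        cm.ConnectionFailed += (s, e) =>
            log.WriteLine($"ConnectionFailed: {e.EndPoint} {e.FailureType} {e.Exception?.Message}");
        cm.ConnectionRestored += (s, e) =>
            log.WriteLine($"ConnectionRestored: {e.EndPoint}");
        cm.ErrorMessage += (s, e) =>
            log.WriteLine($"ErrorMessage: {e.EndPoint} {e.Message}");
        cm.InternalError += (s, e) =>
            log.WriteLine($"InternalError: {e.Origin} {e.Exception?.Message}");

        return cm;
    }

    // Hypothetical helper: bracket the release with marker writes so that
    // its commands are easy to find in the MONITOR output.
    public static async Task<bool> ReleaseWithMarkersAsync(IDatabase db, RedisKey lockKey, RedisValue token)
    {
        await db.StringSetAsync("DEBUG", "1", flags: CommandFlags.DemandMaster);
        bool released = await db.LockReleaseAsync(lockKey, token, CommandFlags.DemandMaster);
        await db.StringSetAsync("DEBUG", "2", flags: CommandFlags.DemandMaster);

        Console.WriteLine($"LockReleaseAsync returned {released}");
        return released;
    }
}
```

Usage would be along the lines of `var cm = await RedisTracing.ConnectWithTracingAsync("blah:6379,...", Console.Out);` followed by `await RedisTracing.ReleaseWithMarkersAsync(db, lockkey, locktoken);` in place of the plain LockReleaseAsync call.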
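The only thing I have to go on is running MONITOR on the redis instance and capturing a trace of what SE.Redis is doing. When it runs locally and the app works, I get something like this (key names changed):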
1462360519.322029 [1 37.157.34.228:1995] "SET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock" "647672fd-ae06-4b6e-be67-341ac583a366" "EX" "86400" "NX"
1462360519.332884 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb"
1462360519.342668 [1 37.157.34.228:1995] "SET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb" "ChIJHcrmkwgwX0cRkssud4TmJcsQARoSCQAAAAAAAAAAEQADAAAAAAAHIgNHQlAyCQi+koDo9oL9GToJCL6SgOj2gv0ZQgkIvvqI77mD/Rk="
1462360519.354847 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
1462360519.364666 [1 37.157.34.228:1995] "SET" "DEBUG" "1"
1462360519.387834 [1 37.157.34.228:1995] "WATCH" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
1462360519.387866 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
1462360519.401686 [1 37.157.34.228:1995] "MULTI"
1462360519.401708 [1 37.157.34.228:1995] "DEL" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
1462360519.401726 [1 37.157.34.228:1995] "EXEC"
1462360519.414845 [1 37.157.34.228:1995] "SELECT" "1"
1462360519.414862 [1 37.157.34.228:1995] "SET" "DEBUG" "2"
1462360519.424950 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
1462360519.452993 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb"
When I run against Azure and reproduce the problem, I get:
1462359810.253275 [1 23.97.166.137:1277] "SET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock" "35be6e88-7240-4772-ac2d-220a57ed1a79" "EX" "86400" "NX"
1462359810.256639 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69"
1462359810.258605 [1 23.97.166.137:1277] "SET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69" "ChIJsxlhwj7M8UgRiDRsex06+2kQARoSCQAAAAAAAAAAEQADAAAAAAADIgNHQlAyCQi1nJ6N3IL9GToJCLWcno3cgv0ZQgkItYSnlJ+D/Rk="
1462359810.260233 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
1462359810.262790 [1 23.97.166.137:1277] "SET" "DEBUG" "1"
1462359810.283693 [1 23.97.166.137:1277] "WATCH" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
1462359810.283724 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
1462359812.329321 [0 23.97.166.137:1257] "UNSUBSCRIBE" "U\xc7\n\xae\xa7\x1c\x84K\x8f\x1ft\x00\\j\xc2j"
1462359812.329374 [3 23.97.166.137:1256] "INFO" "replication"
1462359812.357770 [1 23.97.166.137:1259] "INFO" "replication"
1462359814.186895 [0 23.97.166.137:1312] "INFO" "replication"
1462359815.285593 [1 23.97.166.137:1277] "UNWATCH"
1462359815.285621 [1 23.97.166.137:1277] "SET" "DEBUG" "2"
1462359815.292618 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
1462359815.302945 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69"
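To me it looks as though the WATCH is failing (is replication in Azure involved?) and so the release fails. SE.Redis does not appear to issue the MULTI/DEL/EXEC commands (which is fine in itself), but LockReleaseAsync does not report this, nor can I see any call in the MONITOR log that affects the key in question.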
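One way to look at the WATCH step in isolation is to perform the release by hand as a conditional transaction and log both the commit result and the condition result. As far as I can tell, a conditional transaction is what produces the WATCH/GET/MULTI/DEL/EXEC sequence in the local trace, but that is an assumption on my part rather than something I have verified in the SE.Redis source. The names LockDiagnostics and ReleaseViaTransactionAsync are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

static class LockDiagnostics
{
    // Hypothetical helper: delete the lock key only if it still holds our token,
    // and report what happened at each stage.
    public static async Task<bool> ReleaseViaTransactionAsync(IDatabase db, RedisKey lockKey, RedisValue token)
    {
        ITransaction tran = db.CreateTransaction();

        // The condition is evaluated when the transaction executes
        // (this is the step that shows up as WATCH + GET in MONITOR).
        ConditionResult tokenMatches = tran.AddCondition(Condition.StringEqual(lockKey, token));

        // Queued only: do not await this before ExecuteAsync, or it never completes.
        Task<bool> deleted = tran.KeyDeleteAsync(lockKey);

        bool committed = await tran.ExecuteAsync(CommandFlags.DemandMaster);
        Console.WriteLine($"committed={committed}, conditionSatisfied={tokenMatches.WasSatisfied}");

        // Only read the DEL result if the transaction actually ran;
        // the queued task is cancelled when the transaction is aborted.
        return committed && await deleted;
    }
}
```

If this helper reports committed=false against Azure while LockReleaseAsync reports TRUE for the same key and token, that would narrow the problem down to how the result is reported rather than to the server. Note also that in the Azure trace roughly five seconds pass between the WATCH/GET pair and the UNWATCH, which matches the syncTimeout=5000 in my connection string; I do not know whether that is a coincidence.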
Stumped.
Any ideas on how to isolate this further? Building a small test case is not going to be quick.
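One experiment I have not run yet, sketched here in case it helps with suggestions: release the lock with a Lua script instead of LockReleaseAsync, so that no WATCH is involved at all. The compare-and-delete script is the usual redis pattern for releasing a token-based lock; the names ScriptedLock and ReleaseViaScriptAsync are hypothetical, and this assumes the server allows EVAL.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

static class ScriptedLock
{
    // Deletes the key only if it still holds the expected token.
    // Returns 1 when the key was deleted, 0 otherwise.
    private const string ReleaseScript = @"
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end";

    // Hypothetical helper: a WATCH-free alternative to LockReleaseAsync.
    public static async Task<bool> ReleaseViaScriptAsync(IDatabase db, RedisKey lockKey, RedisValue token)
    {
        RedisResult result = await db.ScriptEvaluateAsync(
            ReleaseScript,
            new RedisKey[] { lockKey },
            new RedisValue[] { token },
            CommandFlags.DemandMaster);

        return (long)result == 1;
    }
}
```

If the lock is reliably released this way against Azure, the WATCH-based path becomes the prime suspect; if it still is not, the problem lies somewhere else.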
Cheers!