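We have a long-running process that is started by a web request. To give the process time to complete, we spin it off on a new thread and use a mutex to ensure that only one instance of the process can run. This code runs as expected in our development and staging environments, but in our production environment it fails with a null reference exception. Our application logging captures nothing, and our operations staff report that it is crashing the AppPool. (This looks like an environment issue, but we have to proceed on the assumption that the environments are configured identically.) So far, we have been unable to determine where the null reference occurs.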
Here is the exception from the application event log:
    Exception: System.NullReferenceException
    Message: Object reference not set to an instance of an object.
    StackTrace:
       at Jobs.LongRunningJob.DoWork()
       at System.Threading.ExecutionContext.runTryCode(Object userData)
       at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
Here is the code (slightly sanitized):
    public class LongRunningJob : Job
    {
        private static Mutex mutex = new Mutex();

        protected override void PerformRunJob()
        {
            var ts = new ThreadStart(LongRunningJob.DoWork);
            var thd = new Thread(ts);
            thd.IsBackground = true;
            thd.Start();
        }

        private static void DoWork()
        {
            var commandTimeOut = 180;
            var from = DateTime.Now.AddHours(-24);
            var to = DateTime.Now;

            if (mutex.WaitOne(TimeSpan.Zero))
            {
                try
                {
                    DoSomethingExternal(); // from what we can tell, this is never called
                }
                catch (SqlException sqlEx)
                {
                    if (sqlEx.InnerException.Message.Contains("timeout period elapsed"))
                    {
                        Logging.LogException(String.Format("Command timeout in LongRunningJob: CommandTimeout: {0}", commandTimeOut), sqlEx);
                    }
                    else
                    {
                        Logging.LogException(String.Format("SQL exception in LongRunningJob: {0}", sqlEx.InnerException.Message), sqlEx);
                    }
                }
                catch (Exception ex)
                {
                    Logging.LogException(String.Format("Error processing data in LongRunningJob: {0}", ex.InnerException.Message), ex);
                }
                finally
                {
                    mutex.ReleaseMutex();
                }
            }
            else
            {
                Logging.LogMessage("LongRunningJob is already running.");
            }
        }
    }
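
One possibility the listing does not rule out is that a catch handler itself throws: each handler reads `InnerException.Message`, and `InnerException` is null whenever an exception carries no wrapped cause. An exception raised inside a handler on a background thread would not reach the application log and would surface with `DoWork()` at the top of the trace. The following is only a minimal sketch for testing that theory; `ExceptionText.Describe` is a hypothetical helper, not part of the original code, and it is not confirmed to be the cause in production.

```csharp
using System;

public static class ExceptionText
{
    // Hypothetical helper (an assumption, not in the original code):
    // returns the inner message when one exists, otherwise the outer
    // message, without ever dereferencing a null InnerException.
    public static string Describe(Exception ex)
    {
        if (ex == null)
        {
            return "(no exception)";
        }
        Exception inner = ex.InnerException;
        return inner != null ? inner.Message : ex.Message;
    }
}

public static class Program
{
    public static void Main()
    {
        // No InnerException: plain.InnerException.Message would throw
        // NullReferenceException here, as the handlers in DoWork might.
        var plain = new InvalidOperationException("outer only");
        Console.WriteLine(ExceptionText.Describe(plain));   // prints "outer only"

        // With an InnerException, the inner message is reported.
        var wrapped = new InvalidOperationException("outer", new TimeoutException("inner"));
        Console.WriteLine(ExceptionText.Describe(wrapped)); // prints "inner"
    }
}
```

If the handlers logged through a helper like this, an exception without an inner cause would be recorded rather than replaced by a new `NullReferenceException`, which should show whether `DoSomethingExternal()` is in fact being reached.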