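For several weeks I have been facing a serious problem. I have an asp.net application hosted under IIS7 (W2008 SP1), and every few hours it starts consuming nearly 50% of the CPU, probably with no users connected. Some activity without users is understandable, since we use Quartz.net for some application recycling, but we have not been able to reproduce the problem.
Here is a trace taken with JetBrains dotTrace 3.1 while the CPU was high: http://mycenter.info/tmp/DotTraceSnapshot.zip
The process wasting CPU is usually w3wp.exe, but in the last few days sqlserver (2008) and memcached (1.2.1, updated on Monday to 1.2.4 beta) have also been consuming CPU. Strangely, memcached sometimes starts consuming 100% while its stats show it is idle, yet it works fine when requests are made.
Here is a crash dump (or rather a stack trace dump) of w3wp, taken with WinDbg (based on this guide: http://blogs.technet.com/marcelofartura/archive/2006/09/15/troubleshooting-iis-100-cpu-issues-step-by-step-intermediary.aspx):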
0:000> ~
. 0 Id: 1be4.1d3c Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: 1be4.b1c Suspend: 1 Teb: 7ffde000 Unfrozen
2 Id: 1be4.12a0 Suspend: 1 Teb: 7ffdd000 Unfrozen
3 Id: 1be4.19d0 Suspend: 1 Teb: 7ffdc000 Unfrozen
4 Id: 1be4.1714 Suspend: 1 Teb: 7ffd7000 Unfrozen
5 Id: 1be4.1a18 Suspend: 1 Teb: 7ffd6000 Unfrozen
6 Id: 1be4.12ac Suspend: 1 Teb: 7ffd5000 Unfrozen
7 Id: 1be4.dec Suspend: 1 Teb: 7ffd4000 Unfrozen
8 Id: 1be4.1e48 Suspend: 1 Teb: 7ffd8000 Unfrozen
9 Id: 1be4.1ca8 Suspend: 1 Teb: 7ffd3000 Unfrozen
10 Id: 1be4.1508 Suspend: 1 Teb: 7ffaf000 Unfrozen
11 Id: 1be4.1bc0 Suspend: 1 Teb: 7ffae000 Unfrozen
12 Id: 1be4.1f48 Suspend: 1 Teb: 7ffad000 Unfrozen
13 Id: 1be4.1994 Suspend: 1 Teb: 7ffac000 Unfrozen
14 Id: 1be4.1a48 Suspend: 1 Teb: 7ffab000 Unfrozen
15 Id: 1be4.12c8 Suspend: 1 Teb: 7ffa8000 Unfrozen
16 Id: 1be4.e44 Suspend: 1 Teb: 7ffa7000 Unfrozen
17 Id: 1be4.19e0 Suspend: 1 Teb: 7ffa6000 Unfrozen
18 Id: 1be4.19b0 Suspend: 1 Teb: 7ffa2000 Unfrozen
19 Id: 1be4.1b30 Suspend: 1 Teb: 7ffd9000 Unfrozen
20 Id: 1be4.1bfc Suspend: 1 Teb: 7ffa3000 Unfrozen
21 Id: 1be4.1be8 Suspend: 1 Teb: 7ffa1000 Unfrozen
22 Id: 1be4.1a54 Suspend: 1 Teb: 7ffa5000 Unfrozen
23 Id: 1be4.b74 Suspend: 1 Teb: 7ff3d000 Unfrozen
24 Id: 1be4.19b4 Suspend: 1 Teb: 7ff3c000 Unfrozen
25 Id: 1be4.1460 Suspend: 1 Teb: 7ffdb000 Unfrozen
26 Id: 1be4.1eac Suspend: 1 Teb: 7ffaa000 Unfrozen
27 Id: 1be4.1b90 Suspend: 1 Teb: 7ffa4000 Unfrozen
0:023> #23s
Search address set to 77dc9a94
*** WARNING: Unable to verify checksum for SMDiagnostics.ni.dll
*** WARNING: Unable to verify checksum for System.Data.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for Microsoft.Web.Services3.DLL
*** WARNING: Unable to verify checksum for System.Windows.Forms.ni.dll
*** WARNING: Unable to verify checksum for System.Web.ni.dll
*** WARNING: Unable to verify checksum for Ademy.UI.Web.DLL
*** ERROR: Module load completed but symbols could not be loaded for AjaxControlToolkit.DLL
*** ERROR: Module load completed but symbols could not be loaded for 7zSharp.DLL
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for Iesi.Collections.DLL
*** WARNING: Unable to verify checksum for System.Design.ni.dll
*** WARNING: Unable to verify checksum for System.Core.ni.dll
*** WARNING: Unable to verify checksum for Ademy.Event.DLL
*** WARNING: Unable to verify checksum for System.ServiceModel.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for System.ServiceModel.ni.dll
*** WARNING: Unable to verify checksum for App_Theme_Ocean.wgubmrqt.dll
*** WARNING: Unable to verify checksum for NHibernate.Burrow.AppBlock.DLL
*** ERROR: Module load completed but symbols could not be loaded for NHibernate.Burrow.AppBlock.DLL
*** WARNING: Unable to verify checksum for NHibernate.Caches.SysCache2.DLL
*** ERROR: Module load completed but symbols could not be loaded for NHibernate.Caches.SysCache2.DLL
*** WARNING: Unable to verify checksum for Ademy.UI.Web.Controls.DLL
*** WARNING: Unable to verify checksum for Microsoft.JScript.ni.dll
*** WARNING: Unable to verify checksum for System.Web.Mobile.ni.dll
*** WARNING: Unable to verify checksum for System.Runtime.Serialization.ni.dll
^ Memory access error in '#23s'
0:023> kb
ChildEBP RetAddr Args to Child
11c6ede4 77dc8ed4 766bc622 0000038c 00000000 ntdll!KiFastSystemCallRet
11c6ede8 766bc622 0000038c 00000000 11c6ee20 ntdll!NtSetEvent+0xc
11c6edf8 011011ef 0000038c 7f52be6e 0fda4888 kernel32!SetEvent+0x10
WARNING: Frame IP not in any known module. Following frames may be wrong.
11c6ee20 71b26ffe 060c5f9c 010039b0 010628a0 0x11011ef
*** WARNING: Unable to verify checksum for System.ni.dll
11c6ee4c 712c4b14 02528958 060c5f9c 11c6ee94 mscorlib_ni+0x216ffe
11c6ee5c 712c4abe 060c5fb0 02528958 060c600c System_ni+0x144b14
11c6ee94 71679260 060c5d24 7167926d 060c5d24 System_ni+0x144abe
11c6eec8 717d8373 060c5d24 11c6f3e8 712c4ce4 System_ni+0x4f9260
11c6ef14 712c4ce4 00000000 02528930 11c6ef74 System_ni+0x658373
11c6ef54 7129dbcb 098b6ac4 11c6efec 72f7eff8 System_ni+0x144ce4
11c6efa4 71b26d66 02df349c 11c6efc0 71b45681 System_ni+0x11dbcb
11c6efb0 71b45681 00000000 0dcfd2d8 11c6efd0 mscorlib_ni+0x216d66
11c6efc0 72f11b4c 766b45f1 00000000 11c6f050 mscorlib_ni+0x235681
11c6efd0 72f221f9 11c6f0a0 00000000 11c6f070 mscorwks!CallDescrWorker+0x33
11c6f050 72f36571 11c6f0a0 00000000 11c6f070 mscorwks!CallDescrWorkerWithHandler+0xa3
11c6f194 72f365a4 71a91ff0 11c6f2c8 11c6f1e8 mscorwks!MethodDesc::CallDescr+0x19c
11c6f1b0 72f365c2 71a91ff0 11c6f2c8 11c6f1e8 mscorwks!MethodDesc::CallTargetWorker+0x1f
11c6f1c8 7302a471 11c6f1e8 68e9b644 0dcfd2d8 mscorwks!MethodDescCallSite::CallWithValueTypes+0x1a
11c6f394 7302a5c6 11c6f424 68e9b194 02df34e4 mscorwks!ExecuteCodeWithGuaranteedCleanupHelper+0x9f
11c6f444 71b45577 11c6f3e8 02df17d0 01c177f8 mscorwks!ReflectionInvocation::ExecuteCodeWithGuaranteedCleanup+0x10f
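Thanks in advance for any hints!
Update:
Here is the managed stack of the hung thread. I think it looks like the memcached provider, but I am not sure yet what I should do.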
0:023> !clrstack
OS Thread Id: 0xb74 (23)
ESP EIP
11c6ee38 77dc9a94 [NDirectMethodFrameStandaloneCleanup: 11c6ee38] Microsoft.Win32.Win32Native.SetEvent(Microsoft.Win32.SafeHandles.SafeWaitHandle)
11c6ee48 71b26ffe System.Threading.EventWaitHandle.Set()
11c6ee54 712c4b14 System.Net.TimerThread.Prod()
11c6ee64 712c4abe System.Net.TimerThread+TimerQueue.CreateTimer(Callback, System.Object)
11c6eea0 71679260 System.Net.ConnectionPool.CleanupCallbackWrapper(Timer, Int32, System.Object)
11c6eed4 717d8373 System.Net.TimerThread+TimerNode.Fire()
11c6ef1c 712c4ce4 System.Net.TimerThread+TimerQueue.Fire(Int32 ByRef)
11c6ef5c 7129dbcb System.Net.TimerThread.ThreadProc()
11c6efac 71b26d66 System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
11c6efb8 71b45681 System.Threading.ExecutionContext.runTryCode(System.Object)
11c6f3e8 72f11b4c [HelperMethodFrame_PROTECTOBJ: 11c6f3e8] System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
11c6f450 71b45577 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
11c6f46c 71b301c5 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
11c6f484 71b26ce4 System.Threading.ThreadHelper.ThreadStart()
11c6f6b0 72f11b4c [GCFrame: 11c6f6b0]
11c6f9a0 72f11b4c [ContextTransitionFrame: 11c6f9a0]
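In the listings above, thread 23 appears as `1be4.b74` in the `~` output and as `OS Thread Id: 0xb74 (23)` in `!clrstack`. Those IDs are hexadecimal, while tools such as Performance Monitor usually show thread IDs in decimal, so matching them by hand is easy to get wrong. Below is a minimal Python sketch for doing that lookup, assuming the `~` output has been pasted into a string. The function names `parse_thread_list` and `find_index_by_os_thread_id` are my own and are not part of WinDbg.

```python
import re

# Matches one line of WinDbg `~` output, for example:
#   " 23 Id: 1be4.b74 Suspend: 1 Teb: 7ff3d000 Unfrozen"
# An optional leading "." or "#" marks the current or event thread.
THREAD_LINE = re.compile(
    r"^\s*[.#]?\s*(?P<index>\d+)\s+Id:\s+"
    r"(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)\s"
)


def parse_thread_list(text):
    """Map debugger thread index -> (decimal process ID, decimal OS thread ID)."""
    threads = {}
    for line in text.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            threads[int(match.group("index"))] = (
                int(match.group("pid"), 16),
                int(match.group("tid"), 16),
            )
    return threads


def find_index_by_os_thread_id(threads, os_thread_id):
    """Return the debugger thread index for a decimal OS thread ID, or None."""
    for index, (_pid, tid) in threads.items():
        if tid == os_thread_id:
            return index
    return None


if __name__ == "__main__":
    # Two lines copied from the `~` output in this post.
    sample = (
        ". 0 Id: 1be4.1d3c Suspend: 1 Teb: 7ffdf000 Unfrozen\n"
        " 23 Id: 1be4.b74 Suspend: 1 Teb: 7ff3d000 Unfrozen\n"
    )
    threads = parse_thread_list(sample)
    print(threads[23])
    print(find_index_by_os_thread_id(threads, threads[23][1]))
```

With the decimal thread ID of the busy thread taken from another tool, `find_index_by_os_thread_id` gives the index to switch to in the debugger before running `kb` or `!clrstack`.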
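Solution found:
This was caused by a bug in memcached 1.2.1 for Win32 when running on Windows 2008. I updated to v1.2.6 and everything works fine. I think the CPU usage showed up in the w3wp process because the library I use to connect to memcached has a recycling process that was hanging, even though memcached was still responding.
Solution 2 found:
If the first solution does not work, read this article. I guess the memcached fix only hid the real problem, which is a bug in SmtpClient.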