在第二部分我们主要介绍针对各种内存问题的主要调试方式。在开始之前我还是推荐一个常规的调试步骤。
Part1我们提到过debugdiag工具以及如何通过其内建的rule来分析dump。
- 收集dump
- 通过DebugDiag选择Managed Memory Analysis规则来做一遍分析
- 根据DebugDiag分析报告通过windbg打开dump查看更多信息
大数据量datatable
首先尝试重现问题,写代码创建一张大表。
using System;
using System.Data;
public partial class _Default : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
CreateLargeDataTables();
Response.Write("Large Datatable, enjoy!");
}
void CreateLargeDataTables()
{
DataTable table = new DataTable();
for (int i = 0; i < 100; i++)
{
table.Columns.Add(string.Format("Column{0}", i));
}
for (int i = 0; i < 1000; i++)
{
DataRow row = table.NewRow();
for (int j = 0; j < 100; j++)
{
row[j] = string.Format("Column{0}Row{1}", i, j);
for (int k = 0; k < 100; k++)
{
row[j] += "StringConcatCauseHighMemoryAndHighCPU";
}
}
table.Rows.Add(row);
}
Application.Add("datatable", table);
}
}
将aspx网页部署到IIS,IE请求该网页。打开TaskManager观察内存使用情况。
内存使用量在持续增长,请求结束之后趋于平稳。
IIS Worker Process内存使用两805.8M。
我只发了一个请求。
可以肯定IIS WorkerProcess (w3wp.exe)中内存使用有些异常。直接抓一个full user dump (可以直接在TaskManage右键选择CreateDumpFile)。
通过Windbg打开Dump分析。
我从DebugDiag的安装目录C:\Program Files\DebugDiag\Exts\拷贝了psscor2到windbg的安装目录下,这样我就可以直接执行接下来的命令来加载psscor2这个强悍的托管代码调试扩展。
0:000>.load psscor2
用Address命令看一下内存总体使用情况,可以看到<unclassified>占用量最大,调试.NET程序大多数时候<unclassified>对应.NET的托管堆大小。所以我们目标明确了,问题的确出来托管堆里面。
0:000> !address -summary
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free 179 7ff`a2f97000 ( 7.999 Tb) 99.98%
<unclassified> 339 0`52ee1000 ( 1.296 Gb) 89.15% 0.02%
Image 641 0`087d4000 ( 135.828 Mb) 9.13% 0.00%
Heap 37 0`01137000 ( 17.215 Mb) 1.16% 0.00%
Stack 50 0`00808000 ( 8.031 Mb) 0.54% 0.00%
NlsTables 1 0`00033000 ( 204.000 kb) 0.01% 0.00%
TEB 17 0`00022000 ( 136.000 kb) 0.01% 0.00%
ActivationContextData 7 0`0000a000 ( 40.000 kb) 0.00% 0.00%
CsrSharedMemory 1 0`00005000 ( 20.000 kb) 0.00% 0.00%
PEB 1 0`00001000 ( 4.000 kb) 0.00% 0.00%
--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE 232 0`52357000 ( 1.285 Gb) 88.38% 0.02%
MEM_IMAGE 823 0`0a2a2000 ( 162.633 Mb) 10.93% 0.00%
MEM_MAPPED 37 0`00a52000 ( 10.320 Mb) 0.69% 0.00%
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE 181 7ff`a2fa5000 ( 7.999 Tb) 99.98%
MEM_COMMIT 868 0`3cfcf000 ( 975.809 Mb) 65.56% 0.01%
MEM_RESERVE 224 0`2007c000 ( 512.484 Mb) 34.43% 0.01%
--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE 250 0`32707000 ( 807.027 Mb) 54.22% 0.01%
PAGE_EXECUTE_READ 92 0`05af9000 ( 90.973 Mb) 6.11% 0.00%
PAGE_READONLY 352 0`04349000 ( 67.285 Mb) 4.52% 0.00%
PAGE_WRITECOPY 83 0`0090e000 ( 9.055 Mb) 0.61% 0.00%
PAGE_EXECUTE_READWRITE 49 0`000cc000 ( 816.000 kb) 0.05% 0.00%
PAGE_EXECUTE_WRITECOPY 23 0`00064000 ( 400.000 kb) 0.03% 0.00%
PAGE_READWRITE|PAGE_GUARD 17 0`00045000 ( 276.000 kb) 0.02% 0.00%
PAGE_EXECUTE 2 0`00003000 ( 12.000 kb) 0.00% 0.00%
--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free 58`64f06000 79e`93dca000 ( 7.619 Tb)
<unclassified> 58`10fb0000 0`319e1000 ( 793.879 Mb)
Image 7fe`99dc0000 0`00c41000 ( 12.254 Mb)
Heap 58`63f41000 0`0033f000 ( 3.246 Mb)
Stack 58`0fe30000 0`0007b000 ( 492.000 kb)
NlsTables 7f6`f8e90000 0`00033000 ( 204.000 kb)
TEB 7f6`f8d74000 0`00002000 ( 8.000 kb)
ActivationContextData 58`0f760000 0`00004000 ( 16.000 kb)
CsrSharedMemory 7f6`f8d90000 0`00005000 ( 20.000 kb)
PEB 7f6`f8ec8000 0`00001000 ( 4.000 kb)
接下来观察托管堆的使用情况,
占用内存最多的事System.String?不过紧接着我们看到了System.Data.DataRow也在排行榜上。还是选择从DataRow入手,因为从DataRow入手容易操作。
0:000> !dumpheap -stat
Loading the heap objects into our cache.
------------------------------
Heap 0
total 139,823 objects
------------------------------
total 139,823 objects
Statistics:
MT Count TotalSize Class Name
……
0x000007fea3f62040 899 64,728 System.SByte[]
0x000007fea601c4e0 1,000 96,000 System.Data.DataRow
0x000007fea3f6f740 451 186,960 System.Collections.Hashtable+bucket[]
0x000007fea3f55a10 1,625 1,121,864 System.Object[]
0x000000580fa4ccf0 16,907 82,823,784 Free
0x000007fea3f67c20 105,196 746,280,920 System.String
Total 139,823 objects, Total size: 831,575,648
DataRow应该包含在DataTable里面,psscor2调试扩展上面!dumpdatatables命令就是为了查看datatable用的。
原来内存中有个100 * 1000的表格。
0:000> !dumpdatatables
Loading the heap objects into our cache.
DataTable Rows Columns DataSet nextRowID ColumnCount
-----------------------------------------------------------------------------------------------
0x581133df58 0x58110e66d8 0x58110e6598 0 1,001 100
Total 1 DataTable objects
我们来看看它真实的大小(算上实例里面包含的所有数据)-- 746,620,976 bytes,果然是它惹的祸。那不用说700MB的System.String也是拜它所赐了。
0:000> !objsize 0x581133df58
sizeof(000000581133df58) = 746,620,976 ( 0x2c808830) bytes(System.Data.DataTable)
这个出问题的DataTable具体是哪个呢?如果我们能够把它列名找出来,开发人员就应该明白是哪张表了。恰好psscor2有么个命令。
0:000> !DumpColumnNames 0x581133df58
Column Names for DataTable: 0x581133df58
Column0, Column1, Column2, Column3, Column4, Column5, Column6, Column7, Column8, Column9, Column10, Column11, Column12, Column13, Column14, Column15, Column16, Column17, Column18, Column19, Column20, Column21, Column22, Column23, Column24, Column25, Column26, Column27, Column28, Column29, Column30, Column31, Column32, Column33, Column34, Column35, Column36, Column37, Column38, Column39, Column40, Column41, Column42, Column43, Column44, Column45, Column46, Column47, Column48, Column49, Column50, Column51, Column52, Column53, Column54, Column55, Column56, Column57, Column58, Column59, Column60, Column61, Column62, Column63, Column64, Column65, Column66, Column67, Column68, Column69, Column70, Column71, Column72, Column73, Column74, Column75, Column76, Column77, Column78, Column79, Column80, Column81, Column82, Column83, Column84, Column85, Column86, Column87, Column88, Column89, Column90, Column91, Column92, Column93, Column94, Column95, Column96, Column97, Column98, Column99
Session或者Application State存储过多数据
我们已经发现了造成问题的大数据表是哪张表了,接下来我们不仅要查看这张表中为什么这么多数据,同时也可以查看一下所有的正在处理的请求,
可以看到请求已经返回了,线程号为xxx。
0:020> !aspxpages
Going to dump the HttpContexts found in the heap.
Loading the heap objects into our cache.
HttpContextTimeout Completed Running ThreadId ReturnCode Verb RequestPath+QueryString
0x0000001c31895728 19200 Sec no 47 Sec XXX 200 GET /HighMemory_DataTable.aspx
Total 1 HttpContext objects
观察一下这个datatable对象处在第二代
0:020> !do 0x581133df58
Name: System.Data.DataTable
MethodTable: 000007fe99dcb078
EEClass: 000007fe99c295c0
Size: 512(0x200) bytes
GC Generation: 2
为了确保该对象可以被成功的回收我们要确定没有对象对该对象保持引用。其中weakln可以忽略因为他不会最终影响GC对它的回收,但是我们看到存在strong/pinned的引用,这样GC就不会认为这是垃圾而进行回收了。在该引用路径上我们最终看到System.Web.HttpApplicationState,说明这个datatable被放在了applicationstate里。
0:020> !gcroot 0x581133df58
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 4 OSThread 874
Scan Thread 14 OSThread 1288
Scan Thread 16 OSThread 169c
Scan Thread 17 OSThread 590
Scan Thread 18 OSThread 1504
Scan Thread 11 OSThread 141c
Scan Thread 1 OSThread 1344
Scan Thread 9 OSThread 134c
Scan Thread 15 OSThread 14d4
DOMAIN(0000001C5C927360):HANDLE(WeakLn):a11d0:Root: 0000001c0005b340(System.Web.NativeFileChangeNotification)->
0000001c0005b308(System.Web.DirMonCompletion)->
0000001c0005af80(System.Web.DirectoryMonitor)->
0000001c0005afc0(System.Collections.Hashtable)->
0000001c000be808(System.Collections.Hashtable+bucket[])->
0000001c0008e160(System.Web.FileMonitor)->
0000001c0008e1b8(System.Collections.Specialized.HybridDictionary)->
0000001c0008e238(System.Collections.Specialized.ListDictionary)->
0000001c0008e268(System.Collections.Specialized.ListDictionary+DictionaryNode)->
0000001c0008c9b8(System.Web.HttpApplicationFactory)->
0000001c0008cc30(System.Web.HttpApplicationState)->
0000001c0008cd00(System.Collections.Hashtable)->
0000001c0008cd58(System.Collections.Hashtable+bucket[])->
0000001c31886ab0(System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry)->
0x581133df58 (System.Data.DataTable)
DOMAIN(0000001C5C927360):HANDLE(Strong):a1328:Root: 0000001c00032b30(System.Web.NativeFileChangeNotification)->
0000001c00032af8(System.Web.DirMonCompletion)->
0000001c00032778(System.Web.DirectoryMonitor)->
0000001c000327b8(System.Collections.Hashtable)->
0000001c00032810(System.Collections.Hashtable+bucket[])->
0000001c00032c08(System.Web.FileMonitor)->
0000001c00032c60(System.Collections.Specialized.HybridDictionary)->
0000001c00032ce0(System.Collections.Specialized.ListDictionary)->
0000001c00032d10(System.Collections.Specialized.ListDictionary+DictionaryNode)->
0000001c00030948(System.Web.FileChangesMonitor)->
0000001c00030ad8(System.Collections.Hashtable)->
0000001c00030b30(System.Collections.Hashtable+bucket[])->
0000001c0005af80(System.Web.DirectoryMonitor)->
0000001c0005afc0(System.Collections.Hashtable)->
0000001c000be808(System.Collections.Hashtable+bucket[])->
0000001c0008e160(System.Web.FileMonitor)->
0000001c0008e1b8(System.Collections.Specialized.HybridDictionary)->
0000001c0008e238(System.Collections.Specialized.ListDictionary)->
0000001c0008e268(System.Collections.Specialized.ListDictionary+DictionaryNode)->
0000001c0008c9b8(System.Web.HttpApplicationFactory)->
0000001c0008cc30(System.Web.HttpApplicationState)->
0000001c0008cd00(System.Collections.Hashtable)->
0000001c0008cd58(System.Collections.Hashtable+bucket[])->
0000001c31886ab0(System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry)->
0x581133df58 (System.Data.DataTable)
DOMAIN(0000001C5C927360):HANDLE(Pinned):a15f0:Root: 0000001c4000c908(System.Object[])->
0000001c0008c9b8(System.Web.HttpApplicationFactory)->
0000001c0008cc30(System.Web.HttpApplicationState)->
0000001c0008cd00(System.Collections.Hashtable)->
0000001c0008cd58(System.Collections.Hashtable+bucket[])->
0000001c31886ab0(System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry)->
0x581133df58 (System.Data.DataTable)
回顾源代码发现的确有这么一行代码将datatable存放在了application中。
void CreateLargeDataTables()
{
DataTable table = new DataTable();
for (int i = 0; i < 100; i++)
{
table.Columns.Add(string.Format("Column{0}", i));
}
for (int i = 0; i < 1000; i++)
{
DataRow row = table.NewRow();
for (int j = 0; j < 100; j++)
{
row[j] = string.Format("Column{0}Row{1}", i, j);
for (int k = 0; k < 100; k++)
{
row[j] += "StringConcatCauseHighMemoryAndHighCPU";
}
}
table.Rows.Add(row);
}
Application.Add("datatable", table);
}
Debug模式下的应用程序和库文件(DLL)
测试代码
using System;
public partial class HighMemory_Exceptions : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
GenerateExceptions();
}
void GenerateExceptions()
{
for (int i = 0; i < 1000; i++)
{
try
{
int result = i / (i - i);
}
catch
{
}
}
}
}
这里涉及到两种debug模式,第一是httpruntime的debug模式,该模式通过web.config配置控制,第二是assembly的debug模式,这个是编译这个assembly的时候选择的模式决定的。我们可以通过以下命令来查看,
同样加载psscor2调试扩展
0:019> .load psscor2接下来通过finddebugtrue命令来查看是否存在httpruntime的debug模式为true
0:019> !finddebugtrue
Loading the heap objects into our cache.
Debug set to true for Runtime: c2362920, AppDomain: C:\inetpub\webdebug\
Total 2 HttpRuntime objects
同样也可以通过!dumphttpruntime来查看
0:019> !dumphttpruntime
Going to dump the HttpRuntimes found in the heap.
Loading the heap objects into our cache.
HttpRuntime 0x000000a2c2362920:
_shutdownInProgress: 0
_requestQueue:
0x000000a2c24b7e08
_appDomainAppPath: C:\inetpub\webdebug\
_appDomainAppId: /LM/W3SVC/2/ROOT
_fcm:
…
…
_isOnUNCShare: 0
_debuggingEnabled: 1
-----------------
接下来通过finddebugmodules命令查找debug模式下编译出来的assembly。
0:019> !finddebugmodules
Loading all modules.
Searching for modules built in debug mode...
App_Web_1tu1wmur.dll not built release
Done Seaching
Part1里面我们详细讲过对于debug模式在生产环境中的危害,这里只是演示所以只有一个模块存在问题,而在真正的生产环境中如果没有注意的话,通过该命令去查看一下往往会有惊人的发现。
大量的异常抛出
检测异常信息可以通过内建的命令DumpAllExceptions(dae)来查看。不要惊讶看到内存里面有三个特殊的异常,在调试outofmemory中提到过这个异常时CLR启动就会创建起来的,如果他们的数字不为1的时候我们才需要注意。
0:019> !dae
Going to dump the .NET Exceptions found in the heap.
Loading the heap objects into our cache.
Number of exceptions of this type: 1
Exception MethodTable: 000007fe9c178218
Exception object: 000000a2c23201c0
Exception type: System.ExecutionEngineException
Message: <none>
InnerException: <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80131506
The current thread is unmanaged
-----------------
Number of exceptions of this type: 1
Exception MethodTable: 000007fe9c178108
Exception object: 000000a2c2320138
Exception type: System.StackOverflowException
Message: <none>
InnerException: <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 800703e9
The current thread is unmanaged
-----------------
Number of exceptions of this type: 1
Exception MethodTable: 000007fe9c177ff8
Exception object: 000000a2c23200b0
Exception type: System.OutOfMemoryException
Message: <none>
InnerException: <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 8007000e
The current thread is unmanaged
-----------------
Number of exceptions of this type: 1000
Exception MethodTable: 000007fe9c948e38
Exception object: 000000a2c2530b50
Exception type: System.DivideByZeroException
Message: Attempted to divide by zero.
InnerException: <none>
StackTrace (generated):
SP IP Function
000000A312DFE1B0 000007FE3D6709A6 HighMemory_Exceptions.GenerateExceptions()
StackTraceString: <none>
HResult: 80020012
The current thread is unmanaged
-----------------
Total 1,005 exceptions
另外也可以通过以下命令来查看异常的统计信息以及相应的大小,异常相对来是是一种相对昂贵的机制,频繁的异常发生不仅消耗内存同时消耗消耗CPU,所以如果可以通过修改代码避免掉异常发生,那就不要犹豫。
0:019> !dumpheap -stat -type *Exception
Loading the heap objects into our cache.
------------------------------
Heap 0
total 1,005 objects
------------------------------
total 1,005 objects
Statistics:
MT Count TotalSize Change Class Name
0x000007fe9c178218 1 136 0 System.ExecutionEngineException
0x000007fe9c178108 1 136 0 System.StackOverflowException
0x000007fe9c177ff8 1 136 0 System.OutOfMemoryException
0x000007fe9c178328 2 272 0 System.Threading.ThreadAbortException
0x000007fe9c948e38 1,000 136,000 0 System.DivideByZeroException
Total 1,005 objects, Total size: 136,680