在windows envent log 中得到很多相关的信息(不是error):
Process ID 59:0 owns resources that are blocking processes on Scheduler 2
Process ID 5:26 owns resources that are blocking processes on Scheduler 5
Process ID 100:106 owns resources that are blocking processes on Scheduler 11
在SQL Server 2000 log中看到的错误信息有:
Error: 17884, Severity: 1, State: 0 Potential deadlocks exist on all the schedulers.
Process ID 59:0 owns resources that are blocking processes on Scheduler 2
Process ID 5:26 owns resources that are blocking processes on Scheduler 5
Process ID 100:106 owns resources that are blocking processes on Scheduler 11
当不停报这种信息的时候,数据库就会挂起,登入不了 。
查看process 发现由一个程序对应很多个一样的spid (大概有40多个相同的spid), wait type 是 LATCH_EX , wait resource 是 PARALLEL_PAGE_SUPPLIER(42E3B694) , 每个占用的I/O及Mem一样 。
初步怀疑是并行查询导致的问题 。 需要查看这个程序对应的可疑的procedure(太多代码,正在查看中) 。
sp_configure查看了 'max degree of parallelism' 设置的是 32 ,0 ,0 , 0表示有多少CPU资源就用多少,这里更改为 1 ,是它取消生成并行计划 , 命令如下:
sp_configure 'show advanced options', 1
RECONFIGURE
GO
sp_configure 'max degree of parallelism', 1
RECONFIGURE
GO
sp_configure 'allow updates', 1
RECONFIGURE WITH OVERRIDE
GO
最后再運行
sp_configure 'allow updates', 0
RECONFIGURE WITH OVERRIDE
GO
没有再出现刚才那种现象 (当然不清楚是不是设置生效了,需要观察) 。
其他分析 :
http://sqlforums.windowsitpro.com/web/forum/messageview.aspx?catid=74&threadid=85015&enterthread=y
need to run profiler for the duration of the period you think you will get this problem, monitoring the following
SQLBatch Completed:
RPC Completed:
Include the columns read/write/readcycles/duration/starttime/endtime/hostname/textdata/spid
the next time it happens note the time obtained from the errorlog and the spid as per the message,
then go into the profiler file and filter by the spid and the starttime, this will give you the process that spawned multiple threads in parallel thus causing the blocking on the the UMS.
If the problem is caused by a query plan using a parallel plan with and there are bad statistics then update the stats with fullscan/100percent sample(note that this issues a shared lock on the table and can take a long time to complete) or U can re-index(which will rebuild the stats) the table or force the specific query to use the query hint option(maxdop 1) .
if the problem is caused by a non-yielding log on authentication request then there is a fix in 8.00.2174 (assuming it is a sql 2000 server)
================================================================================
在企业管理器中看到的多个并行查询的等待信息 : PARALLEL_PAGE_SUPPLIER
这些 Latch_XX 闩锁用于同步为并行查询的页面的检索。 此闩锁指示查询并行运行的并且可能是典型。
如果 waittime 列是一个较大的值,并且查询欠佳,请调整查询。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/35489/viewspace-608758/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/35489/viewspace-608758/
本文针对SQL Server 2000在并行查询过程中遇到的死锁问题进行了详细分析。通过修改'maxdegreeofparallelism'配置参数为1来避免生成并行查询计划,同时提供了使用SQL Profiler定位问题的具体步骤。此外还提出了更新统计信息或重建索引等解决方案。
5938

被折叠的 条评论
为什么被折叠?



