Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications

本文概述了SQL Server 2005数据仓库和报表应用程序中常见的十大性能问题,并提供了详细的解决方案,包括如何通过索引、内存管理和磁盘I/O等手段避免性能瓶颈。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

SQL Server 2005 Performance Troubleshooting Resources (http://blogs.technet.com/b/sqlman/archive/2008/12/13/sql-server-2005-performance-troubleshooting-resources.aspx)

 

Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications

http://technet.microsoft.com/en-us/library/cc917690.aspx

 

Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications
Published: November 6, 2006

Relational Data Warehouse or Reporting work loads are characterized by low volumes of very large transactions. These applications are often identified as having predominately read workloads (e.g. Decision Support, Analysis, and Reporting) with periodic feeds or batch loads. It is important to keep these characteristics in mind as we examine the significance of database design, resource utilization and system performance. The top performance bottlenecks or gotchas to avoid for Data Warehouse or Reporting applications are outlined below.

1 Database Design issue if….

  • Excessive sorting operations are performed. If you continually perform the same sorting operations over and over, you can avoid these with appropriate indexing.

  • Excessive RID lookups are performed on heap tables. RID lookups mean extra IOs are required to retrieve columns that are not in the index used. This can be avoided with covered nonclustered indexes.

  • Key lookups against the clustering keys look like joins however they are marked as “lookups” only in the XML showplan. These can be avoided with covered nonclustered indexes.

  • A potentially beneficial index is missing on join columns resulting in HASH joins. Indexes on join columns may avoid the hash.

2 CPU gotchas….

  • If signal waits > 25% of total waits, there is a CPU bottleneck. See sys.dm_os_wait_stats for Signal waits and Total waits. Signal waits measure the time spent in the runnable queue waiting for CPU. High signal waits indicate a CPU bottleneck.

  • Avoid inappropriate plan re-use. If the query is identical, then plan re-use is a good thing. However, query parameterization that allows plan re-use is only appropriate when the result set (and intermediate work tables) are of similar size to the original plan. If result set sizes vary significantly due to differing parameter values which are common in data warehouse scenarios, plan re-use can be detrimental. Bad plans can also lead to longer running queries and IO or memory pressure. Therefore, the cost of plan generation in such cases is preferable to plan re-use. Unlike OLTP, data warehouse queries are not always identical in terms of result sets or optimal query plans.

3 Memory bottleneck if….

  • Sudden big drop in page life expectancy. DW applications (e.g. big transactions) could experience big drops in page life expectancy. This is due to a cache flush from a big read. See Perfmon object SQL Server Buffer Manager.

  • Pending memory grants. See counter Memory Grants Pending, in the Perfmon object SQL Server Memory Manager. Large memory grants can be common in Data Warehouse applications. More memory may help, otherwise the user cannot execute until memory grant occurs.

  • Sudden drops or consistently low SQL Cache hit ratio. Drops or low cache hit may indicate memory pressure or missing indexes.

4 IO bottleneck if…

  • The best metric for write performance is disk seconds per read and disk seconds per write. When the IO system is NOT under significant load, there will be no disk queuing and thus disk seconds per read or write should be as good as it gets Normally it takes 4-8 milliseconds to complete a read when there is no IO pressure. Factors for IO throughput are the number of spindles, and drive throughput such as sequential and random IOs per second (according to the vendor). As the IO requests increase, you may notice disk queuing. The effects of queuing are reflected in high disk seconds per read or write. Periodic higher values for disk seconds/read may be acceptable for many applications. For high performance OLTP applications, sophisticated SAN subsystems provide greater IO scalability and resiliency in handling spikes of IO activity. Sustained high values for disk seconds/read (>15ms) does indicate a disk bottleneck.

  • High average disk seconds per write. See Perfmon Logical or Physical disk. Data Warehouse loads can be either logged with inserts, updates or deletes, or non-logged using bulk copy. Logged operations require transaction log writes. A transaction log write can be as fast as 1ms (or less) for high performance SAN environments. For many applications, a periodic spike in average disk seconds per write is acceptable considering the high cost of sophisticated SAN subsystems. However, sustained high values for average disk seconds/write is a reliable indicator of a disk bottleneck.

  • Big IOs such as table and range scans may be due to missing indexes.

5 Blocking bottleneck if….

  • Index contention. Look for high lock and latch waits in sys.dm_db_index_operational_stats. Compare with lock and latch requests.

  • High average row lock or latch waits. The average row lock or latch waits are computed by dividing lock and latch wait milliseconds (ms) by lock and latch waits. The average lock wait ms computed from sys.dm_db_index_operational_stats represents the average time for each block.

  • Block process report shows long blocks. See sp_configure “blocked process threshold” and Profiler “Blocked process Report” under the Errors and Warnings event.

  • High number of deadlocks. See Profiler “Graphical Deadlock” under Locks event to identify the statements involved in the deadlock.

6 Network bottleneck if….

  • High network latency coupled with an application that incurs many round trips to the database.

  • Network bandwidth is used up. See counters packets/sec and current bandwidth counters in the network interface object of Performance Monitor. For TCP/IP frames actual bandwidth is computed as packets/sec * 1500 * 8 /1000000 Mbps.

7 Wait statistics gotchas...

  • Since DataWarehouse and Reporting workloads are largely reads which are compatible with other reads, incompatible exclusive lock waits would ordinarily only come into play during batch loads or periodic feeds. If the top wait statistics are LCK_x. or PAGELATCH_EX, see “SQL Server 2005 Performance Tuning using Waits & Queues” for an explanation of sys.dm_os_wait_stats.

  • There is an IO bottleneck if top wait statistics in sys.dm_os_wait_stats are related to IO such as ASYNCH_IO_COMPLETION, IO_COMPLETION, LOGMGR, WRITELOG, or PAGEIOLATCH_x.

8 Indexing gotchas.

  • Large data warehouse can benefit from more indexes. Indexes can be used to cover queries and avoid sorting. For a data warehouse application, the cost of index overhead is only paid when data is loaded.

  • Check for missing indexes in sys.dm_db_missing_index_group_stats, sys.dm_db_missing_index_groups and sys.dm_db_missing_index_details

9 Watch out for fragmentation.

  • Excessive fragmentation is problematic for big IO operations. The Dynamic Management table valued function sys.dm_db_index_physical_stats returns the fragmentation percentage in the column avg_fragmentation_in_percent. Fragmentation should not exceed 25%. Reducing index fragmentation can benefit big range scans, common in data warehouse and Reporting scenarios

10 Consider Table Partitioning for fast loads

  • For the large tables common in Data Warehouses, table partitioning offers important performance and manageability advantages. For example, the fastest type of load is a non-logged bulk copy. The requirements for non-logged bulk copies are that indexes must be dropped. This is not feasible on a huge billion row table UNLESS you use table partitioning. This allows one to create a staging table identical to the large table (minus indexes). A fast non-logged bulk copy is used to load data. Thereafter, indexes are added to the staging table followed by constraints. Then, a meta-data only SWITCH IN operation switches pointer locations for the populated staging table and the empty target partition of the partitioned table resulting in an fully populated partition and empty staging table. Besides a fast bulk copy, the staging table allows us to eliminate blocking in the large partitioned table during the load. For more information refer to “Loading Bulk Data into Partitioned Tables”. In addition to fast loads, partitioned tables allow fast deletes (or archiving purposes or sliding window deletes) where large logged deletes are replaced with meta-data only partition SWITCH OUT operations that switches pointer locations for the full partition (to be ‘deleted’) and an empty monolithic table. The SWITCH OUT results in an empty partition and a fully populated monolithic staging table. Thereafter the monolithic table can either be dropped or added to a partitioned archive table using SWITCH IN. Partitions also provide manageability improvements when combined with specific filegroup placement, allowing for customized backup and restore strategies.

In contrast to OLTP, a data warehouse or Reporting application is characterized by a small number of large SELECT transactions, each very different. The implications are significant for database design, resource usage, and system performance. These distinctions result in very different objectives and resource utilization profiles.

内容概要:该论文聚焦于6G通信中20-100GHz频段的电磁场(EMF)暴露评估问题,提出了一种基于自适应可重构架构神经网络(RAWA-NN)的预测框架。该框架通过集成权重分析模块和优化模块,能够自动优化网络超参数,显著减少训练时间。模型使用70%的前臂数据进行训练,其余数据用于测试,并用腹部和股四头肌数据验证模型泛化能力。结果显示,该模型在不同参数下的相对差异(RD)在前臂低于2.6%,其他身体部位低于9.5%,可有效预测皮肤表面的温升和吸收功率密度(APD)。此外,论文还提供了详细的代码实现,涵盖数据预处理、权重分析模块、自适应优化模块、RAWA-NN模型构建及训练评估等内容。 适合人群:从事电磁兼容性研究、6G通信技术研发以及对神经网络优化感兴趣的科研人员和工程师。 使用场景及目标:①研究6G通信中高频段电磁暴露对人体的影响;②开发更高效的电磁暴露评估工具;③优化神经网络架构以提高模型训练效率和预测精度。 其他说明:论文不仅提出了理论框架,还提供了完整的代码实现,方便读者复现实验结果。此外,论文还讨论了未来的研究方向,包括扩展到更高频段(如300GHz)的数据处理、引入强化学习优化超参数、以及实现多物理场耦合的智能电磁暴露评估系统。建议读者在实际应用中根据具体需求调整模型架构和参数,并结合真实数据进行验证。
内容概要:本文是北京金融科技产业联盟发布的《基于数据空间的金融数据可信流通研究报告》,探讨了金融数据可信流通的现状、挑战和发展前景。文章首先介绍了金融数据在数字化转型中的重要性及其面临的隐私保护和安全挑战。接着,文章详细阐述了数据空间的概念及其发展历程,尤其是可信数据空间(TDM)在我国的发展情况。文中还深入分析了金融数据可信流通的典型应用场景、关键技术和方案架构,如数据访问控制、数据使用控制、智能合约、数据脱敏等。最后,文章展示了多个典型场景应用案例,如中信银行总分行数据流通管控、工银金租数据流通、银联安全生物特征支付等,并总结了当前可信数据空间建设中存在的法规、技术、标准和商业模式挑战,提出了相应的政策建议。 适用人群:金融行业从业者、数据安全管理人员、政策制定者、科技研发人员等。 使用场景及目标:①理解金融数据可信流通的重要性和挑战;②学习可信数据空间的关键技术和应用场景;③探索金融数据可信流通的具体实践案例;④了解当前可信数据空间建设的瓶颈和未来发展方向。 其他说明:本文不仅提供了详尽的技术和应用分析,还提出了具体的政策建议,有助于推动金融数据可信流通的健康发展。阅读本文可以帮助读者深入了解金融数据安全保护和高效利用的最佳实践,为相关政策和技术的发展提供参考。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值