SQL Server 2008 Performance Troubleshooting (1): Overview

Posted by 發糞塗牆 on 2012-07-05


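Note: I spent a lot of my after-work time on this translation, and none of it is copied. Reposting is allowed, but please credit the source. The paper is too long to cover in a single blog post, and I cannot translate all of it quickly, so I am publishing it chapter by chapter. My ability is limited, so the translation will certainly contain mistakes. To avoid misleading anyone, I append the original text at the end of each post for reference, and I hope readers will point out my errors so that I can improve. Thank you.

Also, this article is written for database developers and DBAs who already have some experience; beginners may find it hard to follow. My apologies in advance.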

Authors: Sunil Agarwal, Boris Baryshnikov, Keith Elmore, Juergen Thomas, Kun Cheng, Burzin Patel

Technical reviewers: Jerome Halmans, Fabricio Voznika, George Reynya

Published: March 2009

Applies to: SQL Server 2008


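Summary:

Sometimes a poorly designed database, or a system that is improperly configured for its workload, can cause SQL Server to slow down. DBAs need to proactively prevent or minimize problems and, when problems do occur, diagnose the cause and take corrective action. This paper provides step-by-step guidelines for diagnosing and troubleshooting common performance problems by using publicly available tools such as SQL Server Profiler, Performance Monitor, dynamic management views (DMVs), and SQL Server Extended Events and the data collector, both of which are new in SQL Server 2008.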

 

Copyright: omitted from the translation; please simply respect other people's work.


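Introduction:

It is not uncommon for SQL Server to slow down occasionally. The causes range from a poorly designed database to a system that is improperly configured for its workload. As a DBA, you want to proactively prevent or minimize problems and, when they occur, diagnose the cause and take corrective action. This white paper shows how to diagnose and troubleshoot common performance problems with tools such as SQL Server Profiler, Performance Monitor, DMVs, SQL Server Extended Events, and the data collector. Its scope is limited to the problems customers report most often, because an exhaustive analysis of every possible problem is not feasible.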

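Goals:

The primary goal of this paper is to provide a general methodology for diagnosing and troubleshooting performance problems, mainly with publicly available tools. SQL Server 2008 has made great strides in supportability. New dynamic management views (DMVs) have been added, such as sys.dm_os_memory_brokers, sys.dm_os_memory_nodes, and sys.dm_exec_procedure_stats. Existing DMVs (introduced in SQL Server 2005), such as sys.dm_os_sys_info and sys.dm_exec_requests, now expose additional information. You can use DMVs together with existing tools such as SQL Server Profiler and Performance Monitor to collect performance-related data for analysis.
The secondary goal is to introduce the new troubleshooting tools and features in SQL Server 2008, including Extended Events and the data collector.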


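Methodology:

SQL Server can slow down for many reasons. This paper starts diagnosis from the following three key symptoms:

  • Resource bottlenecks: CPU, memory, and I/O bottlenecks are covered in this paper. Network issues are not considered. For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging, which ultimately hurts performance.
  • tempdb bottlenecks: Each SQL Server instance has only one tempdb, shared by every database on the instance, so it can become both a performance bottleneck and a disk space bottleneck. An application can overload tempdb through excessive DDL or DML operations and by taking too much space. This can cause unrelated applications running on the same server to slow down or even fail.
  • A slow-running user query: The performance of an existing query might regress, or a new query might take longer than expected. Typical causes include:
  1. Changes in statistical information can lead the optimizer to choose a poor execution plan for an existing query.
  2. Missing indexes can force table scans and slow the query down.
  3. An application can slow down because of blocking even when resource utilization looks normal.
  4. Poor application or schema design, or an inappropriate transaction isolation level, can cause excessive blocking.
The causes of these symptoms are not necessarily independent of each other. A poor execution plan can tax system resources and slow down the workload as a whole. So if a large table is missing a useful index, or if the query optimizer decides not to use it, the query can become very slow. These conditions also put heavy pressure on the I/O subsystem, which has to read unnecessary data pages, and on memory (the buffer pool), which has to cache those pages. Similarly, excessive recompilation of a frequently run query can put pressure on the CPU.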
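To make the missing-index point above concrete, here is a rough back-of-the-envelope sketch in Python. It is not from the white paper. It assumes 8 KB data pages (the SQL Server page size) and ignores page headers, fill factor, and caching; the function names, the 200-byte row size, and the 16-byte index key are hypothetical values chosen only for illustration.

```python
import math

PAGE_BYTES = 8192  # SQL Server data pages are 8 KB


def pages_for_scan(row_count, row_bytes):
    """Approximate pages read by a full table scan (no header or fill-factor overhead)."""
    rows_per_page = max(1, PAGE_BYTES // row_bytes)
    return math.ceil(row_count / rows_per_page)


def pages_for_seek(row_count, key_bytes):
    """Approximate pages read by a single-row index seek:
    one page per B-tree level, plus one lookup into the table."""
    fanout = max(2, PAGE_BYTES // key_bytes)
    levels = 1
    capacity = fanout
    while capacity < row_count:
        capacity *= fanout
        levels += 1
    return levels + 1


# Hypothetical table: 10 million rows of about 200 bytes, 16-byte index key.
scan_pages = pages_for_scan(10_000_000, 200)
seek_pages = pages_for_seek(10_000_000, 16)
print(scan_pages, seek_pages)
```

Under these assumptions the scan touches hundreds of thousands of pages while the seek touches a handful, which is why one missing index can show up as both an I/O problem and a buffer pool problem.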
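
New Performance Tools in SQL Server 2008:

SQL Server 2008 introduced new tools and features that help you monitor and troubleshoot performance problems. This paper discusses two of them: Extended Events and the data collector.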

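Resource Bottlenecks:

The next sections discuss CPU, memory, and I/O subsystem resources and how they can become bottlenecks. (Network issues are outside the scope of this paper.) For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging, which ultimately hurts performance.
Before you can determine whether you have a resource bottleneck, you need to know how resources are used under normal circumstances. You can use the methods described in this paper to collect a performance baseline, that is, performance data captured at a time when you are not having performance problems.
You might find that a resource is running near capacity and that SQL Server cannot support the workload in its current configuration. To address this, you may need to add more processing power or memory, or increase the bandwidth of your I/O or network channel. Before you take that step, however, it is useful to understand the common causes of resource bottlenecks. Some solutions, such as reconfiguration, do not require adding resources at all.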
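The baseline idea in this section can be sketched in a few lines of Python. This is only an illustration, not a method from the white paper. The counter names and sample values are made up, the three-standard-deviation rule is an arbitrary assumption, and the check only flags counters that rise above the baseline (some real counters indicate trouble when they fall instead).

```python
from statistics import mean, stdev


def flag_deviations(baseline, current, sigmas=3.0):
    """Return counters whose current value exceeds baseline mean + sigmas * stdev."""
    flagged = {}
    for counter, samples in baseline.items():
        # Skip counters with no current reading or too few baseline samples.
        if counter not in current or len(samples) < 2:
            continue
        limit = mean(samples) + sigmas * stdev(samples)
        if current[counter] > limit:
            flagged[counter] = (current[counter], round(limit, 2))
    return flagged


# Hypothetical samples collected while the system was healthy.
baseline = {
    "cpu_percent": [35, 40, 38, 42, 37],
    "disk_read_ms": [4, 5, 6, 5, 4],
}
# Hypothetical readings taken during a reported slowdown.
current = {"cpu_percent": 41, "disk_read_ms": 30}

print(flag_deviations(baseline, current))
```

With these made-up numbers only the disk read latency is flagged, while CPU stays within its normal range. Without the baseline, a reading like 41 percent CPU would be hard to judge either way.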


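Tools for Resolving Resource Bottlenecks:

One or more of the following tools can be used to resolve a particular resource bottleneck:
  • Performance Monitor: Available as part of the Windows operating system. For more information, see your Windows documentation.
  • SQL Server Profiler: Found in the Performance Tools group of the SQL Server 2008 program group. For more information, see SQL Server 2008 Books Online.
  • DBCC commands: For more information, see Appendix A and SQL Server 2008 Books Online.
  • DMVs: For more information, see SQL Server 2008 Books Online.
  • Extended Events: For more information, see the Extended Events section later in the paper and SQL Server 2008 Books Online.
  • Data collector and the management data warehouse (MDW): For more information, see the Data Collector and the MDW section later in the paper and SQL Server 2008 Books Online.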

Next section: CPU Bottlenecks


Original text:

Troubleshooting Performance Problems in SQL Server 2008
SQL Server Technical Article

Writers: Sunil Agarwal, Boris Baryshnikov, Keith Elmore, Juergen Thomas, Kun Cheng, Burzin Patel
Technical Reviewers: Jerome Halmans, Fabricio Voznika, George Reynya

Published: March 2009
Applies to: SQL Server 2008

Summary: Sometimes a poorly designed database or a system that is improperly configured for the workload can cause the slowdowns in SQL Server. Administrators need to proactively prevent or minimize problems and, when they occur, diagnose the cause and take corrective action. This paper provides step-by-step guidelines for diagnosing and troubleshooting common performance problems by using publicly available tools such as SQL Server Profiler, Performance Monitor, dynamic management views, and SQL Server Extended Events (Extended Events) and the data collector, which are new in SQL Server 2008.


Copyright

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred.


© 2009 Microsoft Corporation. All rights reserved.


Microsoft, MSDN, SQL Server, Win32, Windows, Windows Server, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.


All other trademarks are property of their respective owners.




Table of Contents
Introduction
Goals
Methodology
Resource Bottlenecks
Tools for Resolving Resource Bottlenecks
CPU Bottlenecks
Excessive Query Compilation and Optimization
Detection
Resolution
Unnecessary Recompilation
Detection
Resolution
Inefficient Query Plan
Detection
Resolution
Intraquery Parallelism
Detection
Resolution
Poor Cursor Usage
Detection
Resolution
Memory Bottlenecks
Background
Virtual Address Space and Physical Memory
AWE, Locked Pages, and SQL Server
Memory Pressures
Detecting Memory Pressures
Tools for Memory Diagnostics
New DMVs in SQL Server 2008
Resource Governor in SQL Server 2008
External Physical Memory Pressure
External Virtual Memory Pressure
Internal Physical Memory Pressure
Caches and Memory Pressure
Ring Buffers
Internal Virtual Memory Pressure
General Troubleshooting Steps in Case of Memory Errors
Memory Errors
I/O Bottlenecks
Resolution
tempdb
Monitoring tempdb Space
Troubleshooting Space Issues
User Objects
Version Store
Internal Objects
Excessive DDL and Allocation Operations
Resolution
Slow-Running Queries
Blocking
Locking Granularity and Lock Escalation
Identifying Long Blocks
Blocking per Object with sys.dm_db_index_operational_stats
Overall Performance Effect of Blocking Using Waits
Monitoring Index Usage
Extended Events
Data Collector and the MDW
Appendix A: DBCC MEMORYSTATUS Description
Appendix B: MDW Data Collection



Introduction
It’s not uncommon to experience the occasional slowdown of a database running the Microsoft® SQL Server® database software. The reasons can range from a poorly designed database to a system that is improperly configured for the workload. As an administrator, you want to proactively prevent or minimize problems; if they occur, you want to diagnose the cause and take corrective actions to fix the problem whenever possible. This white paper provides step-by-step guidelines for diagnosing and troubleshooting common performance problems by using publicly available tools such as SQL Server Profiler; System Monitor (in the Windows Server® 2003 operating system) or Performance Monitor (in the Windows Vista® operating system and Windows Server 2008), also known as Perfmon; dynamic management views (sometimes referred to as DMVs); and SQL Server Extended Events (Extended Events) and the data collector, which are new in SQL Server 2008. We have limited the scope of this white paper to the problems commonly seen by Microsoft Customer Service and Support, because an exhaustive analysis of all possible problems is not feasible.
Goals
The primary goal of this paper is to provide a general methodology for diagnosing and troubleshooting SQL Server performance problems in common customer scenarios by using publicly available tools.
SQL Server 2008 has made great strides in supportability. New dynamic management views (DMVs) have been added, like sys.dm_os_memory_brokers, sys.dm_os_memory_nodes, and sys.dm_exec_procedure_stats. Existing DMVs such as sys.dm_os_sys_info and sys.dm_exec_requests have been enriched with additional information. You can use DMVs and existing tools, like SQL Server Profiler and Performance Monitor, to collect performance related data for analysis.
The secondary goal of this paper is to introduce new troubleshooting tools and features in SQL Server 2008, including Extended Events and the data collector.
Methodology
There can be many reasons for a slowdown in SQL Server. We use the following three key symptoms to start diagnosing problems:
• Resource bottlenecks: CPU, memory, and I/O bottlenecks are covered in this paper. We do not consider network issues. For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging that ultimately impacts performance.
• tempdb bottlenecks: Because there is only one tempdb for each SQL Server instance, it can be a performance and a disk space bottleneck. An application can overload tempdb through excessive DDL or DML operations and by taking too much space. This can cause unrelated applications running on the server to slow down or fail.
• A slow-running user query: The performance of an existing query might regress, or a new query might appear to be taking longer than expected. There can be many reasons for this. For example:
o Changes in statistical information can lead to a poor query plan for an existing query.
o Missing indexes can force table scans and slow down the query.
o An application can slow down due to blocking even if resource utilization is normal.
o Excessive blocking can be due to poor application or schema design or the choice of an improper isolation level for the transaction.
The causes of these symptoms are not necessarily independent of each other. The poor choice of a query plan can tax system resources and cause an overall slowdown of the workload. So, if a large table is missing a useful index, or if the query optimizer decides not to use it, the query can slow down; these conditions also put heavy pressure on the I/O subsystem to read the unnecessary data pages and on the memory (buffer pool) to store these pages in the cache. Similarly, excessive recompilation of a frequently-run query can put pressure on the CPU.
New Performance Tools in SQL Server 2008
SQL Server 2008 introduced new features and tools that you can use to monitor and troubleshoot performance problems. We’ll discuss two features: Extended Events and the data collector.
Resource Bottlenecks
The next sections of this paper discuss CPU, memory, and I/O subsystem resources and how these can become bottlenecks. (Network issues are outside of the scope of this paper.) For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging, which can ultimately impact performance.
Before you can determine whether you have a resource bottleneck, you need to know how resources are used under normal circumstances. You can use the methods outlined in this paper to collect baseline information about the use of the resource (at a time when you are not having performance problems).
You might find that the problem is a resource that is running near capacity and that SQL Server cannot support the workload in its current configuration. To address this issue, you may need to add more processing power or memory, or you may need to increase the bandwidth of your I/O or network channel. However, before you take that step, it is useful to understand some common causes of resource bottlenecks. Some solutions, such as reconfiguration, do not require the addition of more resources.
Tools for Resolving Resource Bottlenecks
One or more of the following tools can be used to resolve a particular resource bottleneck:
• Performance Monitor: This tool is available as part of the Windows® operating system. For more information, see your Windows documentation.
• SQL Server Profiler: See SQL Server Profiler in the Performance Tools group in the SQL Server 2008 program group. For more information, see SQL Server 2008 Books Online.
• DBCC commands: For more information, see SQL Server 2008 Books Online and Appendix A.
• DMVs: For more information, see SQL Server 2008 Books Online.
• Extended Events: For more information, see Extended Events later in this paper and SQL Server 2008 Books Online.
• Data collector and the management data warehouse (MDW): For more information, see Data Collector and the MDW later in this paper and SQL Server 2008 Books Online.
