Oracle Performance Tuning 11g2 (5-1)

yuntui發表於2016-11-03

5.2 Overview of the Automatic Workload Repository AWR簡介

The Automatic Workload Repository (AWR) collects, processes, and maintains performance statistics for problem detection and self-tuning purposes. This data is both in memory and stored in the database. The gathered data can be displayed in both reports and views.

為了診斷問題和自動調優,AWR做統計資料的收集、處理和維護工作。這些統計資料是儲存在記憶體及表空間中的。收集的統計資料可以透過報表的方式或者檢視的方式展現出來。

The statistics collected and processed by AWR include: AWR收集和處理的統計資料包含如下

  • Object statistics that determine both access and usage statistics of database segments 【物件統計資料,段的訪問和使用統計】
  • Time model statistics based on time usage for activities, displayed in the V$SYS_TIME_MODEL and V$SESS_TIME_MODEL views
  • Some of the system and session statistics collected in the V$SYSSTAT and V$SESSTAT views
  • SQL statements that are producing the highest load on the system, based on criteria such as elapsed time and CPU time 【v$sql】
  • ASH statistics, representing the history of recent sessions activity 【ASH統計,表示最近歷史活躍session統計】

Gathering database statistics using the AWR is enabled by default and is controlled by the STATISTICS_LEVEL initialization parameter. The STATISTICS_LEVEL parameter should be set to the TYPICAL or ALL to enable statistics gathering by the AWR. The default setting is TYPICAL. Setting STATISTICS_LEVEL to BASIC disables many Oracle Database features, including the AWR, and is not recommended. If STATISTICS_LEVEL is set to BASIC, you can still manually capture AWR statistics using the DBMS_WORKLOAD_REPOSITORY package. However, because in-memory collection of many system statistics—such as segments statistics and memory advisor information—will be disabled, the statistics captured in these snapshots may not be complete.

使用AWR收集資料庫統計資訊預設是開啟的,這個功能是由statistics_level引數控制的。這個引數應該被設定成TYPICAL或者ALL以允許AWR去收集。預設是TYPICAL。 如果將statistics_level設定成basic的話,將禁用掉很多的oracle特性,包括AWR功能,因此是不建議設定此值。 但是即使你將STATISTICS_LEVEL設定成BASIC,你也可以自動的使用DBMS_WORKLOAD_REPOSITORY包去收集AWR資料。但是因為記憶體中的許多統計收集---比如段統計和記憶體建議資訊---是沒有啟用的,所以在手動捕捉的統計資料中是不完整的。(總的來說,千萬別把統計資料功能停掉)

 

5.2.1 Snapshots

Snapshots are sets of historical data for specific time periods that are used for performance comparisons by ADDM. By default, Oracle Database automatically generates snapshots of the performance data once every hour and retains the statistics in the workload repository for 8 days. You can also manually create snapshots, but this is usually not necessary. The data in the snapshot interval is then analyzed by the Automatic Database Diagnostic Monitor (ADDM).

AWR compares the difference between snapshots to determine which SQL statements to capture based on the effect on the system load. This reduces the number of SQL statements that must be captured over time.

快照是某一時間段內的一組歷史資料集合,它們被ADDM用來進行效能比較。預設情況下資料庫會自動的每1小時產生一個快照,並且保留8天時間。你也可以手動的產生快照,但是這通常是沒必要的。相隔快照(2個或者多個快照)中的資料被ADDM用來做分析使用。

AWR比較快照間的差異,以基於系統的負荷判斷哪些SQL語句去捕捉。這就減少了定期去捕捉SQL語句的必要性。

【注:也就是說為了收集歷史的資料,那該如何做呢?時時刻刻去收集?這不太現實的,其實ASH已經做這事情了,但是由於量太大,所以很多資料都丟了。但是很多時間我們需要一個大概的概念,比如剛才那一會兒(可能是10分鐘,半小時)都有多少人登入之類的,這就需要一個時間段的資訊。

在資料庫的所有活動中,資料已經實時的累積到相關的v$檢視中了。

為了方便後繼的效能問題研究,我們需要詳細的統計資料以及診斷資訊,卻又不能將每一個session的工作都全部詳細的儲存,因此需要做一個折衷的方案就是每隔1小時去看一下,這樣在資料庫中儲存的統計資訊就大大減少了。這個每隔多久去收集一次看自己的情況,比如你的系統高峰期時間就是30小時,那就半小時去收集一次。如果我每次說那個半小時收集的資料要怎麼怎麼,估計大家要瘋,因此我們需要一個簡單的詞來代表那半小時收集的資料,這就是“快照”。以後說到快照,就指那半小時內收集的資料。我們自己自動的去收集一下資料,即做個快照。

下圖是我在oem中截的圖,可以看到預設的為8天,interval時間為60分鐘即每小時收集一次,收集的級別(statistic_level為typical)

                          clip_image002

 

 

5.2.2 Baselines

A baseline contains performance data from a specific time period that is preserved for comparison with other similar workload periods when performance problems occur. The snapshots contained in a baseline are excluded from the automatic AWR purging process and are retained indefinitely.

基線包含了一個指定時間內的效能資料,它們被儲存起來,當效能出現問題的時候,將那段時間內差的效能資料與基線進行對比(也就是說找到一個標準,所有都和標準比較,無規律無以成方圓)。基線中的快照資訊是不會被AWR的自動清理機制刪除掉的,它們會被無限的保留起來,除非你自己手動去刪除。

【剛才我們已經瞭解到一段時間內的效能資料我們稱之為快照,那已經有了快照,為什麼搞這個基線?難道又要再收集?已經做的事情當然不能再做了,----所以基線是由快照組成的】

There are several types of available baselines in Oracle Database: 資料庫中有許多型別的可用基線

  • Fixed Baselines 固定基線
  • Moving Window Baseline 移動視窗基線
  • Baseline Templates 基線模板

5.2.2.1 Fixed Baselines

A fixed baseline corresponds to a fixed, contiguous time period in the past that you specify. Before creating a fixed baseline, carefully consider the time period you choose as a baseline, because the baseline should represent the system operating at an optimal level. In the future, you can compare the baseline with other baselines or snapshots captured during periods of poor performance to analyze performance degradation over time.

固定基線對應著一個你指定的過去固定,連續的時間段。在建立固定基線前,考慮好你要將哪個時間段作為基線,因為這個基線是要作為你系統執行在最佳狀態的歷史見證。在未來,當效能不行的時候你需要去拿這個固定基線去和其他的基線或者快照進行比較的。

5.2.2.2 Moving Window Baseline

A moving window baseline corresponds to all AWR data that exists within the AWR retention period. This is useful when using adaptive thresholds because the database can use AWR data in the entire AWR retention period to compute metric threshold values.

Oracle Database automatically maintains a system-defined moving window baseline. The default window size for the system-defined moving window baseline is the current AWR retention period, which by default is 8 days. If you are planning to use adaptive thresholds, consider using a larger moving window—such as 30 days—to accurately compute threshold values. You can resize the moving window baseline by changing the number of days in the moving window to a value that is equal to or less than the number of days in the AWR retention period. Therefore, to increase the size of a moving window, you must first increase the AWR retention period accordingly.

移動視窗基線對應著所有的AWR保留期內的AWR資料。當使用自適應閾值時非常有用,因為資料庫可以使用所有的AWR資料計算度量閾值

資料庫自動的維護著一個system-defined的移動視窗基線。這個預設的system-defined移動視窗基線的視窗大小是當前AWR保留期,也就是說預設為8天。如果計劃使用自適應閾值,則可考慮使用較大的移動視窗(如30天),以便精確地計算閾值。透過將移動視窗中的天數更改為等於或小於AWR保留期中的天數的值,可以調整移動視窗基線的大小。因此,要增加移動視窗的大小,需要先相應地增加AWR保留期。。(這裡有兩個視窗,一個是移動視窗的天數,這個視窗是移動視窗基線要用的,另一個AWR保留的天數。移動視窗的大小不能超過AWR的保留期,這個很明顯了,超過之後就沒有資料供其分析了,所以移動視窗一定要小於AWR保留期的。移動視窗基線是在移動視窗中執行的,那麼移動視窗天數變短時,移動視窗基線自然就短了。同時要增加移動視窗,AWR保留期也要加大才行,這裡要好好理一理,我看了好幾遍才理解!)

 

5.2.2.3 Baseline Templates

You can also create baselines for a contiguous time period in the future using baseline templates. There are two types of baseline templates: single and repeating.

You can use a single baseline template to create a baseline for a single contiguous time period in the future. This technique is useful if you know beforehand of a time period that you intend to capture in the future. For example, you may want to capture the AWR data during a system test that is scheduled for the upcoming weekend. In this case, you can create a single baseline template to automatically capture the time period when the test occurs.

You can use a repeating baseline template to create and drop baselines based on a repeating time schedule. This is useful if you want Oracle Database to automatically capture a contiguous time period on an ongoing basis. For example, you may want to capture the AWR data during every Monday morning for a month. In this case, you can create a repeating baseline template to automatically create baselines on a repeating schedule for every Monday, and automatically remove older baselines after a specified expiration interval, such as one month.

你可以使用基線模板在未來的某一時間段裡建立一個基線。有兩種型別的基線模板:單一的和重複的。

單一基線模板:用於在未來某一時間段內建立一個基線。這種技術對於現在你已經知道要在比如晚上3點鐘去建立一個基線時非常有用。例如你想在週末要到來的壓力測試中收集AWR資料時(這時候可以不用加班直接收集了,不過在國內很難啊,一樣要加班的啊;但是如果想要在是生產系統,在知道週末有壓力高峰就可以這麼做了),這種情況下你就可以建立一個基線模板,然後在未來的那個時間段內讓系統自動的收集一下。(假如壓力高峰突然從下午3點改成5點,那就掛了)

重複基線模板:用於重複時間段呼叫中自動的建立和刪除基線。這種技術對於你想讓資料庫自動的捕捉一個連續時間段在一個既定需求中。例如,你想去捕捉一個月的每週一早晨的AWR資料(假如一月有4周的話,就是要收集4次的)。在這種情況下,你可以建立一個重複基線模板在每週一早晨去自動的建立基線,並且自動地在基線超過了設定的時間間隙後刪除掉舊的基線,比如在一個月後。

以前,只能針對已經存在的快照建立基線。在11g中,晚上MMON任務會遍歷所有模板以生成基線,並會進行檢查以瞭解過去一天是否有任何時段從將來時間變成了過去時間(也就是說,明天才計劃今天的東西,如果要在週末建立,那麼只有到週一時才會發現有需要建立的模組,並以此建立基線)。對於那些相關的時段,MMON 任務隨後會針對該時段建立一個基線。

 

5.2.3 Adaptive Thresholds

Adaptive thresholds enable you to monitor and detect performance issues while minimizing administrative overhead. Adaptive thresholds can automatically set warning and critical alert thresholds for some system metrics using statistics derived from metric values captured in the moving window baseline. The statistics for these thresholds are recomputed weekly and might result in new thresholds as system performance evolves over time. In addition to recalculating thresholds weekly, adaptive thresholds might compute different thresholds values for different times of the day or week based on periodic workload patterns.

For example, many databases support an online transaction processing (OLTP) workload during the day and batch processing at night. The performance metric for response time per transaction can be useful for detecting degradation in OLTP performance during the day. However, a useful OLTP threshold value is almost certainly too low for batch workloads, where long-running transactions might be common. As a result, threshold values appropriate to OLTP might trigger frequent false performance alerts during batch processing. Adaptive thresholds can detect such a workload pattern and automatically set different threshold values for the daytime and nighttime.

自適應閾值使你以最小的管理壓力去監控和偵測效能問題。自適應閾值可以根據一些系統的度量值自動地設定警告和嚴重警告閾值,這些度量值是根據移動視窗基線捕捉到的一些度量值推匯出來的。這些閾值統計資料每週會重新計算一次,隨著系統演化,這就產生新的系統效能的閾值。除了每週重新計算閾值,自適應閾值可以從基於定期的壓力模型中以天或者周的不同的時間段計算不同的閾值。(也就是說白天和晚上是不同的壓力模型,因為在以天為單位時可以使用不同的閾值,每週末可能出報表,因此一週也有不同的閾值)

比如,許多資料庫支援白天時處理OLTP壓力,晚上處理批次系統。每事務處理的響應時間度量值對於診斷OLTP效能下降是有用的。然而,這個OLTP的閾值對於批次卻是太小了,因為批次很多時候需要長時間執行的。因此,OLTP的閾值將在處理批次時到處報警效能問題。自適應閾值可以診斷這個壓力模式,並且在白天和晚上自動的設定不同的閾值。

Note:

In Oracle Database 11g Release 2 (11.2), Oracle Database automatically determines the appropriate time groupings for a database. However, before Oracle Database 11g Release 2 (11.2), time groupings were specified manually by the database administrator.

在11.2以後的版本中,資料庫會自動的判斷相應時間組,而在11.2以前的版本中,時間組的劃分是由管理員做的。

                            clip_image004

There are two types of adaptive thresholds: 有兩種型別的自適應閾值:(其實還包括一個“固定值”,共3個)

  • Percentage of maximum: The threshold value is computed as a percentage multiple of the maximum value observed for the data in the moving window baseline.

最大百分比: 閾值被計算作為一個百分比,乘以從移動視窗基線中獲取的最大值。

  • Significance level: The threshold value is set to a statistical percentile that represents how unusual it is to observe values above the threshold value based the data in the moving window baseline. Specify one of the following percentiles:

重要級別: 閾值被設定成一個統計百分比,代表與從移動視窗基線中觀察到的閾值有多少不同。指定以下幾種比率:

    • High (.95): Only 5 in 100 observations are expected to exceed this value. 一般高:5% 可以超出這個值
    • Very High (.99): Only 1 in 100 observations are expected to exceed this value. 非常高:1% 可以超出這個值
    • Severe (.999): Only 1 in 1,000 observations are expected to exceed this value. 嚴重的:0.1% 可以走出這個值
    • Extreme (.9999): Only 1 in 10,000 observations are expected to exceed this value.極嚴重:0.01%可以走出這個值

Note:

When you specify Severe (.999) or Extreme (.9999), Oracle Database performs an internal calculation to set the threshold value. In some cases, Oracle Database cannot establish the threshold value at these levels using the data in the baseline, and the significance level threshold is not set.

If you are not receiving alerts as expected, and you specified a Severe (.999) or Extreme (.9999) significance level threshold, then you can try setting the significance level threshold to a lower value, such as Very High (.99) or High (.95). Alternatively, you can set a percentage of maximum threshold instead of a significance level threshold. If you change the threshold and find that you are receiving too many alerts, then you can try increasing the number of occurrences to cause an alert.

當你指定0.1%或0.01%時,資料庫會內部計算一個閾值。在這種情況下,資料庫不能以基線中的資料在這種級別下建立閾值,並且重要級別沒有被設定。

假如你沒有接收到期待的警告,並且你指定了0.1%或0.01%重要級別閾值,那麼你可以將這個重要級別閾值設定的低一些,例如1%或者5%。同樣地,你可以使用最大閾值百分比而不是使用重要級別閾值。假如你改變閾值後發現接收到太多的警告,那麼你可以增加發生警告的值。

Percentage of maximum thresholds are most useful when a system is sized for peak workloads, and you want to be alerted when the current workload volume is approaching or exceeding previous high values. Metrics that have an unknown but definite limiting value are good candidates for these settings. For example, the redo generated per second metric is typically a good candidate for a percentage of maximum threshold.

Significance level thresholds are most useful for metrics that should exhibit statistically stable behavior when the system is operating normally, but might vary over a wide range when the system is performing poorly. For example, the response time per transaction metric should be stable for a well-tuned OLTP system, but may fluctuate widely when performance issues arise. Significance level thresholds are meant to generate alerts when conditions produce both unusual metric values and unusual system performance.

最大百分比閾值是對於固定的高峰壓力,並且希望當目前的壓力量超過前一個高峰值時想得到警告時比較有用。對於未知的度量值,但是有其限度的值是這些設定的一個很好候選。例如當每秒鐘產生的日誌就可以作為最大閾值的一個候選值。

重要級別閾值在系統執行正常時顯示其穩定的統計行為度量值是非常有用的,但是當系統效能差的時候變化幅度很大。例如事務的響應時間度量在OLTP系統中應該是一個相對穩定的值,但是當效能出現問題時波動很大。重要級別閾值是當不尋常的度量值 和不尋常的系統效能發生時所產生的警告才是有意義的。

Note:

The primary interface for managing baseline metrics is Oracle Enterprise Manager. To create an adaptive threshold for a baseline metric, use Oracle Enterprise Manager

在OEM中,其實是有三種型別的: significance level, percentage of Maximum, Fixed Values。

clip_image006

下圖是All Metrics中擷取的一部分,這個頁面非常長,說明 metrics非常的多

clip_image008

下圖是我根據: Basic and Additional Metrics 擷取的一部分圖,這裡只有三大類: Performance Metrics, Workload Volume Metrics, Workload Type Metrics

clip_image010

5.2.4 Space Consumption

The space consumed by the AWR is determined by several factors: AWR消耗的空間由以下因素決定:

  • Number of active sessions in the system at any given time 在某一時間內active session數
  • Snapshot interval 快照間隙

The snapshot interval determines the frequency at which snapshots are captured. A smaller snapshot interval increases the frequency, which increases the volume of data collected by the AWR. 快照間隙決定的快照捕捉的頻率。快照間隙越小,頻率越高,因此就增加了AWR收集的資料量

  • Historical data retention period 歷史資料保留期限

The retention period determines how long this data is retained before being purged. A longer retention period increases the space consumed by the AWR.

保留期限決定了資料在被清除前保留的期限。保留的時間越長自然就佔用的空間越大。

By default, snapshots are captured once every hour and are retained in the database for 8 days. With these default settings, a typical system with an average of 10 concurrent active sessions can require approximately 200 to 300 MB of space for its AWR data. It is possible to change the default values for both snapshot interval and retention period.

The AWR space consumption can be reduced by the increasing the snapshot interval and reducing the retention period. When reducing the retention period, note that several Oracle Database self-managing features depend on AWR data for proper functioning. Not having enough data can affect the validity and accuracy of these components and features, including:

預設情況下,每小時捕捉一次快照,並且保留8天時間。 使用這個預設設定的話,一個平均10個活躍併發會話的系統,AWR大約會佔200到300M的空間(即8天佔300M,那麼1個月大約就是1.2G左右;同時一天大約使用300M/8 = 40M,而1小時收集一次,那一共是24,40M/24 = 1.5M左右,即每一次收集就大約佔用1.5M左右的空間,這個收集的量還是比較大的)。可以透過dbms包去改變這個快照的間隔和保留時間。

可以透過增加快照的間隙、減少快照的保留時間,達到減少AWR空間消耗目的。當減少快照保留期限時需要注意,許多oracle的自管理特性都是依賴於AWR的資料才能工作的(比如ADDM)。如果沒有足夠的AWR資料將影響許多功能模組精確的驗證分析,包括:

  • Automatic Database Diagnostic Monitor 【ADDM】
  • SQL Tuning Advisor 【SQL Tuning Advisor】
  • Undo Advisor 【Undo Advisor】
  • Segment Advisor 【Segment Advisor】

If possible, Oracle recommends that you set the AWR retention period large enough to capture at least one complete workload cycle. If your system experiences weekly workload cycles, such as OLTP workload during weekdays and batch jobs during the weekend, you do not need to change the default AWR retention period of 8 days. However if your system is subjected to a monthly peak load during month end book closing, you may have to set the retention period to one month.

Under exceptional circumstances, you can turn off automatic snapshot collection by setting the snapshot interval to 0. Under this condition, the automatic collection of the workload and statistical data is stopped and much of the Oracle Database self-management functionality is not operational. In addition, you cannot manually create snapshots. For this reason, Oracle strongly recommends that you do not turn off automatic snapshot collection.

假如可能的話,oracle建議將AWR的保留期限設定成足夠完成一個壓力週期的值。假如你的系統一般是一週的一個壓力週期,比如每天主要是OLTP處理,而在週末處理批次程式,你就不需要去更改預設的8天期限。但是如果你的系統是要月底進行賬目結算的月高峰壓力模式,你就應該把這個期限改成1個月。

在異常環境下,透過將快照的間隙設定成0以關閉自動快照收集功能。在這種情況下,自動壓力和統計資料的收集將停止工作,而且多數的資料庫自管理功能將無法工作。在這種情況下,你不能手動建立快照。因此oracle強烈的建議你不要去關閉自動快照收集功能

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30633755/viewspace-2127720/,如需轉載,請註明出處,否則將追究法律責任。

相關文章