[RAC]ORACLE Database 10g RAC for Administrators學習筆記(三)

chenyan995發表於2009-01-06

6. RAC Performance Tuning

CPU and Wait Time Tuning Dimensions

一般情況下,等待事件顯著的話,比較可能需要instance/RAC tuning。這時增加CPU或節點很難改善問題。CPU使用率較高的話,有可能需要進行SQL tuning。這種情況下增加CPU或節點數一般可以得到較好的效果。(注意scale-upspeed-up的區別)

RAC-Specific Tuning

Tune for a single instance first.

Tune for RAC:

Instance recovery

Interconnect traffic

Point of serialization can be exacerbated

RAC-reactive tuning tools:

Specific wait events

System and enqueue statistics

Enterprise Manager performance pages

Statspack and AWR reports

RAC-proactive tuning tools:

AWR snapshots

ADDM reports

RAC and Instance or Crash Recovery

1. Global Enqueue Services (GES) remasters the enqueues resource

2. Global Cache Services (GCS) remasters the cache resources (all GCS resource requests and write requests are temporarily suspended)

3. SMON of one surviving instance builds the recovery set

4. Resource claim (Buffer space for recovery and the resources that were identified in the previous reading of the redo logs)

5. Roll forward recovery set (GRD is now unfrozen. Any data blocks that are not in recovery can now be accessed)

Instance Recovery and Database Availability

A. Real Application Clusters is running on multiple nodes.

B. Node failure is detected.

C. The enqueue part of the GRD is reconfigured; resource management is redistributed to the surviving nodes. This operation occurs relatively quickly.

D. The cache part of the GRD is reconfigured and SMON reads the redo log of the failed instance to identify the database blocks that it needs to recover.

E. SMON issues the GRD requests to obtain all the database blocks it needs for recovery. After the requests are complete, all other blocks are accessible.

F. The Oracle server performs roll forward recovery. Redo logs of the failed threads are applied to the database, and blocks are available right after their recovery is completed.

G. The Oracle server performs rollback recovery. Undo blocks are applied to the database for all uncommitted transactions.

H. Instance recovery is complete and all data is accessible.

Instance Recovery and RAC

單機資料庫中,FAST_START_MTTR_TARGET引數代表的時間包括例項啟動和recover的時間。

RAC環境中,此引數設定的意義不大。_FAST_START_INSTANCE_RECOVERY_TARGET引數設定了從instance recovery開始到GRD unfrozen,無須恢復的資料庫開放訪問開始的時間。

Here are some guidelines you can use to make sure that instance recovery in your RAC environment is faster:

Use parallel instance recovery. (RECOVERY_PARALLISM)

Increase PARALLEL_EXECUTION_MESSAGE_SIZE. (推薦設定4KB8KB)

Set PARALLEL_MIN_SERVERS. (推薦設定CPU_COUNT-1)

Use Async I/O.

Increase the size of the default buffer cache. (預設使用50%buffer cache進行恢復,關注alertlog,如有必要可調大buffer cache)

Analyzing Cache Fusion Impact in RAC

The cost of block access and cache coherency is represented by:

Global Cache Services statistics

Global Cache Services wait events

The response time for Cache Fusion transfers is determined by:

Overhead of the physical interconnect components

IPC protocol

GCS protocol

The response time is not generally affected by disk I/O factors.

The time to process CR block request in the cache (build time + flush time + send time)

The time to process current block request in the cache (pin time + flush time + send time)

Wait Events for RAC

Wait times are attributed to events that reflect the outcome of a request:

Placeholders while waiting

Precise events after waiting

Global cache waits are summarized in a broader category called Cluster Wait Class.

Global Cache Wait Events: Overview

gc current/cr request: act as placeholders until the request completes

gc [current/cr] [2/3]-way: The request is processed immediately

gc [current/cr] block busy:

gc [current/cr] grant 2-way: A grant is given without any significant delays (disk read followed)

gc current grant busy

gc [current/cr] [block/grant] congested: The request spent more than 1 ms in internal queues

gc [current/cr] [failure/retry]

gc buffer busy

2-way Block Request: Example

1. SGA1 sends a request to SGA2, begins to wait on gc current block request.

2. LGWR of SGA2 flushes recovery information to its local redo log files.

3. SGA2 sends the block to SGA1, when the block arrives, the wait event changes to gc current block 2-way

R(send) + W(small msg) + S(process msg,process block,send) + W(block) + R(receive block)

3-way Block Request: Example

R(send) + W(small msg) + S(process msg,send) + W(small msg) + S(process msg,process

block,send) + W(block) + R(receive block)

2-way Grant: Example

R(send) + W(small msg) + S(process msg,send) + W(small msg) + R(receive block)

Global Enqueue Waits: Overview

Enqueues are synchronous.

Enqueues are global resources in RAC.

The most frequent waits are for:

TX: Transaction enqueue; used for transaction demarcation and tracking

TM: Table or partition enqueue; used to protect table definitions during DML operations

HW: High-water mark enqueue; acquired to synchronize a new block operation

SQ: Sequence enqueue; used to serialize incrementing of an Oracle sequence number

US: Undo segment enqueue; mainly used by the Automatic Undo Management (AUM) feature

TA: Enqueue used mainly for transaction recovery as part of instance recovery

The waits may constitute serious serialization points.

Session and System Statistics

Use V$SYSSTAT to characterize the workload.

Use V$SESSTAT to monitor important sessions.

V$SEGMENT_STATISTICS includes RAC statistics.

RAC-relevant statistic groups are:

Global Cache Service statistics

Global Enqueue Service statistics

Statistics for messages sent

V$ENQUEUE_STATISTICS determines the enqueue with the highest impact.

V$INSTANCE_CACHE_TRANSFER breaks down GCS statistics into block classes.

Most Common RAC Tuning Tips

Application tuning is often the most beneficial

Resizing and tuning the buffer cache

Reducing long full-table scans in OLTP systems

Using Automatic Segment Space Management

Increasing sequence caches

Using partitioning to reduce interinstance traffic

Avoiding unnecessary parsing

Minimizing locking usage

Removing unselective indexes

Configuring interconnect properly

Index Block Contention: Considerations

Wait events: enq: TX – index contention / gc buffer busy / gc current block busy / gc current split

加劇索引塊競爭的因素:key單調增加、頻繁的leaf block splitsLow tree depth

Ways to alleviate the performance impact of globally hot index blocks and leaf block splits:

Global index hash partitioning

Increasing the sequence cache, if the key value is derived from a sequence

Use natural keys as opposed to surrogate keys

Use reverse key indexes

Oracle Sequences and Index Contention

sequence用作生成主鍵時,容易造成索引塊的競爭。增大sequencecache值,有利於減少索引塊的競爭,提高leaf blockinstance affinity

Undo Block Considerations

當包含活動事務的索引塊被多個instance同時訪問時,會造成劇烈的undo block shipment。這種情況經常發生在應用程式頻繁對最近插入的資料進行讀取時。如果資料插入相對於commit操作來說過於頻繁的話,一般會發生這種情況。解決方法包括增加提交次數以及增大sequencecache值等。

High-Water Mark Considerations

Wait events: enq: HW – contention / gc current grant

當插入操作頻繁導致segment頻繁擴充套件時容易造成HWM enqueue。使用uniform and large extent sizes for locally managed and automatic space management segments能減輕這種情況。

Concurrent Cross-Instance Calls: Considerations

在資料倉儲應用中較為常見的truncate動作,如在多個例項中併發進行可能會造成一定的問題。這是由於truncate操作會產生一個cross-instance的呼叫,即髒塊的flush動作,這會產生一個serialization點。其他cross-instance的呼叫也可能引發類似的問題。

RAC-Specific ADDM Findings

Hot block (with block details) with high read/write contention within an instance and across the cluster

Hot object with high read/write contention within an instance and across the cluster

Cluster interconnect latency issues in a RAC environment

LMS congestion issues: LMS processes are not able to keep up with lock requests.

Top SQL that encounters interinstance messaging

Contention on other instances: Basically, multiple instances are updating the same set of blocks concurrently.

[@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/8558093/viewspace-1015731/,如需轉載,請註明出處,否則將追究法律責任。

相關文章