Oracle ASM Rebalance Execution Process
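When will a disk group rebalance finish? There is no exact figure, but ASM does provide an estimate (GV$ASM_OPERATION.EST_MINUTES). Although the exact completion time cannot be determined, you can examine the details of the rebalance operation to find out whether it is still in progress, which phase it has reached, and whether that phase needs your attention.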
Understanding rebalance
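A rebalance operation consists of three phases: planning, extents relocation, and compacting. Relative to the total rebalance time, the planning phase takes very little time, and you usually do not need to pay attention to it. The second phase, extents relocation, generally takes up most of the rebalance time and is the phase that deserves the most attention. Finally, we will also look at what the third phase, compacting, does.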
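First, it helps to understand why a rebalance is needed. If you added a disk to increase the disk group's available space, or adjusted the disk space by, for example, resizing or dropping a disk, you probably do not care much about when the rebalance finishes. But if a disk in the disk group has failed, you have good reason to watch the rebalance progress. Suppose your disk group uses normal redundancy: if the partner disk of the failed disk also fails, the entire disk group will be dismounted, all databases running on that disk group will crash, and you may lose data. In that situation you really need to know when the rebalance will finish. More precisely, you need to know when the second phase, extents relocation, will finish, because once it completes, the redundancy of the disk group has been restored (the third phase does not matter for redundancy, as explained later).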
Extents relocation
To take a closer look at the extents relocation phase, I dropped a disk from a disk group that uses the default rebalance power:
SQL> show parameter power

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
asm_power_limit                      integer                1

14:47:35 SQL> select group_number,disk_number,name,state,path,header_status from v$asm_disk where group_number=5;

GROUP_NUMBER DISK_NUMBER NAME                 STATE                PATH                 HEADER_STATUS
------------ ----------- -------------------- -------------------- -------------------- --------------------
           5           0 TESTDG_0000          NORMAL               /dev/raw/raw7        MEMBER
           5           2 TESTDG_0002          NORMAL               /dev/raw/raw13       MEMBER
           5           1 TESTDG_0001          NORMAL               /dev/raw/raw12       MEMBER
           5           3 TESTDG_0003          NORMAL               /dev/raw/raw14       MEMBER

14:48:38 SQL> alter diskgroup testdg drop disk TESTDG_0000;

Diskgroup altered.
The EST_MINUTES column of GV$ASM_OPERATION gives the estimated remaining time in minutes. Here the estimate is 9 minutes.
14:49:04 SQL> select inst_id, operation, state, power, sofar, est_work, est_rate, est_minutes from gv$asm_operation where group_number=5;

   INST_ID OPERATION            STATE                     POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- -------------------- -------------------- ---------- ---------- ---------- ---------- -----------
         1 REBAL                RUN                           1          4       4748        475           9
After about one minute, EST_MINUTES dropped to 0 minutes:
14:50:22 SQL> select inst_id, operation, state, power, sofar, est_work, est_rate, est_minutes from gv$asm_operation where group_number=5;

   INST_ID OPERATION            STATE                     POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- -------------------- -------------------- ---------- ---------- ---------- ---------- -----------
         1 REBAL                RUN                           1       3030       4748       2429           0
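Sometimes the EST_MINUTES value does not tell you much. We can also see that SOFAR (the number of allocation units, AUs, moved so far) keeps increasing, which is a good indicator to watch. The ASM alert log also shows the drop disk operation and the OS process ID of ARB0, the process ASM uses to do all the rebalance work. More importantly, no errors were reported during the whole process: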
SQL> alter diskgroup testdg drop disk TESTDG_0000
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=5
Tue Jan 10 14:49:01 2017
GMON updating for reconfiguration, group 5 at 222 for pid 42, osid 6197
NOTE: group 5 PST updated.
Tue Jan 10 14:49:01 2017
NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)
GMON querying group 5 at 223 for pid 18, osid 5012
SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)
NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1
Starting background process ARB0
SUCCESS: alter diskgroup testdg drop disk TESTDG_0000
Tue Jan 10 14:49:04 2017
ARB0 started with pid=39, OS id=25416
NOTE: assigning ARB0 to group 5/0x97f863e8 (TESTDG) with 1 parallel I/O
cellip.ora not found.
NOTE: F1X0 copy 1 relocating from 0:2 to 2:2 for diskgroup 5 (TESTDG)
NOTE: F1X0 copy 3 relocating from 2:2 to 3:2599 for diskgroup 5 (TESTDG)
Tue Jan 10 14:49:13 2017
NOTE: Attempting voting file refresh on diskgroup TESTDG
NOTE: Refresh completed on diskgroup TESTDG. No voting file found.
Tue Jan 10 14:51:05 2017
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)
Tue Jan 10 14:51:07 2017
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=5
Tue Jan 10 14:51:10 2017
GMON updating for reconfiguration, group 5 at 224 for pid 39, osid 25633
NOTE: group 5 PST updated.
SUCCESS: grp 5 disk TESTDG_0000 emptied
NOTE: erasing header on grp 5 disk TESTDG_0000
NOTE: process _x000_+asm1 (25633) initiating offline of disk 0.3915944675 (TESTDG_0000) with mask 0x7e in group 5
NOTE: initiating PST update: grp = 5, dsk = 0/0xe96892e3, mask = 0x6a, op = clear
GMON updating disk modes for group 5 at 225 for pid 39, osid 25633
NOTE: group TESTDG: updated PST location: disk 0001 (PST copy 0)
NOTE: group TESTDG: updated PST location: disk 0002 (PST copy 1)
NOTE: group TESTDG: updated PST location: disk 0003 (PST copy 2)
NOTE: PST update grp = 5 completed successfully
NOTE: initiating PST update: grp = 5, dsk = 0/0xe96892e3, mask = 0x7e, op = clear
GMON updating disk modes for group 5 at 226 for pid 39, osid 25633
NOTE: cache closing disk 0 of grp 5: TESTDG_0000
NOTE: PST update grp = 5 completed successfully
GMON updating for reconfiguration, group 5 at 227 for pid 39, osid 25633
NOTE: cache closing disk 0 of grp 5: (not open) TESTDG_0000
NOTE: group 5 PST updated.
NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)
GMON querying group 5 at 228 for pid 18, osid 5012
GMON querying group 5 at 229 for pid 18, osid 5012
NOTE: Disk TESTDG_0000 in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)
Tue Jan 10 14:51:16 2017
NOTE: Attempting voting file refresh on diskgroup TESTDG
NOTE: Refresh completed on diskgroup TESTDG. No voting file found.
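The actual duration of a rebalance can be read off the alert log by comparing the timestamp above the "starting rebalance" message with the one above "rebalance completed". The following is a minimal sketch of that idea, not an Oracle tool: the function name rebalance_duration_seconds is hypothetical, and it assumes the alert log uses English timestamp lines in the format shown above and that each message belongs to the most recent timestamp line before it.

```python
import re
from datetime import datetime

# Timestamp lines in the alert log look like "Tue Jan 10 14:49:01 2017".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def rebalance_duration_seconds(alert_lines):
    """Seconds between 'starting rebalance' and 'rebalance completed'.

    Hypothetical helper: each message is stamped with the most recent
    timestamp line above it. Returns None if either marker is missing.
    """
    current = None   # timestamp of the most recent timestamp line
    started = None   # timestamp in effect when the rebalance started
    for raw in alert_lines:
        line = raw.strip()
        if TS_RE.match(line):
            current = datetime.strptime(line, TS_FORMAT)
        elif "starting rebalance of group" in line:
            started = current
        elif "rebalance completed for group" in line:
            if started is not None and current is not None:
                return (current - started).total_seconds()
    return None


# Excerpt of the alert log from the drop disk example.
sample = [
    "Tue Jan 10 14:49:01 2017",
    "NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1",
    "Tue Jan 10 14:49:04 2017",
    "ARB0 started with pid=39, OS id=25416",
    "Tue Jan 10 14:51:05 2017",
    "NOTE: stopping process ARB0",
    "SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)",
]
print(rebalance_duration_seconds(sample))  # 124.0
```

For the log above this gives 124 seconds, which is the "about 2 minutes" of actual rebalance time in this example.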
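So ASM estimated 9 minutes to complete the rebalance, but it actually took only about 2 minutes. That is why it matters to first understand what the rebalance is doing, and only then to ask when it will finish. Note that the estimate changes dynamically and may go up or down, depending on changes in system load and on the rebalance power setting. For a very large disk group, a rebalance may take hours or even days.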
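The two EST_MINUTES values shown earlier are consistent with a simple calculation: the remaining work (EST_WORK minus SOFAR) divided by EST_RATE, rounded down. The sketch below reproduces that arithmetic. It is an inference from the numbers in this article, assuming EST_RATE is expressed in allocation units per minute; it is not a statement of how ASM computes the column internally, and the function name remaining_minutes is hypothetical.

```python
def remaining_minutes(sofar, est_work, est_rate):
    """Whole minutes left if the remaining AUs move at est_rate AUs per minute.

    Hypothetical helper mirroring the SOFAR, EST_WORK and EST_RATE columns
    of GV$ASM_OPERATION.
    """
    if est_rate <= 0:
        raise ValueError("est_rate must be positive")
    remaining = max(est_work - sofar, 0)  # never report negative work
    return remaining // est_rate          # integer division rounds down


# First sample (14:49:04): SOFAR=4, EST_WORK=4748, EST_RATE=475
print(remaining_minutes(4, 4748, 475))      # 9
# Second sample (14:50:22): SOFAR=3030, EST_WORK=4748, EST_RATE=2429
print(remaining_minutes(3030, 4748, 2429))  # 0
```

The estimate fell from 9 to 0 mainly because EST_RATE rose from 475 to 2429 between the two samples, which illustrates why an early estimate should not be taken at face value.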
The ARB0 trace file also shows which ASM file's extents are currently being relocated. This trace file lets us confirm that ARB0 really is doing its job.
[grid@jyrac1 trace]$ tail -f +ASM1_arb0_25416.trc
*** 2017-01-10 14:49:20.160
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:24.081
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:28.290
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:32.108
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:35.419
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:38.921
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:43.613
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:47.523
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:51.073
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:54.545
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:49:58.538
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:02.944
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:06.428
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:10.035
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:13.507
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:17.526
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:21.692
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:25.649
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:29.360
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:33.233
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:37.287
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:40.843
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:44.356
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:48.158
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:51.854
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:55.568
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:50:59.439
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:51:02.877
ARB0 relocating file +TESTDG.256.932913341 (50 entries)
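Note that the trace directory may contain many arb0 trace files, so we need to know the OS process ID of the ARB0 process that is actually doing the rebalance. The ASM alert log shows this information when the ASM instance starts the rebalance. We can also use the operating system command pstack to trace the ARB0 process and see exactly what it is doing. As shown below, ASM is relocating extents (the key functions on the stack are kfgbRebalExecute - kfdaExecute - kffRelocate):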
[root@jyrac1 ~]# pstack 25416
#0 0x0000003aa88005f4 in ?? () from /usr/lib64/libaio.so.1
#1 0x0000000002bb9b11 in skgfrliopo ()
#2 0x0000000002bb9909 in skgfospo ()
#3 0x00000000086c595f in skgfrwat ()
#4 0x00000000085a4f79 in ksfdwtio ()
#5 0x000000000220b2a3 in ksfdwat_internal ()
#6 0x0000000003ee7f33 in kfk_reap_ufs_async_io ()
#7 0x0000000003ee7e7b in kfk_reap_ios_from_subsys ()
#8 0x0000000000aea0ac in kfk_reap_ios ()
#9 0x0000000003ee749e in kfk_io1 ()
#10 0x0000000003ee7044 in kfkRequest ()
#11 0x0000000003eed84a in kfk_transitIO ()
#12 0x0000000003e40e7a in kffRelocateWait ()
#13 0x0000000003e67d12 in kffRelocate ()
#14 0x0000000003ddd3fb in kfdaExecute ()
#15 0x0000000003ec075b in kfgbRebalExecute ()
#16 0x0000000003ead530 in kfgbDriver ()
#17 0x00000000021b37df in ksbabs ()
#18 0x0000000003ec4768 in kfgbRun ()
#19 0x00000000021b8553 in ksbrdp ()
#20 0x00000000023deff7 in opirip ()
#21 0x00000000016898bd in opidrv ()
#22 0x0000000001c6357f in sou2o ()
#23 0x00000000008523ca in opimai_real ()
#24 0x0000000001c6989d in ssthrdmain ()
#25 0x00000000008522c1 in main ()
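Picking the right arb0 trace file, as described above, comes down to finding the most recent "ARB0 started with pid=..., OS id=..." line in the alert log and matching the OS id to a trace file name. Here is a small sketch of that lookup. Both function names are hypothetical, and the file name pattern is only inferred from the +ASM1_arb0_25416.trc example in this article.

```python
import re

# Matches alert log lines such as "ARB0 started with pid=39, OS id=25416".
ARB0_RE = re.compile(r"ARB0 started with pid=(\d+), OS id=(\d+)")


def latest_arb0_os_id(alert_text):
    """Return the OS process ID of the most recently started ARB0, or None."""
    matches = ARB0_RE.findall(alert_text)
    if not matches:
        return None
    return int(matches[-1][1])  # last match, second capture group (OS id)


def arb0_trace_name(instance_name, os_id):
    """Build the expected trace file name, e.g. +ASM1_arb0_25416.trc.

    Assumption: the pattern <instance>_arb0_<os id>.trc, as seen in the
    article's example.
    """
    return f"{instance_name}_arb0_{os_id}.trc"


alert_text = """Starting background process ARB0
Tue Jan 10 14:49:04 2017
ARB0 started with pid=39, OS id=25416
NOTE: assigning ARB0 to group 5/0x97f863e8 (TESTDG) with 1 parallel I/O
"""
os_id = latest_arb0_os_id(alert_text)
print(os_id)                            # 25416
print(arb0_trace_name("+ASM1", os_id))  # +ASM1_arb0_25416.trc
```

The same OS id is the argument to pass to pstack.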
Compacting
In the next example, we look at the compacting phase of the rebalance. I add the disk dropped above back to the disk group and set the rebalance power to 2:
17:26:48 SQL> alter diskgroup testdg add disk '/dev/raw/raw7' rebalance power 2;

Diskgroup altered.
ASM estimates 6 minutes for the rebalance:
16:07:13 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         1 REBAL RUN          10        489      53851       7920           6
About 10 seconds later, EST_MINUTES dropped to 0:
16:07:23 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         1 REBAL RUN          10      92407      97874       8716           0
At this point, the ASM alert log shows:
SQL> alter diskgroup testdg add disk '/dev/raw/raw7' rebalance power 2
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (5,0) to disk (/dev/raw/raw7)
NOTE: requesting all-instance membership refresh for group=5
NOTE: initializing header on grp 5 disk TESTDG_0000
NOTE: requesting all-instance disk validation for group=5
Tue Jan 10 16:07:12 2017
NOTE: skipping rediscovery for group 5/0x97f863e8 (TESTDG) on local instance.
NOTE: requesting all-instance disk validation for group=5
NOTE: skipping rediscovery for group 5/0x97f863e8 (TESTDG) on local instance.
Tue Jan 10 16:07:12 2017
GMON updating for reconfiguration, group 5 at 230 for pid 42, osid 6197
NOTE: group 5 PST updated.
NOTE: initiating PST update: grp = 5
GMON updating group 5 at 231 for pid 42, osid 6197
NOTE: PST update grp = 5 completed successfully
NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)
GMON querying group 5 at 232 for pid 18, osid 5012
NOTE: cache opening disk 0 of grp 5: TESTDG_0000 path:/dev/raw/raw7
GMON querying group 5 at 233 for pid 18, osid 5012
SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)
NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1
SUCCESS: alter diskgroup testdg add disk '/dev/raw/raw7'
Starting background process ARB0
Tue Jan 10 16:07:14 2017
ARB0 started with pid=27, OS id=982
NOTE: assigning ARB0 to group 5/0x97f863e8 (TESTDG) with 1 parallel I/O
cellip.ora not found.
Tue Jan 10 16:07:23 2017
NOTE: Attempting voting file refresh on diskgroup TESTDG
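The output above means that ASM has finished the second phase of the rebalance and started the third phase, compacting. If that is right, pstack should show the kfdCompact() function on the stack, and the output below confirms it: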
# pstack 982
#0 0x0000003957ccb6ef in poll () from /lib64/libc.so.6
...
#9 0x0000000003d711e0 in kfk_reap_oss_async_io ()
#10 0x0000000003d70c17 in kfk_reap_ios_from_subsys ()
#11 0x0000000000aea50e in kfk_reap_ios ()
#12 0x0000000003d702ae in kfk_io1 ()
#13 0x0000000003d6fe54 in kfkRequest ()
#14 0x0000000003d76540 in kfk_transitIO ()
#15 0x0000000003cd482b in kffRelocateWait ()
#16 0x0000000003cfa190 in kffRelocate ()
#17 0x0000000003c7ba16 in kfdaExecute ()
#18 0x0000000003c4b737 in kfdCompact ()
#19 0x0000000003c4c6d0 in kfdExecute ()
#20 0x0000000003d4bf0e in kfgbRebalExecute ()
#21 0x0000000003d39627 in kfgbDriver ()
#22 0x00000000020e8d23 in ksbabs ()
#23 0x0000000003d4faae in kfgbRun ()
#24 0x00000000020ed95d in ksbrdp ()
#25 0x0000000002322343 in opirip ()
#26 0x0000000001618571 in opidrv ()
#27 0x0000000001c13be7 in sou2o ()
#28 0x000000000083ceba in opimai_real ()
#29 0x0000000001c19b58 in ssthrdmain ()
#30 0x000000000083cda1 in main ()
Tailing the ARB0 trace file shows that relocation is still going on, but only one entry at a time (another important clue that the compacting phase is under way):
$ tail -f +ASM1_arb0_25416.trc
ARB0 relocating file +DATA1.321.788357323 (1 entries)
ARB0 relocating file +DATA1.321.788357323 (1 entries)
ARB0 relocating file +DATA1.321.788357323 (1 entries)
...
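The "one entry at a time" clue can be checked mechanically by extracting the entry counts from the "ARB0 relocating file" lines of the trace file. The sketch below is only a heuristic built on that single observation from the article: the function names and the last_n window are assumptions, and a run of single-entry relocations should be read as a hint that compacting is under way, not as proof.

```python
import re

# Matches trace lines such as
# "ARB0 relocating file +TESTDG.256.932913341 (120 entries)".
ENTRY_RE = re.compile(r"ARB0 relocating file (\S+) \((\d+) entries\)")


def recent_entry_counts(trace_lines, last_n=20):
    """Entry counts of the last_n 'relocating file' messages, oldest first."""
    counts = []
    for line in trace_lines:
        match = ENTRY_RE.search(line)
        if match:
            counts.append(int(match.group(2)))
    return counts[-last_n:]


def looks_like_compacting(trace_lines, last_n=20):
    """True if every recent relocation moved exactly one entry (heuristic)."""
    counts = recent_entry_counts(trace_lines, last_n)
    return bool(counts) and all(count == 1 for count in counts)


relocation_phase = [
    "*** 2017-01-10 14:49:20.160",
    "ARB0 relocating file +TESTDG.256.932913341 (120 entries)",
    "*** 2017-01-10 14:49:24.081",
    "ARB0 relocating file +TESTDG.256.932913341 (120 entries)",
]
compacting_phase = [
    "ARB0 relocating file +DATA1.321.788357323 (1 entries)",
    "ARB0 relocating file +DATA1.321.788357323 (1 entries)",
]
print(looks_like_compacting(relocation_phase))  # False
print(looks_like_compacting(compacting_phase))  # True
```

Timestamp lines and any other trace content are simply skipped, so the functions can be fed the raw lines of the trace file.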
During compacting, the EST_MINUTES column of V$ASM_OPERATION shows 0 (also an important clue):
16:08:56 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         2 REBAL RUN          10      98271      98305       7919           0
The REBALST_KFGMG column of the fixed table X$KFGMG shows 2, which means compacting is in progress:
16:09:12 SQL> select NUMBER_KFGMG, OP_KFGMG, ACTUAL_KFGMG, REBALST_KFGMG from X$KFGMG;

NUMBER_KFGMG   OP_KFGMG ACTUAL_KFGMG REBALST_KFGMG
------------ ---------- ------------ -------------
           1          1           10             2
Once the compacting phase completes, the ASM alert log shows "stopping process ARB0" and "rebalance completed":
Tue Jan 10 16:10:19 2017
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)
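Once extents relocation has completed, all data meets the redundancy requirement again, and there is no longer a risk of a serious failure if the partner disk of the failed disk fails as well.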
Changing the power
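The rebalance power can be changed dynamically while a disk group rebalance is running. If you think the default power is too low, it is easy to raise it. But raise it to what? That depends on your system's I/O load and I/O throughput. In general, start with a conservative value such as 5, wait about ten minutes, and check whether the rebalance has sped up and whether the I/O of other workloads has been affected. If your I/O subsystem is very capable, you can keep raising the power, but in my experience it is rare to see much further improvement beyond a power of 30. The key point is to test under the normal load of your production system: different workloads and different storage systems can lead to large differences in rebalance time.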
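One way to judge whether a power change helped is to compare the observed relocation rate before and after it, using two SOFAR readings and the times at which they were taken. The sketch below shows the arithmetic with the two samples from the drop disk example. The function name observed_rate is hypothetical, and it assumes both samples were taken on the same day and belong to the same run of the operation, so that SOFAR is comparable between them.

```python
from datetime import datetime


def observed_rate(time1, sofar1, time2, sofar2):
    """Allocation units moved per minute between two samples.

    Hypothetical helper: time1 and time2 are HH:MM:SS strings from the
    SQL*Plus prompt (same day), sofar1 and sofar2 are the SOFAR values
    read from GV$ASM_OPERATION at those times.
    """
    fmt = "%H:%M:%S"
    start = datetime.strptime(time1, fmt)
    end = datetime.strptime(time2, fmt)
    elapsed_seconds = (end - start).total_seconds()
    if elapsed_seconds <= 0:
        raise ValueError("second sample must be later than the first")
    return (sofar2 - sofar1) * 60.0 / elapsed_seconds


# Samples from the drop disk example: SOFAR=4 at 14:49:04, SOFAR=3030 at 14:50:22.
rate = observed_rate("14:49:04", 4, "14:50:22", 3030)
print(round(rate))  # 2328
```

The result of about 2328 AUs per minute is close to the EST_RATE of 2429 reported in the second sample. Taking one such measurement before the power change and another ten minutes after it gives a like-for-like comparison under the same workload.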
Source: ITPUB blog, link: http://blog.itpub.net/25462274/viewspace-2156413/. Please credit the source when reposting.