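HP VA7110: Handling a Failed Rebuild After a Disk Failure
One disk in a customer's VA7110 failed, and the array's automatic rebuild failed. The disk was replaced on November 25, but the rebuild failed again. The VA then started a balance operation that had still not completed after a week. I/O was slow, database checkpoint times peaked at more than 200 seconds, and the business was seriously affected.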
#armdsp -a va
Vendor ID:______________________________HP
Product ID:_____________________________A6189B
Array World Wide Name:__________________50060b00001535a8
Array Serial Number:____________________00SG324J0103
Alias:__________________________________va
Software Revision:______________________1.09.02 - 0191 - 060113
Command execution timestamp:____________Nov 24, 2008 5:31:30 PM
------------------------------------------------------------
ARRAY INFORMATION
Array Status:_________________________Warning
Firmware Revision:____________________38370A140P0513051631
Product Revision:_____________________A140
Local Controller Product Revision:____A140
Remote Controller Product Revision:___A140
Last Event Log Entry for Page 1:______140908
Last Event Log Entry for Page 2:______140896
Last Event Log Entry for Page 5:______131219
ENCLOSURES
Enclosure at M
Enclosure ID__________________________0
Enclosure Status______________________Failed
Enclosure Type________________________HP StorageWorks Virtual Array 7110
Node WWN______________________________50060b00001535a8
FRU        HW COMPONENT   IDENTIFICATION                     ID STATUS
===========================================================================
M          Enclosure      00SG324J0103                       Failed
M/P1       Power Supply   94030JD01148                       Good
M/P2       Power Supply   94030JD01145                       Good
M/MP1      MidPlane       000617570117                       Good
M/C2       Controller     00PR00D50084                       Good
M/C2.H1    Host Port
M/C2.J1    BackEnd Port
M/C2.B1    Battery        44298:MOLTECHPS:NI2040:2003/3/12   Good
M/C2.PM1   Processor      HP:A6189B:A140                     Good
M/C2.M1    DIMM           512                                Good
M/C2.M2    DIMM           512                                Good
M/C1       Controller     00PR00D50060                       Good
M/C1.H1    Host Port
M/C1.J1    BackEnd Port
M/C1.B1    Battery        44304:MOLTECHPS:NI2040:2003/3/12   Good
M/C1.PM1   Processor      HP:A6189B:A140                     Good
M/C1.M1    DIMM           512                                Good
M/C1.M2    DIMM           512                                Good
M/D1       Disk           3HX0WX3G                           Good
M/D2       Disk           3HX0X2RN                           Good
M/D3       Disk           3HX0XBBG                           Failed
M/D4       Disk           3HX0X625                           Good
CONTROLLERS
Controller At M/C2:
Status:_______________________________Good
Serial Number:________________________00PR00D50084
Vendor ID:____________________________HP
Product ID:___________________________A6189B
Product Revision:_____________________A140
Firmware Revision:____________________38370A140P0513051631
Manufacturing Product Code:___________IJMTU00016
Controller Type:______________________HP StorageWorks Virtual Array 7110
Battery Charger Firmware Revision:____5.0
Front Port At M/C2.H1:
Status:_____________________________Good
Port Instance:______________________0
Hard Address:_______________________126
Link State:_________________________Link Up
Node WWN:___________________________50060b00001535a8
Port WWN:___________________________50060b00001b7a28
Topology:___________________________Point To Point, Fabric Attached
Data Rate:__________________________2 GBit/sec
Port ID:____________________________0x10000
Device Host Name:___________________zhsmp1
Hardware Path:______________________0/6/2/0.1.0.0.0.0.0
Device Path:________________________/dev/dsk/c6t0d0
Back Port At M/C2.J1:
Status:_____________________________Good
Port Instance:______________________0
Hard Address:_______________________125
Link State:_________________________Link Up
Node WWN:___________________________50060b00001535a8
Port WWN:___________________________50060b00001b7a29
Topology:___________________________Private Loop
Data Rate:__________________________2 GBit/sec
Port ID:____________________________125
Battery at M/C2.B1:
Status:_____________________________Good
Identification:_____________________44298:MOLTECHPS:NI2040:2003/3/12
Manufacturer Name:__________________MOLTECHPS
Device Name:________________________NI2040
Manufacturer Date:__________________March 12, 2003
Remaining Capacity:_________________5700 mAh
Remaining Capacity:_________________95 %
Voltage:____________________________12349 mVolts
Discharge Cycles:___________________2
Processor at M/C2.PM1:
Status:_____________________________Good
Identification:_____________________HP:A6189B:A140
DIMM at M/C2.M1:
Status:_____________________________Good
Identification:_____________________512
Capacity:___________________________512 MB
DIMM at M/C2.M2:
Status:_____________________________Good
Identification:_____________________512
Capacity:___________________________512 MB
Controller At M/C1:
Status:_______________________________Good
Serial Number:________________________00PR00D50060
Vendor ID:____________________________HP
Product ID:___________________________A6189B
Product Revision:_____________________A140
Firmware Revision:____________________38370A140P0513051631
Manufacturing Product Code:___________IJMTU00016
Controller Type:______________________HP StorageWorks Virtual Array 7110
Battery Charger Firmware Revision:____5.0
Front Port At M/C1.H1:
Status:_____________________________Good
Port Instance:______________________0
Hard Address:_______________________126
Link State:_________________________Link Up
Node WWN:___________________________50060b00001535a8
Port WWN:___________________________50060b00001b67ac
Topology:___________________________Point To Point, Fabric Attached
Data Rate:__________________________2 GBit/sec
Port ID:____________________________0x10000
Device Host Name:___________________zhsmp1
Hardware Path:______________________0/4/0/0.1.0.0.0.0.0
Device Path:________________________/dev/dsk/c4t0d0
Back Port At M/C1.J1:
Status:_____________________________Good
Port Instance:______________________0
Hard Address:_______________________125
Link State:_________________________Link Up
Node WWN:___________________________50060b00001535a8
Port WWN:___________________________50060b00001b67ad
Topology:___________________________Private Loop
Data Rate:__________________________2 GBit/sec
Port ID:____________________________125
Battery at M/C1.B1:
Status:_____________________________Good
Identification:_____________________44304:MOLTECHPS:NI2040:2003/3/12
Manufacturer Name:__________________MOLTECHPS
Device Name:________________________NI2040
Manufacturer Date:__________________March 12, 2003
Remaining Capacity:_________________5821 mAh
Remaining Capacity:_________________97 %
Voltage:____________________________12575 mVolts
Discharge Cycles:___________________2
Processor at M/C1.PM1:
Status:_____________________________Good
Identification:_____________________HP:A6189B:A140
DIMM at M/C1.M1:
Status:_____________________________Good
Identification:_____________________512
Capacity:___________________________512 MB
DIMM at M/C1.M2:
Status:_____________________________Good
Identification:_____________________512
Capacity:___________________________512 MB
PORTS
Settings for port M/C2.H1:
Port ID:______________________________108
Behavior:_____________________________HPUX
Topology:_____________________________Point To Point, Fabric Attached
Queue Full Threshold:_________________4
Data Rate:____________________________2 GBit/sec
Settings for port M/C2.J1:
Data Rate:____________________________2 GBit/sec
Settings for port M/C1.H1:
Port ID:______________________________110
Behavior:_____________________________HPUX
Topology:_____________________________Point To Point, Fabric Attached
Queue Full Threshold:_________________4
Data Rate:____________________________2 GBit/sec
Settings for port M/C1.J1:
Data Rate:____________________________2 GBit/sec
DISKS
Disk at M/D1:
Status:_______________________________Good
Disk State:___________________________Included
Vendor ID:____________________________HP 36.4G
Product ID:___________________________ST336753FC
Product Revision:_____________________HP03
Data Capacity:________________________33.378 GB (70000000 blocks)
Block Length:_________________________520 bytes
Address:______________________________111
Node WWN:_____________________________2000000c5029bcf8
Initialize State:_____________________Ready
Redundancy Group:_____________________1
Volume Set Serial Number:_____________0000C38C0000000A
Serial Number:________________________3HX0WX3G
Firmware Revision:____________________HP03
Recovery Maps are on this disk.
Space is reserved on this disk for subsystem metadata and
may be a map disk.
Disk at M/D2:
Status:_______________________________Good
Disk State:___________________________Included
Vendor ID:____________________________HP 36.4G
Product ID:___________________________ST336753FC
Product Revision:_____________________HP03
Data Capacity:________________________33.378 GB (70000000 blocks)
Block Length:_________________________520 bytes
Address:______________________________112
Node WWN:_____________________________2000000c5029bcca
Initialize State:_____________________Ready
Redundancy Group:_____________________1
Volume Set Serial Number:_____________0000C38C0000000A
Serial Number:________________________3HX0X2RN
Firmware Revision:____________________HP03
Recovery Maps are on this disk.
Space is reserved on this disk for subsystem metadata and
may be a map disk.
Disk at M/D3:
Status:_______________________________Failed
Disk State:___________________________Failed
Vendor ID:____________________________HP 36.4G
Product ID:___________________________ST336753FC
Product Revision:_____________________HP03
Data Capacity:________________________33.378 GB (70000000 blocks)
Block Length:_________________________520 bytes
Address:______________________________113
Node WWN:_____________________________2000000c5029886b
Initialize State:_____________________Ready
Redundancy Group:_____________________1
Volume Set Serial Number:_____________0000C38C0000000A
Serial Number:________________________3HX0XBBG
Firmware Revision:____________________HP03
Disk at M/D4:
Status:_______________________________Good
Disk State:___________________________Included
Vendor ID:____________________________HP 36.4G
Product ID:___________________________ST336753FC
Product Revision:_____________________HP03
Data Capacity:________________________33.378 GB (70000000 blocks)
Block Length:_________________________520 bytes
Address:______________________________114
Node WWN:_____________________________2000000c5029924c
Initialize State:_____________________Ready
Redundancy Group:_____________________1
Volume Set Serial Number:_____________0000C38C0000000A
Serial Number:________________________3HX0X625
Firmware Revision:____________________HP03
LUNS
LUN 0:
Redundancy Group:_____________________1
Active:_______________________________True
Data Capacity:________________________20 MB
WWN:__________________________________60060b00001535a80000000000000010
Number Of Business Copies:____________0
LUN 1:
Redundancy Group:_____________________1
Active:_______________________________True
Data Capacity:________________________11 GB
WWN:__________________________________60060b00001535a80001000000000011
Number Of Business Copies:____________0
LUN 2:
Redundancy Group:_____________________1
Active:_______________________________True
Data Capacity:________________________16 GB
WWN:__________________________________60060b00001535a80002000000000012
Number Of Business Copies:____________0
LUN 3:
Redundancy Group:_____________________1
Active:_______________________________True
Data Capacity:________________________11 GB
WWN:__________________________________60060b00001535a80003000000000013
Number Of Business Copies:____________0
LUN 4:
Redundancy Group:_____________________1
Active:_______________________________True
Data Capacity:________________________16 GB
WWN:__________________________________60060b00001535a80004000000000014
Number Of Business Copies:____________0
CAPACITY Totals for Redundancy Group 1:
REGULAR LUNs:_________________________54.019 GB
BUSINESS COPIES:______________________0 bytes
CAPACITY USAGE
Total Disk Enclosures:________________1
Redundancy Group:_____________________1
Total Disks:________________________3
Total Physical Size:________________100.135 GB
Allocated to Regular LUNs:__________54.019 GB
Allocated as Business Copies:_______0 bytes
Used as Active Hot Spare:___________0 bytes
Used for Redundancy:________________46.116 GB
Unallocated (Available for LUNs):___0 bytes
Used by Non-Included Disks:___________33.378 GB
VFP
Settings for VFP Serial Port M/C1.VFP:
VFP Baud Rate:________________________9600
VFP Paging Value:_____________________24
Settings for VFP Serial Port M/C2.VFP:
VFP Baud Rate:________________________9600
VFP Paging Value:_____________________24
SUB-SYSTEM SETTINGS
RAID Level:___________________________RAID1+0
Auto Format Drive:____________________On
Hang Detection:_______________________On
Capacity Depletion Threshold:_________100%
Queue Full Threshold Maximum:_________4096
Enable Optimize Policy:_______________True
Enable Manual Override:_______________False
Manual Override Destination:__________False
Read Cache Disable:___________________False
Rebuild Priority:_____________________Low
Security Enabled:_____________________False
Shutdown Completion:__________________0
Subsystem Type ID:____________________0
Unit Attention:_______________________True
Volume Set Partition (VSpart):________False
Write Cache Enable:___________________True
Write Working Set Interval:___________8640
Enable Prefetch:______________________True
Disable Secondary Path Presentation:__False
Backend Diagnostics:__________________On
RESILIENCY SETTINGS
Simplified Resiliency Setting:________Normal Performance (Default)
Enable Secure Mode:___________________True
Disable NVRAM on UPS Absent:__________False
Disable NVRAM on WCE False:___________False
Disable Read Hits:____________________False
Force Unit Access Response:___________1
Lock Write Cache On:__________________True
Performance Goal Configuration:_______Normal Performance
Resiliency Threshold:_________________4
Single Controller Warning:____________True
DISK SETTINGS
Auto Include:_________________________On
Auto Rebuild:_________________________On
Hot Spare:____________________________None
Max Drives per Loop Pair:_____________45
Max Drives per Subsystem:_____________45
ENCLOSURE SETTINGS
Max Enclosures per Loop Pair:_________2
Max Enclosures per Subsystem:_________2
LUN SETTINGS
LUN Creation Limit:___________________1024
Maximum LUN Creation Limit:___________1024
Migrating Write Destination:__________False
SCRUB SETTINGS
Scrub Restart Period:_________________0 minutes
Scrub State:__________________________Not running, system in warning state
OPERATIONS IN PROGRESS
None
WARNINGS
WARNING: Unallocated capacity has fallen below the threshold specified by the capacity threshold mode parameter.
WARNING: A physical drive has failed, failed initialization, been downed or is in the previously used state.
WARNING: A rebuild operation failed.
WARNING: Some data in the device lacks redundancy, and is exposed to becoming unavailable if further drive removals or failures occur.
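A report this long is easier to triage with a small filter. The following is a minimal sketch, not part of the original procedure: `summarize_armdsp` is a hypothetical helper name, and it assumes the report is fed in on standard input and that FRU table rows start with the FRU location (`M` or `M/...`) and end with the status.

```shell
#!/bin/sh
# Hypothetical helper: summarize an `armdsp -a` report read from stdin.
# Assumption: FRU table rows begin with "M" or "M/..." and end with the status.
summarize_armdsp() {
    awk '
        # FRU table rows whose last field is "Failed"
        $1 ~ /^M(\/|$)/ && $NF == "Failed" { printf "FAILED %s (%s)\n", $1, $2 }
        # Count the warning lines at the end of the report
        /^ *WARNING:/ { warnings++ }
        END { print "WARNINGS " warnings + 0 }
    '
}

# Possible usage on the host that manages the array:
#   armdsp -a va | summarize_armdsp
```

Run against the report above, this would list the enclosure `M` and the disk `M/D3` as failed and count four warnings.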
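Cause:
The rebuild failed because the VA did not have enough free space. The array has only four disks, configured as RAID1+0 with no hot spare, so the rebuild could not succeed even after the failed disk was replaced. The VA then started a balance operation, which pauses whenever there is front-end I/O, so the balance had still not completed after a week.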
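The shortage can be read directly off the CAPACITY USAGE section of the report. The sketch below is a rough estimate only: it assumes RAID1+0 needs one full mirror copy of every allocated block and ignores the array's own metadata reservation, so it is not HP's internal accounting. `mirror_shortfall` is a hypothetical helper name. Its arguments are the total physical size of the included disks and the capacity allocated to LUNs, both in GB.

```shell
#!/bin/sh
# Rough estimate, assuming RAID1+0 stores every allocated block twice.
# $1 = total physical size of the included disks (GB)
# $2 = capacity allocated to regular LUNs (GB)
mirror_shortfall() {
    awk -v physical="$1" -v luns="$2" 'BEGIN {
        needed = 2 * luns    # data plus one full mirror copy
        printf "needed=%.3f shortfall=%.3f\n", needed, needed - physical
    }'
}

# Figures from the report: 100.135 GB physical on three disks, 54.019 GB of LUNs
mirror_shortfall 100.135 54.019
```

With the reported figures this prints `needed=108.038 shortfall=7.903`. That matches the gap between the 54.019 GB allocated to LUNs and the 46.116 GB reported as used for redundancy, and it is consistent with the warning that some data lacks redundancy.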
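Solution:
Because the rebuild failed due to insufficient free space on the VA itself, HP suggested the following options:
1. Back up the data, format the VA, recreate all LUNs, and restore the data.
2. Delete unused LUNs to free up space so that the balance completes faster.
3. Stop front-end I/O so that the balance completes faster, then rebuild manually.
4. Add two disks so that there is more free space and the balance completes faster.
5. Convert RAID1+0 to RAID5 to free up space so that the balance completes faster.
Procedure:
Of the five options above, options 2 and 5 were clearly not acceptable. Since the business could be stopped temporarily, option 3 was tried first: the applications and the database were shut down. After the whole night of December 3 the balance had still not completed and the business was still affected, so this had little effect. On December 4, option 1 was therefore adopted. The procedure was as follows:
1. Back up the volume group configuration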
#vgcfgbackup vg01
#vgcfgbackup vg02
#vgcfgbackup vglock
2. Stop the applications and the database
#cmhaltcl
3. Format the VA
#armfmt -f va
4. Recreate the VA LUNs with the same layout as before
#armcfg -L 0 -a 20M -g 1 va
#armcfg -L 1 -a 11G -g 1 va
#armcfg -L 2 -a 16G -g 1 va
#armcfg -L 3 -a 11G -g 1 va
#armcfg -L 4 -a 16G -g 1 va
5. Scan the disks so that the LUNs above are recognized
#insf -C disk
6. Restore the volume group and logical volume information
#vgcfgrestore -n vg01 /dev/rdsk/c4t0d1
#vgcfgrestore -n vg01 /dev/rdsk/c4t0d3
#vgcfgrestore -n vg02 /dev/rdsk/c4t0d2
#vgcfgrestore -n vg02 /dev/rdsk/c4t0d4
#vgcfgrestore -n vglock /dev/rdsk/c4t0d0
7. Start the cluster services and activate the volume groups
#cmruncl -v
#vgchange -a y vg01
#vgchange -a y vg02
#vgchange -a y vglock
8. Restore the database
#su - informix
$ontape -r
y
n
n
n
9. Start the cluster package and run failover tests...
#cmrunpkg -e -v pkg_smp
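Because step 3 destroys everything on the array, it can help to rehearse the sequence before running it. The sketch below wraps steps 1 to 7 in a dry-run script: `DRY_RUN`, `run` and `rebuild_va` are hypothetical names, while the commands and arguments are the ones listed above, written with plain ASCII hyphens. The database restore (`ontape -r`) is interactive and, like the package start, is left to be run by hand.

```shell
#!/bin/sh
# Sketch only. DRY_RUN, run() and rebuild_va() are hypothetical names; the
# commands themselves are the ones from steps 1-7 of the procedure.
DRY_RUN=${DRY_RUN:-1}    # default: print the commands, change nothing

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "DRY-RUN: $*"
    else
        # Stop at the first failing step; later steps depend on earlier ones.
        "$@" || { echo "FAILED: $*" >&2; exit 1; }
    fi
}

rebuild_va() {
    # 1. Back up the volume group configuration
    for vg in vg01 vg02 vglock; do
        run vgcfgbackup "$vg"
    done
    # 2. Halt the cluster
    run cmhaltcl
    # 3. Format the array (destroys all data on it)
    run armfmt -f va
    # 4. Recreate the LUNs, given as number:size pairs
    for lun in 0:20M 1:11G 2:16G 3:11G 4:16G; do
        run armcfg -L "${lun%%:*}" -a "${lun#*:}" -g 1 va
    done
    # 5. Rescan so the host sees the new LUNs
    run insf -C disk
    # 6. Restore the LVM configuration, given as volume-group:device pairs
    for pair in vg01:/dev/rdsk/c4t0d1 vg01:/dev/rdsk/c4t0d3 \
                vg02:/dev/rdsk/c4t0d2 vg02:/dev/rdsk/c4t0d4 \
                vglock:/dev/rdsk/c4t0d0; do
        run vgcfgrestore -n "${pair%%:*}" "${pair#*:}"
    done
    # 7. Start the cluster and activate the volume groups
    run cmruncl -v
    for vg in vg01 vg02 vglock; do
        run vgchange -a y "$vg"
    done
}

rebuild_va
```

By default the script only prints each command prefixed with `DRY-RUN:`. Setting `DRY_RUN=0` would execute them for real, stopping at the first command that fails.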
Source: ITPUB Blog, link: http://blog.itpub.net/9479798/viewspace-1050072/. If you repost this article, please credit the source; otherwise legal liability will be pursued.