mysql crash with disk sector error

Steven1981發表於2011-01-11

mysql: 5.1.40
os: RHEL 5u4
DISK: 12*300GSAS RAID10 , 512M CACHE

InnoDB: Error: tried to read 16384 bytes at offset 0 151273472.
InnoDB: Was only able to read 4096.
110111 8:52:18 InnoDB: Operating system error number 22 in a file operation.
InnoDB: Error number 22 means 'Invalid argument'.

[@more@]

################################# master-error.log
InnoDB: Error: tried to read 16384 bytes at offset 0 151273472.
InnoDB: Was only able to read 4096.
110111 8:52:18 InnoDB: Operating system error number 22 in a file operation.
InnoDB: Error number 22 means 'Invalid argument'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/operating-system-error-codes.html
InnoDB: File operation call: 'read'.
InnoDB: Cannot continue operation.
mysqld: my_new.cc:51: int __cxa_pure_virtual(): Assertion `! "Aborted: pure virtual method called."' failed.
110111 08:52:20 mysqld_safe Number of processes running now: 0
110111 08:52:20 mysqld_safe mysqld restarted
InnoDB: Log scan progressed past the checkpoint lsn 2652 2422336294
110111 8:52:24 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 2652 2427578880
InnoDB: Doing recovery: scanned up to log sequence number 2652 2432821760
InnoDB: Doing recovery: scanned up to log sequence number 2652 2438064640
InnoDB: Doing recovery: scanned up to log sequence number 2652 2443307520
InnoDB: Doing recovery: scanned up to log sequence number 2652 2448550400
InnoDB: Doing recovery: scanned up to log sequence number 2652 2453793280
InnoDB: Doing recovery: scanned up to log sequence number 2652 2459036160
InnoDB: Doing recovery: scanned up to log sequence number 2652 2464279040
InnoDB: Doing recovery: scanned up to log sequence number 2652 2469521920
InnoDB: Doing recovery: scanned up to log sequence number 2652 2474764800
InnoDB: Doing recovery: scanned up to log sequence number 2652 2480007680
InnoDB: Doing recovery: scanned up to log sequence number 2652 2485250560
InnoDB: Doing recovery: scanned up to log sequence number 2652 2486578101
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 2 row operations to undo
InnoDB: Trx id counter is 3 3412602624
110111 8:52:37 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 3573539, file name mysql-bin.002074
InnoDB: Last MySQL binlog file position 0 274872768, file name /home/mysql/data/mysql/mysql-bin.000194
InnoDB: Starting in background the rollback of uncommitted transactions
110111 8:56:53 InnoDB: Rolling back trx with id 3 3412602355, 2 rows to undo
110111 8:56:53 InnoDB: Started; log sequence number 2652 2486578101
110111 8:56:53 [Note] Recovering after a crash using /home/mysql/data/mysql/mysql-bin

InnoDB: Rolling back of trx id 3 3412602355 completed
110111 8:56:53 InnoDB: Rollback of non-prepared transactions completed
110111 8:56:55 [Note] Starting crash recovery...
110111 8:56:55 [Note] Crash recovery finished.
110111 8:56:55 [Note] Event Scheduler: Loaded 0 events
110111 8:56:55 [Note] /usr/sbin/mysqld: ready for connections.

################################# 鷹眼報警
Hardware Monitor Notice: Critical - 1 disk(s) degraded, 1 disk(s) failed at Ctrler 0 and Slot 7 Disk is Failed Merr is 0

################################# dmesg 資訊
sd 0:2:0:0: SCSI error: return code = 0x00070002
end_request: I/O error, dev sda, sector 3151974485
sd 0:2:0:0: SCSI error: return code = 0x00070002
end_request: I/O error, dev sda, sector 3151974525


所以今天早上的CRASH基本可以判斷為磁碟有壞道導致CRASH。但這也需要非常的巧合;
當時MYSQL在讀16K資料的時候(read 16384 bytes at offset 0 151273472),讀到4K後發生了一個盤磁碟損壞(剛好這裡有壞道)。
我們是RAID10,正常情況下,RAID1中的某個盤壞了,RAID系統自動會到RAID1的另一個盤中讀資料,但這僅當"盤FAIL“的時候會這樣,
當只是有壞道的時候,RAID系統不作這樣的切換;而MYSQL接受到了一個錯誤返回;

也就是說,如果當時讀的不是這個壞道的資料,也許就不會CRASH ;

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/703656/viewspace-1044445/,如需轉載,請註明出處,否則將追究法律責任。

相關文章