mysql效能測試庫的一次CRASH恢復

Steven1981發表於2009-07-16
環境:
OS: Linux als_qa_rhel5.3 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
MYSQL:MySQL-server-enterprise-gpl-5.0.68-0.rhel4.x86_64
[@more@]

一個MYSQL的效能測試資料庫, 在高壓力測試過程中, 忽然CRASH,
以下為MYSQL重啟時的,LOG日誌:

--------------------------------------------------------
090716 09:31:02 mysqld started
InnoDB: Log scan progressed past the checkpoint lsn 32 2512592438
090716 9:31:03 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 32 2517835264
InnoDB: Doing recovery: scanned up to log sequence number 32 2523078144
InnoDB: Doing recovery: scanned up to log sequence number 32 2528321024
InnoDB: Doing recovery: scanned up to log sequence number 32 2533563904
InnoDB: Doing recovery: scanned up to log sequence number 32 2538806784
InnoDB: Doing recovery: scanned up to log sequence number 32 2544049664
InnoDB: Doing recovery: scanned up to log sequence number 32 2549292544
InnoDB: Doing recovery: scanned up to log sequence number 32 2554535424
InnoDB: Doing recovery: scanned up to log sequence number 32 2559778304
InnoDB: Doing recovery: scanned up to log sequence number 32 2565021184
InnoDB: Doing recovery: scanned up to log sequence number 32 2570264064
InnoDB: Doing recovery: scanned up to log sequence number 32 2575506944
InnoDB: Doing recovery: scanned up to log sequence number 32 2580749824
InnoDB: Doing recovery: scanned up to log sequence number 32 2585992704
InnoDB: Doing recovery: scanned up to log sequence number 32 2591235584
InnoDB: Doing recovery: scanned up to log sequence number 32 2596478464
InnoDB: Doing recovery: scanned up to log sequence number 32 2601721344
InnoDB: Doing recovery: scanned up to log sequence number 32 2606964224
InnoDB: Doing recovery: scanned up to log sequence number 32 2612207104
InnoDB: Doing recovery: scanned up to log sequence number 32 2617449984
InnoDB: Doing recovery: scanned up to log sequence number 32 2622692864
InnoDB: Doing recovery: scanned up to log sequence number 32 2627935744
InnoDB: Doing recovery: scanned up to log sequence number 32 2633088581
090716 9:31:05 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 22007701, file name ./mysql-bin.000149
090716 9:31:49 InnoDB: Started; log sequence number 32 2633088581
090716 9:31:49 [Note] Recovering after a crash using mysql-bin
090716 9:31:49 [Note] Starting crash recovery...
090716 9:31:49 [Note] Crash recovery finished.
090716 9:31:49 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.68-enterprise-gpl-log' socket: '/tmp/mysql.sock' port: 3306 MySQL Enterprise Server (GPL)
090716 9:31:59InnoDB: Assertion failure in thread 1177917760 in file btr0cur.c line 3606
InnoDB: Failing assertion: extern_len - part_len == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to .
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
InnoDB: about forcing recovery.
090716 9:31:59 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=335544320
read_buffer_size=1048576
max_used_connections=2
max_connections=1000
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 2375680 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
frame pointer is NULL, did you compile with
-fomit-frame-pointer? Aborting backtrace!
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
090716 09:31:59 mysqld restarted
--------------------------------------------------------------

在日誌中我們看到,INNODB recovery成功後,並顯示
090716 9:31:49 [Note] /usr/sbin/mysqld: ready for connections.
這是正常的情況,但緊接著,就重新啟動了.
這時,客戶端連線不上.
但檢視MYSQL程式,卻又還在.
用SERVICE MYSQL STOP 無法停止庫. 只能用KILL -9來殺掉.

目前來看,只能是猜測遇到BUG.(這個版本在其他庫跑得比較正常)
而且這樣的BUG不能重現.有可能只有在高壓力的情況下才會出現.

遇到這樣的情況,我們最重要的是把資料備份出來.
所以儘可能把庫開啟,把資料備份出來.

比較幸運的事. 這一次資料庫INNODB表空間沒有被破壞.
有MY.CNF配置檔案中加入:
[mysqld]
innodb_force_recovery = 2

資料庫順利開啟.這時趕緊用MYSQLDUMP 做個全庫匯出.

然後把庫全部刪除, 重新安裝MYSQL.

最後把庫全部匯入...

雖然庫是恢復了. 但想找找原因.為什麼innodb_force_recovery = 2 就可以開啟了呢.
http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html上得到:

1 (SRV_FORCE_IGNORE_CORRUPT)

Let the server run even if it detects a corrupt page. Try to make SELECT * FROM tbl_name jump over corrupt index records and pages, which helps in dumping tables.

2 (SRV_FORCE_NO_BACKGROUND)

Prevent the main thread from running. If a crash would occur during the purge operation, this recovery value prevents it.

3 (SRV_FORCE_NO_TRX_UNDO)

Do not run transaction rollbacks after recovery.

4 (SRV_FORCE_NO_IBUF_MERGE)

Prevent insert buffer merge operations. If they would cause a crash, do not do them. Do not calculate table statistics.

5 (SRV_FORCE_NO_UNDO_LOG_SCAN)

Do not look at undo logs when starting the database: InnoDB treats even incomplete transactions as committed.

6 (SRV_FORCE_NO_LOG_REDO)

Do not do the log roll-forward in connection with recovery.

其中2是說: 如果purge operation導致CRASH,那麼這個設定就會生效.

這裡的purge operation到底是什麼樣的操作. 搜尋中............

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/703656/viewspace-1024178/,如需轉載,請註明出處,否則將追究法律責任。

相關文章