Raw Device vs File System

Posted by tengrid on 2010-11-09

Original article

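1. In the article, Tom explains in detail the difference between a file system and a raw device, and the differences among the four storage options we commonly use:
1) "Cooked" operating system (OS) file systems
2) Raw partitions
3) Automatic Storage Management (ASM)
4) Clustered file system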

(There is also NFS: when Taobao migrated from MySQL to Oracle in 2004, it ran RAC on NetApp NFS devices.)

He also explains that a raw device does not necessarily perform better than a file system: it depends!


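2. The focus of tuning is reducing the number of I/Os. This usually starts with application design (for example, more caching and batched commits), and only then moves on to spreading the I/O out (striping, mirroring).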
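As a rough illustration of the batched-commit point, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in (it is not Oracle, and the table `t` and the helper names are hypothetical); the only thing it shows is how the number of commits drops when rows are committed in batches instead of one at a time.

```python
import sqlite3


def make_conn():
    """Create an in-memory database with a small demo table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    conn.commit()
    return conn


def insert_rows(conn, rows, batch_size):
    """Insert rows, committing once per batch; return the number of commits issued."""
    commits = 0
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        cur.executemany("INSERT INTO t (id, val) VALUES (?, ?)", batch)
        conn.commit()  # one commit per batch
        commits += 1
    return commits


rows = [(i, f"row-{i}") for i in range(1000)]
row_by_row = insert_rows(make_conn(), rows, batch_size=1)
batched = insert_rows(make_conn(), rows, batch_size=100)
print(row_by_row, batched)  # 1000 10
```

On Oracle, each commit makes the client wait on log file sync (see note 7), so fewer commits means fewer of those waits. The batch size of 100 is an arbitrary choice for the sketch, not a recommendation from the article.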

3. ASM is recommended. If ASM cannot be used, use a file system (with the direct I/O option). Using raw devices directly is not recommended.

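4. The performance of a SQL statement cannot be compared with a dd test.
     If you must compare, the closest match is insert append (direct path write).
    It is better to measure storage performance with the Orion tool, which simulates Oracle's read/write behavior, and then compare that with the dd results. In the article, Tom and Kevin explain the reasons in detail, and also explain the benefits of using raw (through ASM).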

ASM combines the manageability of a file system with the performance of raw.

ASM comes with the database
ASM doesn't require additional software to be purchased.
ASM does database stuff only, it is optimized to be a database file system and nothing else
ASM provides clustered access without any additional software to be purchased
ASM provides the DBA with management features that the DBA should have - not the "SA", the DBA

5. Reducing logical I/Os (LIOs) often reduces physical I/Os (PIOs) as well.

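6. SSDs are not recommended: they are expensive, and performance gains obtained in other ways may cancel out their advantage.
   In fact, a few years ago I read a report from our company's new-technology evaluation department on testing Huawei SSDs: SSDs only suit certain I/O patterns. They may be acceptable for small-batch workloads, and Baidu is reportedly using them heavily.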

 

7. Clients do not wait on the events below; they do wait on log file sync.
but clients do NOT wait for db file parallel write, log file parallel write, control file parallel write.

clients do wait for log file sync, every time you issue COMMIT.

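8. Besides mirroring the control files and redo logs at the Oracle level, it is recommended to mirror at the hardware level as well, for example by placing them on RAID 10.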

9. The order in which data is looked up; note that 11gR2 adds the flash cache to this order.
A read from the buffer cache.

It would be preferable to find the data we need in:

a) the buffer cache
b) the flash cache (new in 11g Release 2)
c) the OS filesystem cache
d) on disk

in that order. Now, if you make the buffer cache larger - then we'll never get to (c) and (c) is much much slower than (a)

(a) would be preferred. So, make the cache large enough to account for the secondary SGA effect when you switch from cooked to raw (ASM would be preferred actually, it is raw, but easier/more flexible to manage database stuff with)
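The lookup order quoted above can be sketched as a walk through an ordered list of tiers. This is only a toy model under stated assumptions: the function `serving_tier`, the tier contents, and the block ids are all hypothetical, and it says nothing about how Oracle implements any of these caches.

```python
def serving_tier(block_id, tiers):
    """Return the name of the first tier, in order, that holds block_id.

    tiers is an ordered list of (name, set_of_block_ids); anything not
    found in a cache tier is assumed to come from disk.
    """
    for name, cached_blocks in tiers:
        if block_id in cached_blocks:
            return name
    return "disk"


# A small buffer cache: block 4 is only found in the OS filesystem cache.
small_cache = [
    ("buffer cache", {1, 2}),
    ("flash cache", {3}),
    ("OS filesystem cache", {4, 5}),
]

# A larger buffer cache holds the same blocks itself, so the lower tiers are never reached.
large_cache = [
    ("buffer cache", {1, 2, 3, 4, 5}),
    ("flash cache", set()),
    ("OS filesystem cache", set()),
]

print(serving_tier(4, small_cache))  # OS filesystem cache
print(serving_tier(4, large_cache))  # buffer cache
print(serving_tier(9, large_cache))  # disk
```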

10. Continuing from the above: the lookup order (the read logic)
Assuming no flashcache (new 11g R2 feature), we

a) take the DBA (data block address, (block#, file#)) and hash it.
b) we search a list in the cache identified by that hash for this block
c) if we don't find it in cache we do physical IO. Physical IO might come from OS filesystem cache or not, we don't know, we don't care, we don't influence that at all.
d) we put block into that list in the cache and return it.
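Steps a) to d) can be sketched as a small hash-bucket cache. This is a minimal illustration, not Oracle's actual implementation: the class `BufferCache`, the `fake_disk` reader, and the bucket count are hypothetical, Python's built-in `hash` stands in for whatever hash function the database uses, and eviction is left out entirely.

```python
from collections import defaultdict


class BufferCache:
    """Toy buffer cache keyed by data block address (file#, block#)."""

    def __init__(self, n_buckets, read_from_disk):
        self.n_buckets = n_buckets
        self.buckets = defaultdict(list)  # bucket index -> list of (dba, block)
        self.read_from_disk = read_from_disk
        self.logical_reads = 0
        self.physical_reads = 0

    def get(self, file_no, block_no):
        self.logical_reads += 1
        dba = (file_no, block_no)
        # a) hash the data block address to pick a bucket
        chain = self.buckets[hash(dba) % self.n_buckets]
        # b) search the list identified by that hash
        for cached_dba, block in chain:
            if cached_dba == dba:
                return block
        # c) not in cache: do physical I/O (whether the OS cache serves it is invisible here)
        self.physical_reads += 1
        block = self.read_from_disk(file_no, block_no)
        # d) put the block into that list and return it
        chain.append((dba, block))
        return block


def fake_disk(file_no, block_no):
    """Hypothetical stand-in for a physical read."""
    return f"contents of file {file_no}, block {block_no}"


cache = BufferCache(n_buckets=8, read_from_disk=fake_disk)
first = cache.get(4, 130)
second = cache.get(4, 130)
print(cache.logical_reads, cache.physical_reads)  # 2 1
```

In this sketch every physical read starts out as a logical read that missed the cache, which is consistent with note 5: issuing fewer logical reads leaves fewer opportunities for a physical read.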



Tom's knowledge is broad and precise; he truly deserves to be called the Oracle "encyclopedia".

From the ITPUB blog, link: http://blog.itpub.net/94384/viewspace-677828/. If you repost this, please credit the source; otherwise legal action will be pursued.
