Oracle LOB儲存知識(zt)

tolywang發表於2007-06-28

LOB儲存筆記

本文討論內部LOB段(CLOB,BLOB,NCLOB)如何選擇正確的儲存引數。

將會討論如下幾點

  1. Concept
  2. Enable/Disable storage in Row
  3. Chunk Size
  4. PCTVERSION
  5. Cache/NoCache
  6. Logging/Nologging
  7. 其它Storage 選項

Internal LOBS Concept

CLOB,NCLOB用來儲存超過4000Byte的文字大欄位,如字元文字;BLOB用來儲存二進位制欄位,如圖片,Word文件等。

在儲存方面,LOB欄位可以分為2部分,Lob Segment和Lob Index Segment。 在表的資料段中只儲存一個LOB Locator(當Disable storage in Row或者Enable storage in row的時候lob欄位大於4000byte),LOB Locator指向Lob Index 中的Lob Segment 的Chunks,再透過Lob Index Entry訪問到具體的Lob Segment Chunk.

SQL> create table t ( id int primary key,txt clob);

Table created.

SQL> col SEGMENT_NAME format a40

SQL> select segment_name,segment_type from user_segments;

SEGMENT_NAME SEGMENT_TYPE

---------------------------------------- ------------------

T TABLE

SYS_IL0000007665C00002$$ LOBINDEX

SYS_C002647 INDEX

SYS_LOB0000007665C00002$$ LOBSEGMENT

建立帶有clob欄位的表,段SYS_C002647 為維護主鍵的索引段; SYS_LOB0000007665C00002$$ 為系統生成的Lob資料段,SYS_IL0000007665C00002$$為系統生成的Lob索引段。

LOB資料段和LOB索引段具有相同的儲存引數。

SQL> set pagesize 999 long 2000

SQL> select dbms_metadata.get_ddl('TABLE','T') from dual;

DBMS_METADATA.GET_DDL('TABLE','T')

---------------------------------------------------------------

CREATE TABLE "ORACLE"."T"

( "ID" NUMBER(*,0),

"TXT" CLOB,

PRIMARY KEY ("ID")

USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255

STORAGE(INITIAL 131072 NEXT 131072 MINEXTENTS 1 MAXEXTENTS 2147483645 CTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) ABLESPACE "TOOLS" ENABLE

) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING

STORAGE(INITIAL 131072 NEXT 131072 MINEXTENTS 1 MAXEXTENTS 2147483645 CTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)TABLESPACE "TOOLS"

LOB ("TXT") STORE AS (TABLESPACE "TOOLS" ENABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 10 NOCACHE STORAGE(INITIAL 131072 NEXT 131072 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT))

如上為LOB ("TXT") 指定的儲存引數同樣適用於Lob Index Segment。

TABLESPACE "TOOLS" 為Lob data segment 和lob index segment指定了儲存表空間,因此可以為LOB段指定與表段不同的表空間。但是Lob data segment和lob index segment必須儲存在相同的表空間裡。

通常為了管理的方便性和效能原因,通常將LOB段放在不同於表段的表空間裡面。

IN ROW Clause

預設是" ENABLE STORAGE IN ROW“,表示當lob 資料小余4000byte的時候(actual maximum in-line LOB is 3964 bytes.)將lob資料和其他欄位,一起儲存在表段裡面;當lob資料大於4000byte的時候,將lob資料儲存在lob data segment中,僅僅在表段裡面儲存指向Lob Index的Lob Locator.

當指定DISABLE STOREGE IN ROW的時候,不管LOB 資料大小,都將lob資料儲存在lob data segment中,僅僅在表段裡面儲存指向Lob Index的Lob Locator.

當LOB資料不是很大,例如Descript CLOB,通常大部分Descript都較小,僅僅有少數Descript資料比較大,且該欄位訪問相對較頻繁,因此可以將該欄位設定為" ENABLE STORAGE IN ROW,儲存在表段裡,可以跟data block一起cache在db buffer cache中。

對於大的LOB資料和設定為DISABLE STOREGE IN ROW的LOB資料,都存在lob data segment中。對儲存在lob data segment中lob 資料的訪問,都需要先根據lob locator訪問lob index,再訪問lob data segment 因此,引起額外的消耗在lob index上的邏輯IO.且預設情況下,lob data segment為NOCACHE選項,表示lob data segment將不會被快取在db buffer cache中,每次訪問lob data segment都將是物理I/O,使用direct patch read。

lob index segment同in row lob一樣,會被快取到buffer cache中。Cache/NOCache對lob index無效。Cache/NOCache一會再討論。

如下,比較enable/disable storage in row的效能。

SQL> create table t(id int primary key,in_row clob,out_row clob)

2 lob(in_row) store as (enable storage in row)

3 lob(out_row) store as ( disable storage in row);

Table created.

SQL> insert into t select rownum,owner||' ' || object_name||' ' ||object_type ||' '|| status ,owner||' ' || object_name||' ' ||object_type ||' '|| status from dba_objects where rownum<100;

99 rows created.

SQL> commit;

Commit complete.

SQL> alter session set events '10046 trace name context forever,level 8';

Session altered.

SQL> declare

l_cnt number;

l_data varchar2(32765);

begin

select count(*) into l_cnt from t;

for i in 1..l_cnt

loop

select in_row into l_data from t where id=i;

select out_row into l_data from t where id=i;

end loop;

end;

/

PL/SQL procedure successfully completed.

************************************

SELECT IN_ROW FROM T WHERE ID=:B1

call count cpu elapsed disk query current rows

------- ------ -------- ---------- ---------- ---------- -

Parse 1 0.00 0.00 0 0 0 0

Execute 99 0.00 0.00 0 0 0 0

Fetch 99 0.02 0.02 0 396 0 99

------- ------ -------- ---------- ---------- -------

total 199 0.02 0.03 0 396 0 99

Misses in library cache during parse: 1

Optimizer goal: CHOOSE

Parsing user id: 31 (recursive depth: 1)

Rows Row Source Operation

------- ---------------------------------------------------

99 TABLE ACCESS BY INDEX ROWID T

99 INDEX UNIQUE SCAN SYS_C002648 (object id 7674

**************************************

SELECT OUT_ROW FROM T WHERE ID=:B1

call count cpu elapsed disk query current rows

------- ------ -------- ---------- ---------- ----------

Parse 1 0.00 0.00 0 0 0 0

Execute 99 0.01 0.01 0 0 0 0

Fetch 99 0.05 0.03 99 495 0 99

------- ------ -------- ---------- ---------- ---------- --

total 199 0.06 0.04 99 495 0 99

Misses in library cache during parse: 1

Optimizer goal: CHOOSE

Parsing user id: 31 (recursive depth: 1)

Rows Row Source Operation

------- ---------------------------------------------------

99 TABLE ACCESS BY INDEX ROWID T

99 INDEX UNIQUE SCAN SYS_C002648 (object id 7674)

Elapsed times include waiting on following events:

Event waited on Times Max. Wait Total Waited

---------------------------------------- Waited ----------

direct path read (lob) 99 0.00 0.00

************************************

如上trace結果, enable in row比disable in row使用更少的邏輯I/O和CPU時間,而且disable in row因為預設為NOCACHE選項(lob data不會被快取),因此每次訪問lob資料都要等待direct path read (lob) 事件,增加了物理I/O。同理,在insert 和update lob資料的時候,disable in row效能都要差一些,且預設情況(NOCACHE)都會導致物理I/O,等待 direct path read (lob) 事件或者direct path write(lob)事件。

因此,當lob data相對小與4000byte的時候,使用Enable storage in row,從效能上來說(I/O),是相對較好的選擇。

Chunk Size

當disable storage in row或者enable storage in row的時候,lob資料儲存在LOB data segment的Chunk裡面(CHUNK does NOT affect in-line LOBS); CHUNK大小必須是資料庫block 得整數倍,如8k,16k,32k,Etc.Oracle為每條非空lob 記錄分配至少一個Chunk,且該Chunk為該條lob 記錄專用,不能被其他的LOB記錄共享使用,lob data擴充套件的時候,以Chunk Size為單位分配空間。

表中記錄lob index locator指向lob index entries(每個entry的結構大致為Lob index locator, Chunk ID,Chunk Location),每個entry指向一個Lob Chunk. 也就是說,對於一個LOB 記錄,它記錄在Lob Index裡面的Entry數目等於它的Chunk數目。

必須根據實際情況,選擇合適的Chunk Size,太大了會浪費空間。如果設定Chunk Size 為32K,而大部分lob紀錄大小為7K,則每條lob記錄將會被分配32K的空間,且剩餘的32-7=25K的空間將會被浪費,不能被其他lob 記錄使用。如果設定Chunk Size 為8K,則每個Lob僅剩餘1K的空間,大大降低了空間浪費。

如果太小,則會導致每條Lob紀錄有很多Chunks,如果一個lob 4M,Chunk Size 8k,則一個lob分配512個Chunk,則在Lob Index中就有512個Entry,當有很多4M這樣的LOB時候,Lob Index 就會有很多Entry,當透過Lob Index訪問Lob data Segment的時候就會多出額外的邏輯I/O。
The ultimate goal is to use a CHUNK size that minimizes your ”waste”,but also efficiently stores your data.


PCTVERSION

這個引數關係到lob資料的一致讀。

LOB Index Segment產生UNDO形式和其他段類似。對於In Row儲存的lob資料,它產生UNDO的形式,也和Data block產生UNDO形式一樣。

但對於Out Row儲存的LOB資料,其產生UNDO的形式則有變化。Out Row儲存的LOB資料產生的UNDO將不會被儲存在Undo segment中,而是儲存在該LOB段中。當LOB段被更新的時候,Oracle會分配額外的新Chunk保留更新的資料,而不是在old image chunk上直接修改,回滾和一致讀的時候就可以使用到old image chunk。

觀察對LOB資料的一致讀

SQL> set serveroutput on

SQL> declare

2 l_clob clob;

3 cursor c is select out_row from t where id=1;

4 begin

5 open c;

6 update t set out_row='hello,world' where id=1;

7 commit;

8 fetch c into l_clob;

9 dbms_output.put_line('out_row clob='||l_clob);

10 close c;

11 end;

12 /

out_row clob=SYS ACCESS$ TABLE VALID

PL/SQL procedure successfully completed.

SQL> select * from t where id=1;

ID OUT_ROW

1 hello,world

既然,Out Row的LOB資料的UNDO資訊放在LOB段中,那什麼決定了UNDO資訊的保留時間哪?便有PCTVERSION決定,Oracle保留大概LOB段的百分之PCTVERSION的空間來保留LOB UNDO資訊,超過這個設定的LOB UNDO資訊將會被接下來的LOB Data覆蓋。

1. Before-images of Lobdata Segment, related to the lobdata segment and required to rollback a transaction, are created in the segment itself ;if there are nospace limitations (MAXEXTENTS, no more space in tablespace).

2. Before images, which are no longer necessary, are gradually overwritten. However, Oracle keeps PCTVERSION percent of the entire storage available for older before images.

如下metalink上的解釋:

"STORE AS ( PCTVERSION n )"

PCTVERSION can be changed after creation using:

ALTER TABLE tabname MODIFY LOB (lobname) ( PCTVERSION n );

PCTVERSION affects the reclamation of old copies of LOB data. This affects the ability to perform consistent read. If a session is attempting to use an OLD version of a LOB and that version gets overwritten (because PCTVERSION is too small) then the user will typically see the errors: ORA-01555: snapshot too old: rollback segment number with name "" too small


PCTVERSION can prevent OLD pages being used and force the segment to extend instead. Do not expect PCTVERSION to be an exact percentage of space as there is an internal fudge factor applied.

Oracle9i也引入了新的引數RETENTION來保證LOB一致讀,它的大小和意義由資料庫初始化引數UNDO_RETENTION設定,因此在使用回滾段自動管理的時候,可以使用該引數。UNDO資訊仍然保留在LOB段中,只不過增加了類似回滾段自動管理的Chunk 時間戳,設定什麼時候可以覆蓋該Chunk.

設定合適的PCTVERSION和RETENTION來保證LOB段頻繁更新時候,讀取LOB不會出現ORA-1555 Snapshot too old 錯誤。

Cache/NoCache

該選項只針對Out Row Lob 段有影響。IN ROW LOB會隨著Data Block而被快取到DB Buffer Cache中。lob index segment同in row lob一樣,會被快取到buffer cache中。

In-line LOBS are not affected by the CACHE option as they reside in the actual table block (which is typically accessed via the buffer cache any way).

Cache的選項有CACHE / CACHE READS / NOCACHE

ALTER TABLE tabname MODIFY LOB (lobname) ( CACHE );

ALTER TABLE tabname MODIFY LOB (lobname) ( CACHE READS);

ALTER TABLE tabname MODIFY LOB (lobname) ( NOCACHE );

預設為NOCACHE ,表示對讀寫LOB SEGMENT的時候採用 direct reads 和writes. LOB BLOCK不會被快取到buffer cache 中並且每次讀寫都會發生物理I/O,程式就會等待"direct path read" "direct path write" 事件,並且每次I/O都會讀寫很多BLOCK.( blocks can be read/written at a time (provided the caller is using a large enough buffer size))

設定為CACHE,表示會將LOB SEGMENT 快取在buffer cache中。從磁碟上讀取將會等待db file sequential read" 事件,並且LOB BLOCK會放在most-recently-used end of the LRU chain. 這和表的CACHE選項不同。

The CACHE options for LOB columns is different to the CACHE option for tables as CACHE_SIZE_THRESHOLD does not limit the size of LOB read into the buffer cache. This means that extreme caution is required otherwise the read of a long LOB can effectively flush the cache.

Cache選項也會影響OUT ROW LOB段產生的REDO量。NOCACHE blocks 直接從磁碟上讀寫,因此全部的BLOCK都會作為redo寫到redo log buffer. 如果設定了CACHE,則只有被修改的部分才會被寫到redo log buffer 中。

Eg: In the extreme case 'DISABLE STORAGE IN ROW NOCACHE CHUNK 32K' would write redo for the whole 32K even if the LOB was only 5 characters long. CACHE would write a redo record describing the 5 byte change (taking about 100-200 bytes).

通常如果OUT ROW LOB不是很大,或者經常訪問,可以快取到Buffer Cache中,但由於其快取機制可能會影響到其他BLOCK的正常訪問,因此可以結合BUFFER POOL,使用Keep Pool或者Recycle Pool快取OUT ROW LOB,避免其他經常被訪問的資料塊因為Buffer Cache快取了很多大的OUT ROW LOB而被重新整理到磁碟(Buffer Free Wait).

且OUT ROW LOB被快取後,將會有DBWR在後臺執行寫操作,避免了前臺程式寫LOB導致的磁碟I/O等待。

LOGGNG/NOLOGGING

該引數只有在NOCACHE環境下且對OUT ROW LOB才有效(updates to in-line LOBS are still logged as normal).

"STORE AS ( NOCACHE LOGGING )" or

"STORE AS ( NOCACHE NOLOGGING )"

This option can be changed after creation but the LOGGING / NOLOGGING attribute must be prefixed by the NOCACHE option. The CACHE option implicitly enables LOGGING. The default for this option is LOGGING.

如果OUT ROW LOB設定為NOCACHE NOLOGGING ,則對LOB DATA SEGMENT做dml操作將不產生redo logs.

當對LOB進行批次操作的時候,如用SQL*LOADER批次裝載LOB則可以使用NOLOGGING加快操作。

NOLOGGING of the LOB segment means that if you have to recover the database then sections of the LOB segment will be marked as corrupt during recovery.

If your application makes frequent small changes to NOLOGGING LOBs, then it may well be that the controlfile transactions required to update the unrecoverable SCN are actually taking a lot longer than it would take to log the redo for the LOB changes if the LOBs were changed to LOGGING. However, there is a better alternative -- namely, setting event 10359.

Event 10359 disables all updates of unrecoverable SCNs. By setting this event you can retain the performance benefit of not logging LOB changes without sustaining the performance penalty of repeated foreground controlfile transactions. The only disadvantage is that RMAN will no longer be able to report which datafiles have recently been affected by NOLOGGING operations, and so you will have to adopt a backup strategy that compensates for that.

其他Storage選項

LOB ("TXT") STORE AS (

TABLESPACE "TOOLS" ENABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 10 NOCACHE

STORAGE(INITIAL 131072 NEXT 131072 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT))

語法中也包括了STORAGE,表示這些儲存引數都會對LOB Data Segment和LOB Index Segment產生影響,其作用和其他型別段一樣。

如BUFFER_POOL,可以結合BUFFER_POOL和CACHE選項來合適的快取LOB資料。

如FREELISTS,FREELIST GROUPS 如果並行使用者更新頻繁,則可以設定多個FREELISTS減少競爭。

注意LOB Data Segment和LOB Index Segment的儲存引數是一樣的。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/35489/viewspace-84719/,如需轉載,請註明出處,否則將追究法律責任。

相關文章