[20140209]行遷移和expdp匯出.txt

lfree發表於2014-02-10

[20140209]行遷移和expdp匯出.txt

前一陣子與別人聊天,談到一個系統升級expdp匯出很慢,我比較熟悉這個系統,當他說出匯出很慢的那張表的時候,
我隨口講不會這個表存在大量的行遷移吧。我記得以前學習oracle,聽別人講課,講過一句話,如果你看這個系統
的使用者模式下所有表的pctfree設定都是10的話,那麼這個系統沒有dba管理。按照這樣看,中國大部分資料庫系統
沒有dba管理。

我自己以前對行遷移還是比較重視的,當然現在變懶了。我發現許多dba也不是太重視這個問題,慢慢我對這些調整
也不重視。而且那上面提到的那張表,裡面存在一個修改過程,把一個標誌從'0'=>'1',同時在另外的備註欄位儲存
修改的日期,儲存的日期實際上以字元的形式來儲存的,這樣內容'2012-02-02 10:10:10'佔用的空間19個字元,這
樣修改記錄長度每條都要增加19個位元組。如果原來記錄長度很短的情況下,儲存在一個資料塊的記錄數會很多,發
生行遷移的行會非常多。我自己以前對這個表也很重視,總想徹底消除行遷移情況,但是我最終放棄了這個想法,
我發現要徹底消除pctfree要接近40才基本消除行遷移,但是這樣帶來另外一個副作用就是導致空間的浪費。因為
記錄並不總等你達到pctfree的設定才進行修改,因為插入與修改是互動進行的。另外的方法就是使用ALTER TABLE
MINIMIZE RECORDS_PER_BLOCK.參考如下連結:

http://blog.itpub.net/267265/viewspace-763315/

下面做一個測試看看出現大量行遷移時expdp的情況。

1.建立測試例子:
SCOTT@test> @ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

create table t (id number, name varchar2(20)) pctfree 0;
insert into t
with a as (select /*+ Materialize */ rownum id from dual connect by level<=1000)
select rownum,null from a x,a where rownum<=1e6;

commit ;
update t set name=lpad(id,20,'y');
commit;

--建立CHAINED_ROWS表。
--SQL> analyze table t list chained rows into chained_rows;
--注: 執行前要建立chained_rows表,最好不要放在system表空間.$ORACLE_HOME/rdbms/admin/utlchain.sql.
--truncate table chained_rows;

SCOTT@test> select count(*) from chained_rows;
  COUNT(*)
----------
   1000000
--存在大量的行遷移。

2.使用expdp匯出測試:

--CREATE OR REPLACE DIRECTORY DATA_PUMP_DIR AS '/u01/app/oracle11g/admin/test/dpdump/';
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO EXP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO IMP_FULL_DATABASE;
--GRANT READ, WRITE ON DIRECTORY DATA_PUMP_DIR TO SCOTT;

--exec dbms_workload_repository.create_snapshot();
expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
--exec dbms_workload_repository.create_snapshot();

--如果檢視select * from v$session_wait where wait_class<>'Idle',可以發現大量的是db file sequential read等待事件。

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:38:30 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T"                                28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/t.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:38:55
--測試多次,大約在25秒完成。

--如果我建立一張新表tx,匯出看看。
SCOTT@test> create table tx as select * from t;
Table created.

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx  dumpfile=tx.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 10:50:51 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=tx dumpfile=tx.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 35 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."TX"                                28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/tx.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 10:51:00

--測試多次,大約在10秒完成.

3.使用strace跟蹤看看。
看到db file sequential read等待事件,如果執行多次,資料塊應該都集中在記憶體,應該看到的db file sequential read不多,那是什麼情況導
致緩慢呢?

$ strace -c -o /tmp/aa1  expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t dumpfile=t.dmp

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
36.82    0.006311          37       169         2 read
12.45    0.002134          14       158        71 open
10.89    0.001866          30        62           write
  8.61    0.001476          16        95           close
  4.56    0.000782           9        89           mmap
  3.99    0.000683         683         1           clone
  3.61    0.000618          12        53           munmap
  2.67    0.000457           7        64           fstat
  2.38    0.000408          11        38           brk
  1.70    0.000291           7        42           rt_sigaction
  1.64    0.000281           7        38           fcntl
  1.47    0.000252           7        36           lseek
  1.35    0.000232           9        27        18 stat
  1.02    0.000174           9        19           mprotect
  0.88    0.000151         151         1           execve
  0.65    0.000112          16         7           socket
  0.65    0.000111           8        14           gettimeofday
  0.58    0.000100           8        13           rt_sigprocmask
  0.54    0.000093          12         8         6 access
  0.46    0.000079          16         5         4 connect
  0.40    0.000069           7        10           uname
  0.40    0.000068          34         2           getdents64
  0.34    0.000058           7         8           getrlimit
  0.29    0.000050           7         7           getuid
  0.20    0.000035          12         3           recvmsg
  0.18    0.000030          30         1           readlink
  0.16    0.000027           9         3           getcwd
  0.15    0.000025           8         3           futex
  0.13    0.000023          12         2           pipe
  0.13    0.000022          22         1           sendto
  0.12    0.000021          11         2           bind
  0.08    0.000014           7         2           getsockname
  0.08    0.000014           7         2           setrlimit
  0.08    0.000013           7         2           times
  0.08    0.000013          13         1           _sysctl
  0.06    0.000010          10         1           ioctl
  0.04    0.000007           7         1           getppid
  0.04    0.000007           7         1           arch_prctl
  0.04    0.000007           7         1           gettid
  0.04    0.000007           7         1           time
  0.04    0.000007           7         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    0.017138                   994       101 total

$ strace -c -o /tmp/aa1 expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=tx  dumpfile=tx.dmp
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
33.23    0.004392          26       169         2 read
15.27    0.002018          33        62           write
13.14    0.001737          11       158        71 open
  5.23    0.000691         691         1           clone
  4.84    0.000640           7        89           mmap
  4.72    0.000624           7        95           close
  4.35    0.000575          11        53           munmap
  2.50    0.000330           5        64           fstat
  2.38    0.000315           8        38           brk
  1.65    0.000218           5        42           rt_sigaction
  1.58    0.000209           6        38           fcntl
  1.40    0.000185           5        36           lseek
  1.31    0.000173           6        27        18 stat
  1.18    0.000156         156         1           execve
  1.01    0.000133           7        19           mprotect
  0.63    0.000083          12         7           socket
  0.62    0.000082          16         5         4 connect
  0.61    0.000081          10         8         6 access
  0.57    0.000076           5        14           gettimeofday
  0.54    0.000071           5        13           rt_sigprocmask
  0.49    0.000065          33         2           getdents64
  0.34    0.000045           5        10           uname
  0.30    0.000039           5         8           getrlimit
  0.26    0.000034           5         7           getuid
  0.25    0.000033          33         1           readlink
  0.22    0.000029          10         3           recvmsg
  0.20    0.000026          26         1           sendto
  0.17    0.000023           8         3           getcwd
  0.17    0.000022          11         2           pipe
  0.15    0.000020           7         3           futex
  0.14    0.000018           9         2           bind
  0.08    0.000011           6         2           getsockname
  0.08    0.000011          11         1           _sysctl
  0.08    0.000011           6         2           setrlimit
  0.08    0.000010           5         2           times
  0.06    0.000008           8         1           ioctl
  0.04    0.000005           5         1           arch_prctl
  0.04    0.000005           5         1           gettid
  0.04    0.000005           5         1           time
  0.04    0.000005           5         1           set_tid_address
  0.03    0.000004           4         1           getppid
------ ----------- ----------- --------- --------- ----------------
100.00    0.013218                   994       101 total

--很難看出差異在那裡?

4.常見解決方法:
透過上面對比,如果表很大,行遷移很多的情況下,時間差異還是很明顯的。
最根本的解決是檢查應用,設定大的pctfree來消除行遷移。
move 表空間以及線上重定義表或者ctas都可以解決已經出現的情況。但是非常不合理,特別是表很大的情況下。
一些索引要重建。

如何在這種情況下加快expdp操作呢?我google許多連結查詢: Row Migration expdp


--裡面都提到access methods,摘要如下:
With datapump there are 2 different access methods: EXTERNAL_TABLE and DIRECT_PATH. Usually, the datapump utility
decides on it's own which method to use. It turned out that with EXTERNAL_TABLE, the table export takes only 10 minutes
and does not perform these single-block I/O. It only appears with DIRECT_PATH.

--當我使用expdp help=y時,我發現access_method引數並不存在,難道又是某個隱含秘密?我執行如下:

$ expdp scott/xxxxxx directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp
Export: Release 11.2.0.3.0 - Production on Mon Feb 10 11:25:46 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SCOTT"."SYS_EXPORT_TABLE_01":  scott/******** directory=DATA_PUMP_DIR tables=t access_method=external_table dumpfile=t_good.dmp */
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 72 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T"                                 28.59 MB 1000000 rows
Master table "SCOTT"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
  /u01/app/oracle11g/admin/test/dpdump/t_good.dmp
Job "SCOTT"."SYS_EXPORT_TABLE_01" successfully completed at 11:25:56
--測試多次,大約在10秒完成。

5.補充:
--連結http://blog.csdn.net/tianlesoftware/article/details/6090757

Important Need-To-Know's when the parameter ACCESS_METHOD is specified for a job:

    The parameter ACCESS_METHOD is an undocumented parameter and should only be used when requested by Oracle Support.
    If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.
    If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:
    ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."EMP" using client specified AUTOMATIC method
    The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).
    If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.
    Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:

...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."MY_TAB" using client specified DIRECT_PATH method ...

總結:
如果行遷移很大,expdp導致加入access_method=external_table可以加快匯出的速度。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-1078757/,如需轉載,請註明出處,否則將追究法律責任。

相關文章