【效能】Oracle表並行對邏輯遷移後查詢效能的影響

xysoul_雲龍發表於2020-04-02

原文網址 : http://blog.itpub.net/29487349/viewspace-2684157/

環境介紹：

為保持一致，客戶環境為Oracle10.2.0.5，系統環境一個為aix，一個為Linux，通過某同步軟體邏輯實時同步。

當執行以下語句時，兩個環境一個走並行，一個走索引

select count(*),sum(id) from mytest.tp1 t1 where 
 t1.d_date>=to_date('2019/03/1','yyyy/mm/dd') and 
 t1.d_date<to_date('2019/03/20','yyyy/mm/dd');

具體顯示如下：

--t1表，模擬新環境，目標端
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 580738228
----------------------------------------------------------------------------------------------------------------
| Id  | Operation	       | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |	  |	1 |    13 |  3830   (1)| 00:00:46 |	   |	  |	       |
|   1 |  SORT AGGREGATE        |	  |	1 |    13 |	       |	  |	   |	  |	       |
|   2 |   PX COORDINATOR       |	  |	  |	  |	       |	  |	   |	  |	       |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |	1 |    13 |	       |	  |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |	  |	1 |    13 |	       |	  |  Q1,00 | PCWP |	       |
|   5 |      PX BLOCK ITERATOR |	  |  6094 | 79222 |  3830   (1)| 00:00:46 |  Q1,00 | PCWC |	       |
|*  6 |       TABLE ACCESS FULL| TP1	  |  6094 | 79222 |  3830   (1)| 00:00:46 |  Q1,00 | PCWP |	       |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - filter("T1"."D_DATE"<TO_DATE(' 2019-03-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
	      "T1"."D_DATE">=TO_DATE(' 2019-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
19 rows selected.
--t2表,老環境
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2098670336
---------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		    |	  1 |	 13 |  3076   (1)| 00:00:37 |
|   1 |  SORT AGGREGATE 	     |		    |	  1 |	 13 |		 |	    |
|   2 |   TABLE ACCESS BY INDEX ROWID| TP2	    |  6094 | 79222 |  3076   (1)| 00:00:37 |
|*  3 |    INDEX RANGE SCAN	     | TP2_IND_DATE |  6094 |	    |	 18   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."D_DATE">=TO_DATE(' 2019-03-01 00:00:00', 'syyyy-mm-dd
	      hh24:mi:ss') AND "T1"."D_DATE"<TO_DATE(' 2019-03-20 00:00:00', 'syyyy-mm-dd
	      hh24:mi:ss'))
17 rows selected.
--表資料，均為以下
SQL> select count(*),sum(id) from tp2 t1 where  
t1.d_date>=to_date('2019/03/1','yyyy/mm/dd') and 
t1.d_date<to_date('2019/03/20','yyyy/mm/dd');
  COUNT(*)    SUM(ID)
---------- ----------
      6278  313589166

檢查表及索引相關資訊：

--表相關資訊
OWNER	 TABLE_NAME	 OBJECT_TYPE	  size_g  STATUS   DEGREE	   LAST_ANAL CREATED
------------------- ---------------- ------------------------   ------------- --------
MYTEST	 TP1		 TABLE			 .390625 VALID	    4	       02-APR-20 02-APR-20
MYTEST	 TP2		 TABLE			 .390625 VALID	    4	       02-APR-20 02-APR-20
--索引
OWNER			       INDEX_NAME		      S.BYTES/1024/1024/1024 LAST_ANAL CLUSTERING_FACTOR CREATED
------------------------------ ------------------------------ ---------------------- --------- ----------------- ---------
MYTEST			       TP2_IND_DATE				  .002929688 02-APR-20		   50153 02-APR-20
MYTEST			       TP1_IND_DATE				  .002929688 02-APR-20		   99822 02-APR-20

通過以上，發現，主要差異在索引叢集引子上，那麼什麼會引起叢集引子高呢，索引所對應的列比較無序或散，如果不回表，只是查詢索引列沒有影響，如果需要回表，索引所對應列會反覆讀取塊中資料，大致類似如下：

如上圖所示，如果我要查詢2-5行資料，索引範圍掃描，按照順序掃描(索引三個塊，表三個塊)，2找表塊2,3找表塊1,4找表塊3,5又找表塊1. 表塊1需要被反覆讀取，造成資源尤其io資源浪費。

如果良好的表和索引結構如下

我們大概知道原因了，那麼怎麼形成的呢？ 並行查詢。正常時間列，一般按照時間順序寫入，或更新(更新可能會有出入)，寫入是順序寫入，那麼在該列上建立索引，效果相對可以。如果並行呢，就是本來一個塊上的資料，根據並行度，到目標端可能是多個塊，例如並行度為4，原來這個塊上就有4行有序資料，到了目標端變成了4個塊，以此類推。

我們看一下兩張表部分資料塊分佈情況對比，對比如下：

DATE_PER_D t1blk_count t2blk_count  t1-t2_blk
---------- ----------- ----------- ----------
2019/03/11	   314	       157	  157
2019/03/12	   305	       153	  152
2019/03/13	   339	       170	  169
2019/03/14	   358	       179	  179
2019/03/15	   322	       162	  160
2019/03/16	   345	       174	  171
2019/03/17	   337	       169	  168
2019/03/18	   341	       172	  169
2019/03/19	   343	       173	  170

通過上邊可以發現，同樣的資料，例如3月11日這一天的資料，tp1列d_date分佈在214個塊，tp2分佈在157個塊。資料庫會自動選擇相對效果更好的執行計劃，因此會出現兩個相同表，索引計劃不同的情況。

相關測試語句及環境如下：

create table tp1 (id int,vid varchar2(4000),d_date date);
create table tp2 (id int,vid varchar2(4000),d_date date);
--隨時生成日期，等相關資料
declare
  i int := 0;
begin
for i in 1 .. 100000 loop
insert into tp1 values(i,i||'_'||rpad('*','2990','*'),(select to_date('2019/'||trunc(DBMS_RANDOM.VALUE(1, 12))||'/'||trunc(DBMS_RANDOM.VALUE(1, 28)), 'yyyy/mm/dd') from dual));
end loop;
commit;
end;
--插入相關資料，並開啟並行，tp2 模擬順序寫入
insert into tp2 select * from tp1 order by d_date;
alter table tp1 parallel 4;
alter table tp2 parallel 4;
--建立索引，收集統計資訊
create index tp1_ind_date on tp1(d_date);
create index tp2_ind_date on tp2(d_date);
exec dbms_stats.gather_table_stats('MYTEST','TP1',degree => 4,cascade => true);  
exec dbms_stats.gather_table_stats('MYTEST','TP2',degree => 4,cascade => true);
--檢視塊分佈，根據rowid，rowid18位，物件號6+檔案號3+塊號6+行號3
select to_char(t1.d_date,'yyyy/mm/dd') date_per_D,count(distinct lpad(t1.rowid,15)) as "t1blk_count",count(distinct lpad(t2.rowid,15)) as "t2blk_count" ,
count(distinct lpad(t1.rowid,15))-count(distinct lpad(t2.rowid,15)) as "t1-t2_blk" from tp1 t1,tp2 t2 
where to_char(t1.d_date,'yyyy/mm/dd')=to_char(t2.d_date,'yyyy/mm/dd') and t1.d_date>=to_date('2019/03/11','yyyy/mm/dd') 
and t1.d_date<to_date('2019/03/20','yyyy/mm/dd') group by to_char(t1.d_date,'yyyy/mm/dd') order by 1;
--檢視錶大小
set lines 200
set pages 999
select t.owner,t.table_name,o.object_type,s.bytes/1024/1024/1024,t.status,t.DEGREE,t.LAST_ANALYZED,o.created 
from dba_tables t ,dba_objects o ,dba_segments s where (t.table_name=o.object_name and o.object_name=s.segment_name) 
and t.owner='MYTEST'  and t.table_name like 'TP%';
--檢視索引
select d.owner,d.index_name,s.bytes/1024/1024/1024,d.LAST_ANALYZED,d.CLUSTERING_FACTOR,o.created from dba_indexes d,dba_segments s ,dba_objects o 
where d.index_name=s.segment_name and d.index_name = o.object_name and d.owner='MYTEST'  and d.table_name like 'TP%' order by s.bytes desc;

上述只是模擬一個列的情況，一般而言，列多，索引多的情況下，至少有個索引的叢集引子是比較好的。生產環境中，表大小為250g，無lob，不是分割槽。如果表調整為分割槽，效果會好很多。

附：Oracle成本選擇示例：

--tp1 選擇並行
 Access Path: TableScan
    Cost:  13786.68  Resp: 3829.63  Degree: 0
      Cost_io: 13786.00  Cost_cpu: 25181818
      Resp_io: 3829.44  Resp_cpu: 6994949
  Access Path: index (RangeScan)
    Index: TP1_IND_DATE
    resc_io: 6102.00  resc_cpu: 45832077
    ix_sel: 0.060943  ix_sel_with_filters: 0.060943
    Cost: 6103.24  Resp: 6103.24  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 3829.63  Degree: 4  Resp: 3829.63  Card: 6094.28  Bytes: 0
--tp2  選擇索引
Access Path: TableScan
    Cost:  13786.68  Resp: 3829.63  Degree: 0
      Cost_io: 13786.00  Cost_cpu: 25181818
      Resp_io: 3829.44  Resp_cpu: 6994949
  Access Path: index (RangeScan)
    Index: TP2_IND_DATE
    resc_io: 3075.00  resc_cpu: 24275478
    ix_sel: 0.060943  ix_sel_with_filters: 0.060943
    Cost: 3075.66  Resp: 3075.66  Degree: 1
  Best:: AccessPath: IndexRange
  Index: TP2_IND_DATE
         Cost: 3075.66  Degree: 1  Resp: 3075.66  Card: 6094.28  Bytes: 0

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/29487349/viewspace-2684157/，如需轉載，請註明出處，否則將追究法律責任。

影響Oracle標量子查詢效能的三個因素
2020-12-24
Oracle
oracle表查詢的並行度
2023-12-29
Oracle並行
磁碟排序對Oracle資料庫效能的影響PT
2022-03-22
排序Oracle資料庫
怎麼減少行鎖對效能的影響？
2022-02-27
session效能的影響，後臺 flush dirty
2019-05-29
Session
DB2 HADR對效能的影響
2018-09-13
DB2
核間遷移的影響
2024-11-10
對Oracle分割槽表進行表空間遷移並處理ORA-14511問題
2018-11-21
Oracle
Oracle優化案例-select中to_clob對效能的影響（二十一）
2019-08-19
Oracle優化
MySQL：查詢欄位數量多少對查詢效率的影響
2019-12-13
MySql
分支對程式碼效能的影響和優化
2022-03-26
優化
資料庫邏輯遷移方案
2021-05-06
資料庫
容器化對資料庫的效能有影響嗎？
2024-02-02
資料庫
伺服器IO瓶頸對MySQL效能的影響
2021-09-09
伺服器MySql
用hash cluster表提高查詢效能 (一)
2019-05-05
並行取數提升報表效能
2019-12-19
並行
解決Oracle資料庫遷移後的伺服器負載效能問題JS
2022-03-21
Oracle資料庫伺服器負載JS
Java中的Exception拋異常對效能的影響 - Baeldung
2020-07-30
JavaException
Centos8中遷移邏輯卷
2022-09-07
CentOS
.net core 拋異常對效能影響的求證之路
2022-06-11
影響mysql效能的因素都有哪些
2019-08-11
MySql
影響HTTP效能的常見因素
2019-04-27
HTTP
Java UUID生成的效能影響 – fastthread
2022-05-18
JavaUIASTthread
影響MySQL效能的硬體因
2022-03-09
MySql
影響MySQL效能的硬體因素
2022-03-08
MySql
高效能圖表LightningChart JS 遷移指南介紹
2021-03-02
GCJS
提升50%！Presto如何提升Hudi表查詢效能？
2021-05-16
REST
大型網站的HTTPS實踐（三）——HTTPS對效能的影響
2018-11-09
網站HTTP
Postgresql MVCC架構對從庫長查詢的影響
2019-02-05
SQLMVC架構
dex最佳化對Arouter查詢路徑的影響
2021-06-01
sqlcipher加密sqlite後iOS環境下FMDB效能影響
2018-04-27
加密SQLiteiOS
Sort_Buffer_Size 設定對伺服器效能的影響
2018-09-29
伺服器
關於資料庫開啟大頁對效能的影響
2024-05-27
資料庫
ClickHouse的查詢效能優勢
2023-02-22
OpenCV翻譯專案總結三——對OpenCV效能、查詢表等分析
2019-07-20
OpenCV
oracle 並行查詢時並行資源分配追蹤測試
2020-06-22
Oracle並行
找到Oracle資料庫中效能最差的查詢語句BS
2022-03-21
Oracle資料庫
Oracle 9i升級19C 邏輯遷移詳細方法（一）
2023-10-07
Oracle

【效能】Oracle表並行對邏輯遷移後查詢效能的影響

相關文章