OGG-Oracle 整合模式抽取程式，REGISTER DATABASE都做了什麼？

綠茶有點甜發表於2021-04-14

Oracle模式Database

一、學習目標

有同事問OGG技術問題，OGG軟體，在oracle資料庫中，整合模式抽取程式REGISTER DATABASE，都做了什麼操作？有什麼風險？

並且提到了一個抽取程式註冊，在瞬時間併發佔用了大量的程式process,並且當時某DB process剩餘不足，最終導致DB的process佔滿，影響了DB系統！

那麼帶來一些疑問：

1）整合模式抽取程式REGISTER DATABASE，做了哪些操作？

2） REGISTER DATABASE 會在資料庫中產生大量的併發會話嗎？具體會產生大概多少會話？

3） REGISTER DATABASE 為什麼會在資料庫中佔用大量的併發？

4）根據上述疑問進行測試，根據測試結果進行反思，REGISTER DATABASE 時，是否需要注意哪些內容？

二、測試

2.1 REGISTER DATABASE，做了什麼？

思路是對OGG使用者的會話，開啟10046，並進行彙總；

dblogin xx 的時候對於DB就是一個session的連線!
> register extract ext_tb database CONTAINER(pdbodb)
2021-04-13 15:42:35  INFO    OGG-02003  Extract EXT_TB successfully registered with database at SCN 5881819.

[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep SELECT |wc -l
219
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep INSERT |wc -l
99
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep UPDATE |wc -l
16
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep DELETE |wc -l
0
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep CREATE |wc -l
30
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep ALTER |wc -l
5
[oracle@19test2:/home/oracle]$ cat 888.trc|grep -A3 'SQL ID'|grep DROP |wc -l
0

可以說往SYS使用者下的很多基表insert，update了資料，建立的基本上是C##OGG使用者下的表、索引、檢視，在SYS使用者建立了個檢視，
ALTER的操作就是對C##OGG使用者下的基表新建主鍵索引！

2.2 REGISTER DATABASE 時建立了多少個session?

註冊會產生多少個Session
SYS@odb>select * from GV$RESOURCE_LIMIT where RESOURCE_NAME in('processes','sessions');
RESOURCE_N   CURRENT          MAX  INITIAL_ALLOCATION LIMIT_VALUE   CON_ID
---------- ------------------- --------------- -------------------- -------------------- ----------
processes              79              93        300                  300                    1
sessions              118             130        472                  472                    1
> dblogin userid C##OGG@T2CDB ,password SGcc_osgsac01
> register extract ext_tb database CONTAINER(pdbodb)
RESOURCE_N CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION   LIMIT_VALU     CON_ID
---------- ------------------- --------------- -------------------- ---------- ----------
processes                   63              93        300                  300          1
sessions                    86             130        472                  472          1

GV$RESOURCE_LIMIT自例項啟動以來的可以觀測session,process的資訊，本次測試前重啟庫！ 因此如果有較大的併發是可以較為明顯觀察到的！
那麼可以說本次測試並沒有看到明顯變化！  也就是說這種問題不是百分比能模擬重現！

2.3 REGISTER DATABASE 為什麼會在資料庫中佔用大量的併發？

同事說的案例肯定說明存在大量的併發情況？ 那麼是什麼原因？
繼續觀察上述的10046.trc日誌，觀測parallel相關的等待事件關聯的SQL資訊
SQL ID: brwf9ntmf46t0 Plan Hash: 3344361896

INSERT /*+ APPEND DISABLE_PARALLEL_DML */ INTO 
   SYS.LOGMNRG_CDEF$ (···)  SELECT ··· FROM SYS.CDEF$ AS OF SCN 5881819 CD
------- ------  -------- ---------- ---------- ---------- ----------  ----------
  db file parallel read                          19        0.00          0.00
  direct path write                               2        0.00          0.00
  direct path sync                                1        0.00          0.00

INSERT /*+ APPEND DISABLE_PARALLEL_DML */ INTO 
   SYS.LOGMNRG_TABPART$ (···    FROM SYS.TYPE$ AS OF SCN 5881819 TY
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD AS SELECT  LOGMNRG_TYPE$ (cr=4594 pr=4591 pw=60 time=212985 us starts=1)
      4560       4560       4560   TABLE ACCESS BY INDEX ROWID BATCHED TYPE$ (cr=4594 pr=4590 pw=0 time=824302 us starts=1
 cost=4591 size=387600 card=4560)
      4560       4560       4560    INDEX FULL SCAN I_TYPE2 (cr=32 pr=30 pw=0 time=1115 us starts=1 cost=30 size=0 card=4560)
(object id 742)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                        33        0.00          0.00
  db file parallel read                          35        0.04          0.17
  direct path write                               2        0.00          0.00
  db file scattered read                          1        0.00          0.00
  direct path sync                                1        0.00          0.00
********************************************************************************
也就是說，OGG在註冊的時候，需要內部進行執行大量的insert附加的select基表的操作！ 
並且使用了禁止dml並行，單個程式append的方式進行寫入！ 在索引回表查詢資料的時候是並行查詢，因此如果並行度過高的情況下！ 
並且session不夠用的情況下，會造成OGG使用者註冊DB後，佔用大量的session 甚至佔滿，導致DB業務受到影響！

說到底，原因就是 register database 的時候，Oracle後臺執行了很多的SQL,SQL涉及使用了並行！ 
並行的程式達到一定的程度，並且正好佔用了DB FREE PROCESS,就產生比較大的影響了！


挑選sql並行event 2個基表,10046中沒有發現alter session,table parallel hint 這個涉及Oracle內部執行，無法探索！ 常規執行查詢基表是不產生並行的！

select OWNER,TABLE_NAME,DEGREE from dba_tables where table_name in('LOGMNRG_COLTYPE$' ,'COLTYPE$');
OWNER TABLE_NAME DEGREE
-------------------- ------------------------------ ----------------------------------------
SYS COLTYPE$ 1
SYS LOGMNRG_COLTYPE$ 1
select OWNER,index_name,TABLE_NAME,DEGREE from dba_indexes where table_name in('LOGMNRG_COLTYPE$' ,'COLTYPE$');
OWNER INDEX_NAME TABLE_NAME DEGREE
-------------------- ------------------------------ ------------------------------ ----------------------------------------
SYS I_COLTYPE1 COLTYPE$ 1
SYS I_COLTYPE2 COLTYPE$ 1

2.4 反思

既然是oracle parallel的問題，那麼我們再次重新梳理一下oracle並行的相關引數！

PARALLEL_DEGREE_LIMIT
PARALLEL_DEGREE_LIMIT 限制了優化器使用的並行度，以確保並行伺服器程式不會淹沒系統。
Syntax PARALLEL_DEGREE_LIMIT = { CPU | IO | integer }
Default value CPU
CPU 最大並行度受系統中CPU數量的限制。用於計算限制的公式是PARALLEL_THREADS_PER_CPU* CPU_COUNT*可用的例項數（預設情況下，群集上所有開啟的例項，但可以使用PARALLEL_INSTANCE_GROUP或服務規範加以限制）。這是預設值。
IO 根據IO容量進行限制，並且需要執行DBMS_RESOURCE_MANAGER.CALIBRATE_IO，才能使用IO設定;
integer
A numeric value for this parameter specifies the maximum degree of parallelism the optimizer can choose for a SQL statement when automatic degree of parallelism is active. Automatic degree of parallelism is only enabled if PARALLEL_DEGREE_POLICY is set to ADAPTIVE, AUTO, or LIMITED.

PARALLEL_DEGREE_POLICY
PARALLEL_DEGREE_POLICY specifies whether automatic degree of parallelism, statement queuing, and in-memory parallel execution will be enabled.
Syntax PARALLEL_DEGREE_POLICY = { MANUAL | LIMITED | AUTO | ADAPTIVE }
Default value MANUAL
MANUAL 禁用自動並行，需要手工指定使用並行才可以
LIMITED Enables automatic degree of parallelism for some statements but statement queuing and in-memory Parallel Execution are disabled
啟用並行，並行不允許排隊！以及記憶體中不允許並行
當設定為該屬性時，自動並行特性將關閉，部分sql語句仍然可用使用，如表和索引的degree大於1的情況。
AUTO
啟用自動並行度，語句排隊和記憶體中並行執行。
ADAPTIVE
This value enables automatic degree of parallelism, statement queuing and in-memory parallel execution, similar to the AUTO value. In addition, performance feedback is enabled. Performance feedback helps to improve the degree of parallelism automatically chosen for repeated SQL statements. After the initial execution of a statement, the degree of parallelism chosen by the optimizer is compared to the degree of parallelism computed based on the actual execution performance. If they vary significantly, then the statement is marked for re-parse and the initial execution performance statistics (for example, CPU-time) are provided as feedback for subsequent executions. The optimizer uses the initial execution performance statistics to better determine a degree of parallelism for subsequent executions.

PARALLEL_EXECUTION_MESSAGE_SIZE

指定用於並行執行的訊息的大小（以前稱為並行查詢，PDML，並行恢復，複製）。
Range of values
Minimum: 2148
Maximum: 65536, but some operating systems may have a smaller value

On most platforms, the default value is as follows:
16384 bytes if COMPATIBLE is set to 11.2.0 or higher
2148 bytes if COMPATIBLE is less than 11.2.0
The default value is adequate for most applications. Larger values require a larger shared pool. Larger values result in better performance at the cost of higher memory use. For this reason, replication gets no benefit from increasing the size.

https://www.modb.pro/db/42407，這個引數還有什麼用呢？ DG追歸檔日誌應用慢，可以加大這部分的區域！ 連結中提示要加大shared pool的問題可以通過如下引數處理！

_PX_use_large_pool
FALSE
Use Large Pool as source of PX buffers

如果沒有設定大型池，則在共享池內分配空間！！！    建議設定為True
這句話殺傷性極大！  並行程式佔用較多的記憶體 >，導致佔用shared pool空間，最終導致shared pool其它元件回收記憶體，導致記憶體栓鎖爭用嚴重！

記憶體使用

SQL> select * from v$px_process_sysstat where statistic like 'Buffers%';

SQL> show parameter parallel_execution_message

ALTER SYSTEM SET "_PX_use_large_pool" = TRUE SCOPE = SPFILE;

PARALLEL_FORCE_LOCAL 建議設定為 true
預設FORCE，就是可以使用其他節點的CPU，並不是說在其他節點執行SQL，執行SQL還是本節點（即PX在本節點），但是可以使用其他節點的資源做coordinator process即QC（就是管理下面這些並行度的一個程式，一個個的並行程式叫parallel execution servers即PX），此引數最好設定為TRUE

PARALLEL_MAX_SERVERS
PARALLEL_MAX_SERVERS specifies the maximum number of parallel execution processes and parallel recovery processes for an instance. As demand increases, Oracle Database increases the number of processes from the number created at instance startup up to this value.
Default value PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5
Range of values 0 to 32767

PARALLEL_MIN_SERVERS is the number of parallel execution processes Oracle creates when the instance is started. These processes will be kept alive to service parallel statements.

Range of values Default value to the value of PARALLEL_MAX_SERVERS

PARALLEL_MIN_SERVERS指定例項的最小並行執行程式數。該值是在例項啟動時由Oracle建立的並行執行程式的數量

PARALLEL_MIN_DEGREE
控制通過自動並行度計算的最小並行度。預設1

PARALLEL_MIN_PERCENT
Range of values 0 to 100
使您可以指定並行執行所需的並行執行過程請求數量的最小百分比。
並行語句佇列未啟用時(PARALLEL_DEGREE_POLICY設定為manual or limited)，此引數控制並行操作的行為。
比如80個CPU、現在已經使用了60個，剩下20個（PARALLEL_SERVERS_TARGET），現在來了一個60個並行度的查詢，怎麼辦？
如果PARALLEL_MIN_PERCENT=10，就是10%可以用，就是最少使用60*10%=6個並行度，6<20,可以使用
如果PARALLEL_MIN_PERCENT=50，就是使用60*50%=30，30>20，如果PARALLEL_DEGREE_POLICY=MANUAL，則不夠，直接報錯，如果PARALLEL_DEGREE_POLICY=AUTO，則進入佇列。

parallel_min_time_threshold
sql語句執行的最小時間（在使用了該特性時），換句話說，也就是隻有當parallel_degree_policy引數設定為auto或limited時，該引數預設值為auto，即是預設為10s。
當自動並行引數為非Manual的情況下,部分的SQL語句可能執行10s後，可能自動使用並行！
By default, this parameter is set to 10 seconds. Automatic degree of parallelism is only enabled if PARALLEL_DEGREE_POLICY is set to ADAPTIVE, AUTO, or LIMITED.
If all tables referenced by a SQL statement use In-Memory Column Store (IM column store), then PARALLEL_MIN_TIME_THRESHOLD defaults to 1.

PARALLEL_SERVERS_TARGET specifies the number of parallel server processes allowed to run parallel statements before statement queuing will be used. When the parameter PARALLEL_DEGREE_POLICY is set to AUTO, Oracle will queue SQL statements that require parallel execution, if the necessary parallel server processes are not available.
PARALLEL_SERVERS_TARGE指定在使用語句排隊之前允許執行並行語句的並行伺服器程式的數量。當引數PARALLEL_DEGREE_POLICY設定為AUTO時，如果必需的並行伺服器程式不可用，Oracle將對需要並行執行的SQL語句進行排隊

The degree of parallelism used for a SQL statement can be specified at three different levels:

■Statement level - Using hints such as PARALLEL or the PARALLEL clause
■Object level - Found in the definition of the table, index, or other object. See: PARALLEL clause
■Instance level - Using default values for the instance. See: parallel_threads_per_cpu x cpu_count
The Cost-Based Optimizer determines whether to parallelize a statement and the degree of paralleism applied for a SQL statement by checking each item in this list in the order shown. Oracle first checks for a degree of parallelism specification at the statement level. If none is found, the table or index definitions are checked. If the table or index definition does not explicitly specify values for DEGREE and INSTANCES, then the default values established for the instance are used.

對於default 的情況，參考如下說明！

ALTER TABLE sales PARALLEL;
Default parallelism uses a formula to determine the DOP based on the system configuration, as in the following:
For a single instance, DOP = PARALLEL_THREADS_PER_CPU x CPU_COUNT
For an Oracle RAC configuration, DOP = PARALLEL_THREADS_PER_CPU x sum(CPU_COUNT)
By default, sum(CPU_COUNT) is the total number of CPUs in the cluster. However, if you have used Oracle RAC services to limit the number of nodes across which a parallel operation can execute, then sum(CPU_COUNT) is the total number of CPUs across the nodes belonging to that service. For example, on a 4-node Oracle RAC cluster, with each node having 8 CPU cores and no Oracle RAC services, the default DOP would be 2 x (8+8+8+8) = 64.
You can also request the default DOP by using statement level or object level parallel hints.
The default DOP specified in the PARALLEL clause of a table or an index takes effect only when PARALLEL_DEGREE_POLICY is set to MANUAL.
The default DOP algorithm is designed to use maximum resources and assumes that the operation finishes faster if it can use more resources. Default DOP targets the single-user workload and it is not recommended in a multiuser environment.
The actual runtime DOP of a SQL statement can be limited by Oracle Database Resource Manager.
預設情況下，sum(CPU_COUNT)是群集中的CPU總數。但是，如果您已使用Oracle RAC服務來限制可在其中執行並行操作的節點數，則sum(CPU_COUNT)該數是屬於該服務的節點之間的CPU總數。例如，在一個4節點的Oracle RAC叢集上，每個節點具有8個CPU核心，並且沒有Oracle RAC服務，預設DOP為2 x（8 + 8 + 8 + 8）= 64。

上面是學習並行的相關引數的解釋說明，來點乾貨！

1.並行可能引發的記憶體爭用問題

未設定並行引數，導致佔用shared pool,引發記憶體嚴重的栓鎖爭用，處理方法：

_PX_use_large_pool = true ，配置使用大池記憶體元件，並行申請記憶體從large pool申請

2.SQL語句或者物件執行，涉及表、索引等物件使用太多的並行！並行數量過高！

建議

PARALLEL_DEGREE_POLICY 設定為手工管理模式，NOT AUTO

PARALLEL_MAX_SERVERS 2倍CPU ，臨時管制 -- 不建議隨意調整，可以考慮高危操作低峰期執行，但是業務高峰期必須執行可以考慮設定限制process數量

Default value PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5
Range of values 0 to 32767

PARALLEL_FORCE_LOCAL true 禁止節點並行！

對於TABLE|INDEX的物件，將default設定為1 or 指定的並行度，不要使用oracle 預設的並行度計算方法！可能並行度過高！

3.操作前檢查一下可用的process,session的數量！

程式數量限制如果僅剩餘少量session還是調引數！有個千兒八百的 OGG 註冊沒啥問題！這個要看Oracle自動對其使用的並行度！

processes ||sessions， v$process,v$session

--參考連結

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/vldbg/degree-parallel.html#GUID-7D86C1A1-58B2-4D66-B7EF-A5B82BADF40F
https://blog.csdn.net/iteye_4537/article/details/82132628

其它！捨棄的引數
有的時候確實佩服oracle,推出的功能你可能還沒用上，以及被它自行捨棄了，這說明在功能上有推出其它很多很多的可能性，最終有市場和使用者選擇合適的進行保留。
PARALLEL_ADAPTIVE_MULTI_USER
Note: The PARALLEL_ADAPTIVE_MULTI_USER initialization parameter is deprecated in Oracle Database 12c Release 2 (12.2.0.1) 
and may be removed in a future release. Oracle recommends that you use the parallel statement queuing feature instead.
該演算法根據查詢啟動時的系統負載自動降低請求的並行度。有效的並行度是基於預設的並行度，即表或提示中的度，再除以縮減因子。

 
關於並行的其它方面，網上有很多資料可以檢索，書本基於Oracle的SQL優化，
墨天倫，雲和恩墨-專家-楊廷琨，可以看看楊長老分享的

《並行不悖——Oracle資料庫並行的是是非非》

Mybatis與Spring整合時都做了什麼？
2019-08-02
MyBatisSpring
OGG整合抽取模式丟失歸檔處理
2022-05-07
模式
commit做了什麼？
2010-04-13
MIT
程式設計師什麼都會
2013-08-21
程式設計師
vue.use()做了什麼
2019-03-05
Vue
程式設計師都幹些什麼？
2020-11-14
程式設計師
JS中 new究竟做了什麼？
2019-07-24
JS
Oracle的commit做了什麼？
2017-07-11
OracleMIT
New運算子到底做了什麼
2017-10-22
程式設計師為什麼都穿得那麼醜
2017-11-27
程式設計師
關於整合抽取程式重啟後的現象分析
2017-12-18
Activity中setContentView做了什麼
2019-04-08
View
new操作符都做了什麼
2019-02-16
Redis SWAPDB 命令背後做了什麼
2021-05-27
Redis
javascript new關鍵字做了什麼
2017-05-27
JavaScript
重灌Ubuntu後我都做了什麼
2015-07-08
Ubuntu
你的JavaScript程式碼都經歷了什麼
2018-08-20
JavaScript
幽默：當程式設計師不想解釋他們做了什麼
2021-04-09
程式設計師
關於 AMP，Webnovel 都做了些什麼？
2019-05-14
Web
【vue原始碼解析】render到底做了什麼？
2019-06-19
Vue原始碼
原型、原型鏈、new做了什麼、繼承
2020-09-25
原型繼承
兜底機制——leader到底做了什麼？
2021-10-22
js中的new()到底做了些什麼？
2018-03-26
JS
建立表物件時，oracle做了些什麼？
2015-05-03
物件Oracle
各行業都愛用什麼程式語言開發？
2018-12-21
行業
程式設計師週末都喜歡做什麼？
2014-09-23
程式設計師
為什麼很多程式設計師都選擇跳槽？
2012-09-04
程式設計師
為什麼程式設計師在學習程式設計的時候什麼都記不住？
2019-10-12
程式設計師
Java JVM：垃圾回收（GC 在什麼時候，對什麼東西，做了什麼事情）
2016-11-25
JavaJVMGC
Vue 修改成功之後我做了什麼
2020-12-19
Vue
國民度No.1，Python到底做了什麼？
2021-01-20
Python
什麼是計劃程式代理監督程式模式？
2023-12-07
模式
Java培訓都學什麼
2021-08-30
Java
大資料都學什麼？
2020-09-01
大資料
為什麼很多程式設計師工作時都戴耳機？
2018-07-14
程式設計師
為什麼程式猿都應該學點linux命令？
2021-12-09
Linux
Java程式執行時，資料都儲存到什麼地方？
2016-10-18
Java
做了這麼多年前端，為什麼你還是不會寫業務程式碼？
2020-12-04
前端