Additional considerations when using the BATCHSQL parameter in GoldenGate
What Are The Considerations In Using "Batchsql"? [ID 1060546.1]
Modified: 04-APR-2011   Type: HOWTO   Status: PUBLISHED
Applies to:
Oracle GoldenGate - Version: 4.0.0 and later [Release: 4.0.0 and later]
Information in this document applies to any platform.
Solution
Issue: BATCHSQL Implementation and Limitations
Solution Overview:
BATCHSQL is a Replicat parameter implementing a strategy that greatly increases replication speed and efficiency. It works by "batching" similar statements and applying them to the target database as a group. In optimal configurations (ones where the data consists of a large number of like statements), BATCHSQL can increase replication speed from a few hundred I/Os per second to thousands of I/Os per second.
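As a minimal sketch, a Replicat parameter file with BATCHSQL enabled might look like the following; the group name, login, and MAP statement are placeholders for illustration only and are not taken from this note.
REPLICAT rbatch
USERID ggs_admin, PASSWORD ggs_pwd
BATCHSQL
MAP src_schema.orders, TARGET tgt_schema.orders;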
Some statement types cannot be processed in batches and must be processed as exceptions. When BATCHSQL encounters them, it flushes everything in the batch, applies the exceptions in the normal manner, and then resumes batch processing.
Solution Details:
There are some guidelines to follow when choosing to use BATCHSQL. First, smaller row changes will show a higher gain in performance than larger row changes. BATCHSQL is also indicated for smaller tables when Oracle or ODBC is used, and for both traditional batch and OLTP processing.
BATCHSQL does not work for everything, and is not selectable at a table level.
Finally, BATCHSQL is not a good choice when there is LOB data, or when there may be primary key collisions or missing-row conditions. These conditions are not optimal because they are the operations treated as exceptions, as described earlier.
OTHER RESTRICTIONS
* APPLYNOOPUPDATES will cause empty updates to be applied to the database. This parameter should not be used with BATCHSQL.
* Do not use NUMTHREADS for Replicat on Windows.
RECOVERY MODE AND FALLBACK
When a BATCHSQL commit fails, BATCHSQL is toggled off and the data is reprocessed under GROUPTRANSOPS rules. If the error causing BATCHSQL to fail persists, then the processing of the group of data will also fail, and the data will be reprocessed one transaction at a time. This normal-mode fallback is the most important performance liability for BATCHSQL processing. In instances where exceptions occur, operations are executed three times before success is achieved, if achieved at all, greatly reducing throughput during the time of the exception processing.
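As a rough illustration of how the two modes relate in a parameter file (the GROUPTRANSOPS value shown is only an example and should be tuned for your environment):
BATCHSQL
GROUPTRANSOPS 1000
-- If a batched commit fails, Replicat retries the group under GROUPTRANSOPS rules,
-- and only then falls back to applying one transaction at a time.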
IMPLEMENTATION RELEASE
BATCHSQL was implemented in GoldenGate version 7.x on some platforms and in version 8.0.2.0 on LINUX.
There are a number of parameter options available for BATCHSQL. Consult the current GoldenGate for Windows and UNIX Reference guide for an explanation.
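For example, batch sizing can be tuned with options along the following lines; the option names and values here are illustrative only and should be verified against the reference guide for the release in use:
BATCHSQL BATCHESPERQUEUE 100, OPSPERBATCH 1200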
CURRENT ISSUES (as of v95x releases)
There have been a number of issues that have been identified with BATCHSQL. This document addresses some of those.
1. Loss of Committed Data
Teradata limits the number of SQL statements that can be in use concurrently to 15. This limit is calculated as the sum of available open cursors and cached statements.
Cursors are represented by the GoldenGate parameter MAXSQLSTATEMENTS, which controls the number of prepared SQL statements that can be used by Replicat. The value of this parameter determines the number of open cursors that Replicat maintains. The default value for this parameter is too large to be supported by Teradata. It is generally set to a value of 1. Once this value is set to 1, there remains a maximum value of 14 for cache statements. This value is represented by the CACHESTATEMENTS option of the BATCHSQL parameter. There is a further limitation in GoldenGate version 8.0.4.x in that there is no consolidated caching algorithm. Therefore, the BATCHSQL cache holds up to 13 active statements.
The usual parameter settings are:
BATCHSQL CACHESTATEMENTS 13
MAXSQLSTATEMENTS 1
If the MAXSQLSTATEMENTS value is increased, the CACHESTATEMENTS value must be decremented by the equivalent amount:
BATCHSQL CACHESTATEMENTS 10
MAXSQLSTATEMENTS 4
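Note that in both of these configurations the two values sum to 14 rather than 15; the unused slot reflects the 8.0.4.x caching limitation described above.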
If you exceed these values on a LINUX machine writing by means of ODBC to Teradata, data will show as committed and appear as written in the Replicat report, but the Teradata DBMS will actually roll back data, corrupting the target. There will be no errors or warnings reported indicating this loss of data.
If you exceed these values on a Windows machine writing via ODBC to Teradata, Replicat will abend, resulting in a failed Replicat but no loss of data.
The above is true regardless of code version up to GoldenGate version 9.x.
This behavior results from the DBMS getting an error (3130 - response limit exceeded) that is not returned to the ODBC driver. This has since been corrected by Teradata in ODBC versions 3.5.0.3 and above for Windows and 3.5.0.7 (not yet generally available) for LINUX.
As of GoldenGate version 9.0, the two parameters are synonymous and indicate the number of prepared SQL statements cached by the Replicat process in either batch or non-batch mode. As of the v9 code, if the user is running an ODBC level below that of the Teradata fix, then either of the two parameters can be specified; one of them should be specified explicitly so that the default of 250 is not assumed and used. Still, the total value of one or both of the two parameters may not exceed 15.
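Under the v9 behavior, and assuming an ODBC level below the Teradata fix, an explicit setting such as the following would avoid the 250 default (the value is illustrative; it simply needs to keep the combined total within 15):
MAXSQLSTATEMENTS 15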
2. Insertion of Garbage Data
GoldenGate version 9.x introduced a bug whereby, when BATCHSQL fails and falls back to normal-mode processing, garbage data is inserted into the target table. This is fixed in 9.0.1.0.001. This is known to affect SQL Server and may affect DB2, Teradata, and any other ODBC-based Replicat.
3. Memory Leaks
As of GoldenGate version 9.0 for Teradata and SQL Server, certain columns can be dropped from UPDATE statements for performance reasons or because of database limitations. When BATCHSQL is enabled, this results in a memory leak. This bug was fixed in:
Version 9.0.0.9.001
Version 9.0.1.0.001
4. Floating Point Errors
In v9 for DB2 on z/OS, there appears to be a bug in the IBM ODBC driver where double-precision floating point values are not represented correctly. A workaround was coded in:
Version 9.0.1.0.001
Source: ITPUB blog, http://blog.itpub.net/161195/viewspace-1055038/ (please cite the source when reposting).