goldengate 中使用batchsql引數需要額外考慮的地方

msdnchina發表於2011-09-13
What Are The Considerations In Using "Batchsql"? [ID 1060546.1]

修改時間 04-APR-2011 型別 HOWTO 狀態 PUBLISHED

In this Document


Applies to:

Oracle GoldenGate - Version: 4.0.0 and later [Release: 4.0.0 and later ]
Information in this document applies to any platform.

Solution

Issue:
BatchSQL Implemantation and Limitations:

Solution Overview:
BATCHSQL is a Replicat parameter implementing a strategy that greatly increases replication speed and efficiency. It works by "batching" similar statements and applying them to the target database as a group. In optimal configurations (ones where the data consists of a large number of like statements), BATCHSQL can increase replication speed from a few hundred I/Os per second to thousands of I/Os per second.

Some statement types cannot be processed in batches and must be processed as exceptions. When BATCHSQL encounters them, it flushes everything in the batch, applies the exceptions in the normal manner, and then resumes batch processing.

Solution Details:
There are some guidelines to follow when choosing to use BATCHSQL. First, smaller row changes will show a higher gain in performance than larger row changes. BATCHSQL is also indicated for smaller tables when Oracle or ODBC is used, and for both traditional batch and OLTP processing.

BATCHSQL does not work for everything, and is not selectable at a table level.

Finally, BATCHSQL is not a good choice when there is LOB data, when there may be primary key collisions or missing row conditions. These conditions are not optimal because they are the operations treated as exceptions as described earlier.


OTHER RESTRICTIONS
* APPLYNOOPUPDATES will cause empty updates to be applied to the database. This parameter should not be used with BATCHSQL.


* Do not use NUMTHREADS for Replicat on Windows.


RECOVERY MODE AND FALLBACK
When a BATCHSQL commit fails, BATCHSQL is toggled off and the data is reprocessed under GROUPTRANSOPS rules. If the error causing BATCHSQL to fail persists, then the processing of the group of data will also fail, and the data will be reprocessed one transaction at a time. This normal-mode fallback is the most important performance liability for BATCHSQL processing. In instances where exceptions occur, operations are executed three times before success is achieved, if achieved at all, greatly reducing throughput during the time of the exception processing.


IMPLEMENTATION RELEASE
BATCHSQL was implemented in GoldenGate version 7.x on some platforms and in version 8.0.2.0 on LINUX.

There are a number of parameter options available for BATCHSQL. Consult the current GoldenGate for Windows and UNIX Reference guide for an explanation.


CURRENT ISSUES (as of v95x releases)
There have been a number of issues that have been identified with BATCHSQL. This document addresses some of those.

1. Loss of Committed Data

Teradata has a limit on the number of SQL statements that can be in use concurrently. That number is 15 and is a Teradata limit. This limit is calculated as the sum of available open cursors and cache statements.

Cursors are represented by the GoldenGate parameter MAXSQLSTATEMENTS, which controls the number of prepared SQL statements that can be used by Replicat. The value of this parameter determines the number of open cursors that Replicat maintains. The default value for this parameter is too large to be supported by Teradata. It is generally set to a value of 1. Once this value is set to 1, there remains a maximum value of 14 for cache statements. This value is represented by the CACHESTATEMENTS option of the BATCHSQL parameter. There is a further limitation in GoldenGate version 8.0.4.x in that there is no consolidated caching algorithm. Therefore, the BATCHSQL cache holds up to 13 active statements.

The usual parameter settings are:
BATCHSQL CACHESTATEMENTS 13
MAXSQLSTATEMENTS 1

If the MAXSQLSTATEMENTS value is increased, the CACHESTATEMENTS value must be decremented by the equivalent amount:

BATCHSQL CACHESTATEMENTS 10
MAXSQLSTATEMENTS 4

If you exceed these values on a LINUX machine writing by means of ODBC to Teradata, data will show as committed and appear as written in the Replicat report, but the Teradata DBMS will actually roll back data, corrupting the target. There will be no errors or warnings reported indicating this loss of data.

If you exceed these values on a Windows machine writing via ODBC to Teradata, Replicat will abend, resulting in a failed Replicat but no loss of data.

The above is true regardless of code version up to GoldenGate version 9.x.

This behavior results from the DBMS getting an error (3130 - response limit exceeded) an error the DBMS does not return to the ODBC. This has since been corrected by Teradata in ODBC versions 3.5.0.3 and above for Windows and 3.5.0.7 (not yet generally available) for LINUX.

As of GoldenGate version 9.0, the two parameters are synonymous and indicate the number of prepared SQL statements cached by the REPLICAT process for either Batch or non-batch modes. As of the v9 code, if the user is using an ODBC level below that of the Teradata fix, then either of these two parameters can be specified. One should be so that the default of 250 is not assumed and used. Still, the total value of one or both of the two parameters may not exceed 15.

2. Insertion of Garbage Data
As of GoldenGate version 9.x, a bug was introduced so that when BATCHSQL fails and falls back to normal mode processing, garbage data is inserted into the target table. This is fixed in 9.0.1.0.001. This is known to affect SQLServer and may affect DB2, Teradata, and any other ODBC-based Replicat.

3. Memory Leaks
As of GoldenGate version 9.0 for Teradata and SQL Server, certain columns can be dropped from UPDATE statements for performance reasons or because of database limitations. When BATCHSQL is enabled, this results in a memory leak. This bug was fixed in:

Version 9.0.0.9.001
Version 9.0.1.0.001

4. Floating point errors
In v9 for DB2 on z/os, there appears to be a bug in the IBM ODBC driver where double-precision floating point values are not represented correctly. A workaround was coded in:
Version 9.0.1.0.001[@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/161195/viewspace-1055038/,如需轉載,請註明出處,否則將追究法律責任。

相關文章