插入排序雜湊CLUSTER表出現ORA-7445錯誤

yangtingkun發表於2009-07-12

第一次測試排序雜湊CLUSTER就碰到ORA-7445錯誤。不過也難怪,一般新特性的bug都比較多。

 

 

資料庫版本10.2.0.3 for Windows x86

SQL> CONN YANGTK/YANGTK
已連線。
SQL> SELECT * FROM V$VERSION;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> CREATE CLUSTER C_HASH_SORT
  2  (TYPE VARCHAR2(19), CREATED DATE SORT)
  3  HASHKEYS 5806 SIZE 100;

簇已建立。

SQL> CREATE TABLE T_HASH_SORT
  2  (OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS)
  3  CLUSTER C_HASH_SORT (OBJECT_TYPE, CREATED)
  4  AS SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  5  FROM DBA_OBJECTS;

表已建立。

SQL> SELECT COUNT(*) FROM DBA_OBJECTS;

  COUNT(*)
----------
     71194

SQL> SELECT COUNT(*) FROM T_HASH_SORT;

  COUNT(*)
----------
     46461

SQL> DELETE T_HASH_SORT;

已刪除46461行。

SQL> COMMIT;

提交完成。

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  3  FROM DBA_OBJECTS;
INSERT INTO T_HASH_SORT
*
1 行出現錯誤:
ORA-03113:
通訊通道的檔案結束

利用CREATE AS SELECT方式建立HASH SORT CLUSTER TABLE後,發現表中的記錄數與DBA_OBJECTS中的不符。刪除表中記錄,執行INSERT語句,出現了ORA-3113錯誤,後臺alert檔案中出現了ORA-7445錯誤:

Errors in file e:\oracle\admin\ytk102\udump\ytk102_ora_4960.trc:
ORA-07445:
出現異常錯誤: 核心轉儲 [ACCESS_VIOLATION] [_kdtwrp+174] [PC:0x1D0E8D2] [ADDR:0x0] [UNABLE_TO_READ] []

對應的trace檔案中資訊如下:

ksedmp: internal or fatal error
ORA-07445:
出現異常錯誤: 核心轉儲 [ACCESS_VIOLATION] [_kdtwrp+174] [PC:0x1D0E8D2] [ADDR:0x0] [UNABLE_TO_READ] []
Current SQL statement for this session:
INSERT INTO T_HASH_SORT
SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
FROM DBA_OBJECTS
----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
_kdqs_create_queue_  CALLrel  _kdtwrp+0            942A1EC
id+1027                                           
_kdqs_enqueue_buf+6  CALLrel  _kdqs_create_queue_ 
87                            id+0                
_kdqs_enqueue_flush  CALLrel  _kdqs_enqueue_buf+0  7C5C4C0 32F13ACC 1 942B6F8
+1146                                              942B894 942B890
_kdtFlushBuf+4982    CALLrel  _kdqs_enqueue_flush  7C5C4C0
                              +0                  
_insflush+527        CALLrel  _kdtFlushBuf+0       7B99D9C
_insrow+480          CALLrel  _insflush+0          7B99D9C 0 1 942D0F0 0
_insdrv+2229         CALLrel  _insrow+0            7B99D9C 942D0F0 0
_inscovexe+452       CALLrel  _insdrv+0            7B99D9C
_insExecStmtExecIni  CALL???  00000000             2A699A74 2A68F50C 942D718
Engine+55                                         
_insexe+349          CALLrel  _insExecStmtExecIni  2A699A74 2A68F50C 942D718
                              Engine+0            
_opiexe+22152        CALLrel  _insexe+0            2A69986C 942D718

由於這是10g的新功能,存在bug很正常。於是找了一個10.2.0.3版本的資料庫,看看是否出現同樣的問題:

SQL> CREATE CLUSTER C_HASH_SORT
  2  (TYPE VARCHAR2(19), CREATED DATE SORT)
  3  HASHKEYS 5806 SIZE 100;

Cluster created.

SQL> CREATE TABLE T_HASH_SORT
  2  (OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS)
  3  CLUSTER C_HASH_SORT (OBJECT_TYPE, CREATED)
  4  AS SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  5  FROM DBA_OBJECTS;

Table created.

SQL> SELECT COUNT(*) FROM DBA_OBJECTS;

  COUNT(*)
----------
     71466

SQL> SELECT COUNT(*) FROM T_HASH_SORT;

  COUNT(*)
----------
     45819

SQL> TRUNCATE TABLE T_HASH_SORT;
TRUNCATE TABLE T_HASH_SORT
               *
ERROR at line 1:
ORA-03292: Table to be truncated is part of a cluster


SQL> DELETE T_HASH_SORT;

45819 rows deleted.

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  3  FROM DBA_OBJECTS;
INSERT INTO T_HASH_SORT
            *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kdqs_get_new_page_reuse_link], [], [], [], [], [], [], []

問題可以重現,不過這次錯誤資訊不同了,變成了ORA-600錯誤了。

ORA-00600: internal error code, arguments: [kdqs_get_new_page_reuse_link], [], [], [], [], [], [], []
Current SQL statement for this session:
INSERT INTO T_HASH_SORT
SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
FROM DBA_OBJECTS
----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFF48C0 ? 7FBFFF4920 ?
                                                   7FBFFF4860 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFF48C0 ? 7FBFFF4920 ?
                                                   7FBFFF4860 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFF48C0 ? 7FBFFF4920 ?
                                                   7FBFFF4860 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFF48C0 ? 7FBFFF4920 ?
                                                   7FBFFF4860 ? 000000000 ?
kgeasnmierr()+163    call     kgerinv()            006618E20 ? 2A972D1168 ?
                                                   7FBFFF4920 ? 7FBFFF4860 ?
                                                   000000000 ? 000000000 ?
kdqs_check_page()+4  call     kgeasnmierr()        006618E20 ? 2A972D1168 ?
06                                                 7FBFFF4920 ? 7FBFFF4860 ?
                                                   000000000 ? 000000001 ?
kdqs_new_page_cb()+  call     kdqs_check_page()    006618E20 ? 7FBFFF5D10 ?
40                                                 7FBFFF5E00 ? 7FBFFF4860 ?
                                                   000000000 ? 000000001 ?
ktspfpblk()+395      call     kdqs_new_page_cb()   7FBFFF5D10 ? 7FBFFF56F0 ?
                                                   7FBFFF5E00 ? 7FBFFF4860 ?
                                                   000000000 ? 000000001 ?

根據錯誤資訊的描述,懷疑問題可能和HASHKEYS設定過小有關:

SQL> DROP CLUSTER C_HASH_SORT INCLUDING TABLES;

Cluster dropped.

SQL> CREATE CLUSTER C_HASH_SORT
  2  (TYPE VARCHAR2(19), CREATED DATE SORT)
  3  HASHKEYS 70000 SIZE 100;

Cluster created.

SQL> CREATE TABLE T_HASH_SORT
  2  (OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS)
  3  CLUSTER C_HASH_SORT (OBJECT_TYPE, CREATED)
  4  AS SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  5  FROM DBA_OBJECTS;

Table created.

SQL> SELECT COUNT(*) FROM DBA_OBJECTS;

  COUNT(*)
----------
     71466

SQL> SELECT COUNT(*) FROM T_HASH_SORT;

  COUNT(*)
----------
      7756

SQL> DELETE T_HASH_SORT;

7756 rows deleted.

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  3  FROM DBA_OBJECTS;
INSERT INTO T_HASH_SORT
            *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kdqs_get_new_page_reuse_link], [], [], [], [], [], [], []

看來和HASHKEYS的設定沒有多大關係,而且由於HASHKEYS設定變大,CREATE TABLE AS SELECT插入的記錄數反而減少了。

檢查語法後發現CREATE TABLE的時候,沒有指定列的SORT熟悉,新增後,10.2.0.3環境錯誤依舊,倒是10.2.0.1windows環境,錯誤資訊又發生了變化:

SQL> DROP CLUSTER C_HASH_SORT INCLUDING TABLES;

簇已刪除。

SQL> CREATE CLUSTER C_HASH_SORT
  2  (TYPE VARCHAR2(19), CREATED DATE SORT)
  3  HASHKEYS 5806 SIZE 100;

簇已建立。

SQL> CREATE TABLE T_HASH_SORT
  2  (OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED SORT, STATUS)
  3  CLUSTER C_HASH_SORT (OBJECT_TYPE, CREATED)
  4  AS SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  5  FROM DBA_OBJECTS;

表已建立。

SQL> SELECT COUNT(*) FROM DBA_OBJECTS;

  COUNT(*)
----------
     71197

SQL> SELECT COUNT(*) FROM T_HASH_SORT;

  COUNT(*)
----------
     46461

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_ID, OWNER, OBJECT_NAME, OBJECT_TYPE, CREATED, STATUS
  3  FROM DBA_OBJECTS;
INSERT INTO T_HASH_SORT
            *
1 行出現錯誤:
ORA-00600:
內部錯誤程式碼, 引數: [kdqs_read_link no meta-data], [], [], [], [],
[], [], []

11gSolaris上測試,發現和WINDOWS環境最後的錯誤資訊一致:ORA-00600: 內部錯誤程式碼, 引數: [kdqs_read_link no meta-data], [], [], [], [], [], [], []

嘗試簡化例子:

SQL> DROP CLUSTER C_HASH_SORT INCLUDING TABLES;

Cluster dropped.

SQL> CREATE CLUSTER C_HASH_SORT
  2  (TYPE VARCHAR2(19), CREATED DATE SORT)
  3  HASHKEYS 5806 SIZE 100;

Cluster created.

SQL> CREATE TABLE T_HASH_SORT
  2  (OBJECT_TYPE VARCHAR2(19), CREATED DATE SORT, OBJECT_NAME VARCHAR2(30))
  3  CLUSTER C_HASH_SORT (OBJECT_TYPE, CREATED);

Table created.

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_TYPE, CREATED, OBJECT_NAME
  3  FROM DBA_OBJECTS;

這次SQL沒有報錯,但是INSERT語句執行了超過2個小時,產生了幾G的歸檔後,被我手工中止了。

檢查發現這個插入的會話所有等待的時間基本上都與日誌有關,基本上不是log file switch completion就是log buffer space。但是觀察表空間的佔用,卻並發現表並沒有持續增大,而是保持原始的大小沒有變化。

看來是Oracle在處理插入的時候,由於要保證順序,轉移行的位置的過程出現了死迴圈,導致插入一直無法結束。

而如果將HASH列換成NUMBER型別,則不會出現問題:

SQL> DROP CLUSTER C_HASH_SORT INCLUDING TABLES;

Cluster dropped.

SQL> CREATE CLUSTER C_HASH_SORT
  2  (ID NUMBER, CREATED DATE SORT)
  3  HASH IS ID HASHKEYS 1000 SIZE 65;

Cluster created.

SQL> CREATE TABLE T_HASH_SORT
  2  (ID NUMBER, CREATED DATE SORT, NAME VARCHAR2(30))
  3  CLUSTER C_HASH_SORT (ID, CREATED);

Table created.

SQL> INSERT INTO T_HASH_SORT
  2  SELECT OBJECT_ID, CREATED, OBJECT_NAME
  3  FROM DBA_OBJECTS;

70730 rows created.

SQL> SELECT COUNT(*) FROM DBA_OBJECTS;

  COUNT(*)
----------
     70730

但是即使是將HASH列設定為NUMBER型別,CREATE TABLE AS SELECT方式建立的表仍然存在問題:

SQL> DROP TABLE T_HASH_SORT;

Table dropped.

SQL> CREATE TABLE T_HASH_SORT
  2  (ID, CREATED SORT, NAME)
  3  CLUSTER C_HASH_SORT (ID, CREATED)
  4  AS SELECT OBJECT_ID, CREATED, OBJECT_NAME
  5  FROM DBA_OBJECTS;

Table created.

SQL> SELECT COUNT(*) FROM T_HASH_SORT;

  COUNT(*)
----------
         0

看來HASH SORT CLUSTER的問題還真不是一點半點,從10gr1推出的新特性,居然在11g中還存在這麼多的問題,看來基本上這個功能沒有什麼人使用。不過Oracle的測試也夠差勁的,這麼多的問題居然就釋出出來了。

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/4227/viewspace-608928/,如需轉載,請註明出處,否則將追究法律責任。

相關文章