oracle全文索引之STOPLIST_ CTXCAT 索引_INDEX SET

jolly10發表於2009-05-27
這篇文章介紹的是ORACLE CTXCAT索引的INDEX SET,也是CTXCAT索引特有的屬性。[@more@]
CTXCAT 索引是CONTEXT索引的簡化版,CTXCAT索引支援的PREFERENCE包括:LEXER、STOPLIST、WORDLIST和STORAGE 引數。不支援其他的引數如:DATASTORE、FILTER、SECTION GROUP。雖然支援LEXER但不支援THEME查詢,而且不支援FORMAT、CHARSET和LANGUAGE列,另外不支援表和索引分割槽。

CTXCAT 索引僅僅包含了CONTEXT索引的部分內容,但是CTXCAT索引有其自身的優點。其中最突出的優點就是支援DML同步。CONTEXT索引由於結構過於複雜,且索引的資料量一般較大,因此CONTEXT索引並不是自動同步的。而CTXCAT索引是自動同步的,當發生了DML修改時,Oracle會自動同步CTXCAT索引,降低了索引的維護成本。

CTXCAT 索引的另外一個優點就是這裡要介紹的INDEX SET屬性,這也是CTXCAT索引特有的屬性。簡單的說,CTXCAT可以建立一個索引集。可以把一些經常與CTXCAT查詢組合使用的查詢列的索引新增到索引集中。比如,如果在查詢文章內容的同時,經常需要查詢文章的作者、標題或建立時間等資訊,則可以將這些資訊列的索引新增到索引集中,Oracle 可以將這些查詢封裝到CATSEARCH操作中,從而提高全文索引的效率。

下面看一個簡單的例子:

[oracle@rhel140 ~]$ sqlplus myuser/myuser

SQL*Plus: Release 10.2.0.1.0 - Production on Sun May 24 23:19:47 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> CREATE TABLE T (ID NUMBER, TITLE VARCHAR2(256), CREATED DATE, DOCS VARCHAR2(4000));

Table created.

SQL> INSERT INTO T VALUES (1, 'ORACLE TEXT REFERENCE', TO_DATE('200203', 'YYYYMM'),
2 'This manual provides reference information for Oracle Text. Use it as a reference for
3 creating Oracle Text indexes, for issuing Oracle Text queries, for presenting
4 documents, and for using the Oracle Text PL/SQL packages.');

1 row created.

SQL> INSERT INTO T VALUES (2, 'ORACLE TEXT APPLICATION DEVELOPER''S GUIDE',
2 TO_DATE('200203', 'YYYYMM'),
3 'This guide explains how to build query applications with Oracle Text. This preface
4 contains these topics:');

1 row created.

SQL> INSERT INTO T VALUES (3, 'ORACLE SQL REFERENCE', TO_DATE('200210', 'YYYYMM'),
2 'This reference contains a complete description of the Structured Query Language
3 (SQL) used to manage information in an Oracle database. Oracle SQL is a superset
4 of the American National Standards Institute (ANSI) and the International
5 Standards Organization (ISO) SQL99 standard.');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT;

Index created.

SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;

ID
----------
1
3


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2036 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


SQL> CREATE INDEX IND_T_CREATED ON T (CREATED);

Index created.

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 2040116302

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 4 (25)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 2036 | 4 (25)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
| 3 | BITMAP AND | | | | | |
| 4 | BITMAP CONVERSION FROM ROWIDS| | | | | |
|* 5 | INDEX RANGE SCAN | IND_T_CREATED | 2 | | 1 (0)| 00:00:01 |
| 6 | BITMAP CONVERSION FROM ROWIDS| | | | | |
| 7 | SORT ORDER BY | | | | | |
|* 8 | DOMAIN INDEX | IND_T_DOCS | 2 | | | |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
8 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


上面的查詢不僅包括了對DOCS列的全文索引查詢還包括了對CREATED列的查詢。Oracle採用了先將BTREE索引轉化為BITMAP索引,在進行BITMAP索引的AND的方法進行了處理。而如果採用了INDEX SET的方法,則不需要如此複雜的轉化:

SQL> DROP INDEX IND_T_DOCS;

Index dropped.

SQL> DROP INDEX IND_T_CREATED;

Index dropped.


SQL> BEGIN
2 CTX_DDL.CREATE_INDEX_SET('TEST_INDEXSET');
3 CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'CREATED');
4 END;
5 /

PL/SQL procedure successfully completed.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');

Index created.

SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED = TO_DATE(''200203'', ''YYYYMM'')') > 0;

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL','CREATED = TO_DATE(''200203'',
''YYYYMM'')')>0)

Note
-----
- dynamic sampling used for this statement


處理支援結構性查詢條件外,使用INDEX SET索引集還可以支援索引列的排序操作

SQL> SET AUTOT OFF
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;

ID
----------
1
3

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'ORDER BY CREATED DESC') > 0;

ID
----------
3
1

這種將索引新增到索引集的操作也有一定的限制,比如結構性查詢的操作只能包含大於、等於、小於、IN、BETWEEN AND操作。另外,加入到索引集中索引列的大小也有限制,對於VARCHAR2和CHAR型別,大小不能超過30字元,否則會報錯:

用了不等於操作

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0;
SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0
*
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-10844: index cannot execute this structured predicate efficiently
DRG-10845: column CREATED ! is not indexed


建立超過30個字元的索引

SQL> DROP INDEX IND_T_DOCS;

Index dropped.

SQL> exec CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'TITLE');

PL/SQL procedure successfully completed.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');
CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
*
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-12306: column TITLE is too long for index set columns
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CATINDEXMETHODS", line 97


SQL> select id,length(title) from t;

ID LENGTH(TITLE)
---------- -------------
1 21
2 41
3 20


SQL> update t set title=substr(title,1,30) where id=2;
update t set title=substr(title,1,30) where id=2
*
ERROR at line 1:
ORA-29861: domain index is marked LOADING/FAILED/UNUSABLE


SQL> drop index IND_T_DOCS;

Index dropped.

SQL> update t set title=substr(title,1,30) where id=2;

1 row updated.

SQL> commit;

Commit complete.

SQL> ALTER TABLE MYUSER.T MODIFY(TITLE VARCHAR2(30 BYTE));

Table altered.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');

Index created.

將欄位長度改為varchar2(30)則可以建立

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/271283/viewspace-1022505/,如需轉載,請註明出處,否則將追究法律責任。

相關文章