oracle全文索引之About_INDEX_THEMES操作
Oracle的全文索引除了支援文字匹配查詢,還支援對文章涵義的查詢。這是透過ABOUT操作實現的。[@more@]
Oracle預設情況下支援英文和法文的文章涵義支援。其他語言可以新增透過使用者自定義的語言庫來實現相應的功能。
在這之前需要先安裝Companion CD 的Oracle Database 10g Products Installation Type,其中包括了Oracle Text Supplied Knowledge Bases,否則在建立含有about的索引時會出錯。Companion CD可直接在官網下載,安裝也非常簡單。
下面看看英文環境下簡單的ABOUT操作用法:
SQL> conn myuser/myuser
Connected.
SQL> CREATE TABLE T (ID NUMBER PRIMARY KEY, DOCS VARCHAR2(1000));
Table created.
SQL> INSERT INTO T VALUES (1, 'You can augment the knowledge base to define concepts and terms specific to your
industry or query application. When you do so, ABOUT queries are more precise for the added concepts.');
1 row created.
SQL> INSERT INTO T VALUES (2, 'ABOUT queries perform best when you create a theme component in your index. Theme
components are created by default for English and French.');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT;
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(queries)') > 0;
ID
----------
1
2
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'queries') > 0;
ID
----------
1
2
預設的LEXER是BASIC_LEXER,因此,可以ABOUT操作已經可以起作用了,但是由於採用預設設定,ABOUT查詢的精度不高。雖然可以查詢到,但是這個單詞本身就是在文章中出現的,如果使用文章中沒有出現的單詞,ABOUT還是查詢不到。
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
no rows selected
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'definition') > 0;
no rows selected
如果希望提高ABOUT操作的查詢精度,可以設定BASIC_LEXER中的INDEX_THEMES屬性的值為YES。
SQL> BEGIN
2 CTX_DDL.CREATE_PREFERENCE('TEST_ABOUT', 'BASIC_LEXER');
3 CTX_DDL.SET_ATTRIBUTE('TEST_ABOUT', 'INDEX_THEMES', 'YES');
4 CTX_DDL.SET_ATTRIBUTE('TEST_ABOUT', 'INDEX_TEXT', 'YES');
5 END;
6 /
SQL> DROP INDEX IND_T_DOCS;
Index dropped.
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT
2 PARAMETERS ('LEXER TEST_ABOUT');
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
ID
----------
1
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'specifics') > 0;
no rows selected
現在已經可以看出差別來了,文章中並沒有包含definition單詞,但是Oracle自動生成了關於definition的主題。
下面再插入一些測試資料,試試主題搜尋的使用
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (3, 'Los Angeles is a city of american');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (4, 'Los Angeles');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (5, 'San Francisco is a city of american too');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (6, 'ABOUT queries perform best when you create a theme component in your index.San Francisco is a
city of american too');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (7, 'is Los Angeles');
1 row created.
SQL> commit;
Commit complete.
SQL> exec ctx_ddl.sync_index('IND_T_DOCS');
PL/SQL procedure successfully completed.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(California)') > 0;
ID
---
3
5
6
由於Los Angeles和San Francisco是所屬California洲,所以當查California時,Los Angeles和San Francisco的資料被查出來,但為何第4和第7筆資料不符合條件?可能是因為這些資料沒有意義。
順便再簡單說一下STOPTHEME,和STOPWORD類似,Oracle提供了停用主題STOPTHEME,可以人為的遮蔽不希望被ABOUT操作查詢到的主題。
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
ID
---
1
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(specifics)') > 0;
ID
---
1
SQL> drop index IND_T_DOCS;
Index dropped.
SQL> BEGIN
2 CTX_DDL.CREATE_STOPLIST('TEST_BASIC', 'BASIC_STOPLIST');
3 CTX_DDL.ADD_STOPTHEME('TEST_BASIC','DEFINITION');
4 END;
5 /
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT
2 PARAMETERS ('LEXER TEST_ABOUT STOPLIST TEST_BASIC');
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
no rows selected
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(specifics)') > 0;
ID
---
1
上面的例子中,將DEFINITION新增到了TEST_BASIC停用表中,重建索引後,停用主題開始生效。
Oracle預設情況下支援英文和法文的文章涵義支援。其他語言可以新增透過使用者自定義的語言庫來實現相應的功能。
在這之前需要先安裝Companion CD 的Oracle Database 10g Products Installation Type,其中包括了Oracle Text Supplied Knowledge Bases,否則在建立含有about的索引時會出錯。Companion CD可直接在官網下載,安裝也非常簡單。
下面看看英文環境下簡單的ABOUT操作用法:
SQL> conn myuser/myuser
Connected.
SQL> CREATE TABLE T (ID NUMBER PRIMARY KEY, DOCS VARCHAR2(1000));
Table created.
SQL> INSERT INTO T VALUES (1, 'You can augment the knowledge base to define concepts and terms specific to your
industry or query application. When you do so, ABOUT queries are more precise for the added concepts.');
1 row created.
SQL> INSERT INTO T VALUES (2, 'ABOUT queries perform best when you create a theme component in your index. Theme
components are created by default for English and French.');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT;
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(queries)') > 0;
ID
----------
1
2
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'queries') > 0;
ID
----------
1
2
預設的LEXER是BASIC_LEXER,因此,可以ABOUT操作已經可以起作用了,但是由於採用預設設定,ABOUT查詢的精度不高。雖然可以查詢到,但是這個單詞本身就是在文章中出現的,如果使用文章中沒有出現的單詞,ABOUT還是查詢不到。
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
no rows selected
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'definition') > 0;
no rows selected
如果希望提高ABOUT操作的查詢精度,可以設定BASIC_LEXER中的INDEX_THEMES屬性的值為YES。
SQL> BEGIN
2 CTX_DDL.CREATE_PREFERENCE('TEST_ABOUT', 'BASIC_LEXER');
3 CTX_DDL.SET_ATTRIBUTE('TEST_ABOUT', 'INDEX_THEMES', 'YES');
4 CTX_DDL.SET_ATTRIBUTE('TEST_ABOUT', 'INDEX_TEXT', 'YES');
5 END;
6 /
SQL> DROP INDEX IND_T_DOCS;
Index dropped.
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT
2 PARAMETERS ('LEXER TEST_ABOUT');
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
ID
----------
1
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'specifics') > 0;
no rows selected
現在已經可以看出差別來了,文章中並沒有包含definition單詞,但是Oracle自動生成了關於definition的主題。
下面再插入一些測試資料,試試主題搜尋的使用
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (3, 'Los Angeles is a city of american');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (4, 'Los Angeles');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (5, 'San Francisco is a city of american too');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (6, 'ABOUT queries perform best when you create a theme component in your index.San Francisco is a
city of american too');
1 row created.
SQL> Insert into MYUSER.T (ID, DOCS)
2 Values (7, 'is Los Angeles');
1 row created.
SQL> commit;
Commit complete.
SQL> exec ctx_ddl.sync_index('IND_T_DOCS');
PL/SQL procedure successfully completed.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(California)') > 0;
ID
---
3
5
6
由於Los Angeles和San Francisco是所屬California洲,所以當查California時,Los Angeles和San Francisco的資料被查出來,但為何第4和第7筆資料不符合條件?可能是因為這些資料沒有意義。
順便再簡單說一下STOPTHEME,和STOPWORD類似,Oracle提供了停用主題STOPTHEME,可以人為的遮蔽不希望被ABOUT操作查詢到的主題。
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
ID
---
1
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(specifics)') > 0;
ID
---
1
SQL> drop index IND_T_DOCS;
Index dropped.
SQL> BEGIN
2 CTX_DDL.CREATE_STOPLIST('TEST_BASIC', 'BASIC_STOPLIST');
3 CTX_DDL.ADD_STOPTHEME('TEST_BASIC','DEFINITION');
4 END;
5 /
SQL> CREATE INDEX IND_T_DOCS ON T(DOCS) INDEXTYPE IS CTXSYS.CONTEXT
2 PARAMETERS ('LEXER TEST_ABOUT STOPLIST TEST_BASIC');
Index created.
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(definition)') > 0;
no rows selected
SQL> SELECT ID FROM T WHERE CONTAINS(DOCS, 'ABOUT(specifics)') > 0;
ID
---
1
上面的例子中,將DEFINITION新增到了TEST_BASIC停用表中,重建索引後,停用主題開始生效。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/271283/viewspace-1022558/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- oracle全文索引之commit與DML操作Oracle索引MIT
- oracle全文索引之STORAGE PREFERENCEOracle索引
- oracle全文索引之WORDLIST PREFERENCEOracle索引
- oracle全文索引之幾個關鍵表Oracle索引
- oracle全文索引之如何實現查詢Oracle索引
- oracle全文索引之STOPLIST_4_MULTI_STOPLISTOracle索引
- oracle全文索引之STOPLIST_3_DEFAULT_STOPLISTOracle索引
- oracle全文索引之STOPLIST_2_EMPTY_STOPLISTOracle索引
- oracle全文索引之STOPLIST_1_BASIC_STOPLISTOracle索引
- oracle全文索引之LEXER_4_MULTI_LEXEROracle索引
- oracle全文索引之LEXER_3_DEFAULT_LEXEROracle索引
- oracle全文索引之LEXER_2_CHINESE_LEXEROracle索引
- oracle全文索引之LEXER_1_BASIC_LEXEROracle索引
- oracle全文索引之FILTER_4_PROCEDURE_FILTEROracle索引Filter
- oracle全文索引之FILTER_3_FORMAT_COLUMNOracle索引FilterORM
- oracle全文索引之FILTER_1_NULL_FILTEROracle索引FilterNull
- oracle全文索引之datastore_6_NESTED_DATASTOREOracle索引AST
- oracle全文索引之datastore_5_detail_datastoreOracle索引ASTAI
- oracle全文索引之datastore_4_URL_DATASTOREOracle索引AST
- oracle全文索引之datastore_3_FILE_DATASTOREOracle索引AST
- oracle全文索引之datastore_1_DIRECT_DATASTOREOracle索引AST
- Oracle:全文索引Oracle索引
- oracle全文索引之配置全文檢索環境Oracle索引
- oracle 之全文索引表的分割槽交換案例Oracle索引
- oracle全文索引之STOPLIST_ CTXCAT 索引_INDEX SETOracle索引Index
- oracle全文索引之datastore_2_MULTI_COLUMN_DATASTOREOracle索引AST
- Oracle的全文索引Oracle索引
- ZT oracle全文索引Oracle索引
- oracle全文索引之SECTION GROUP_6_PATH_SECTION_GROUPOracle索引
- oracle全文索引之SECTION GROUP_5_AUTO_SECTION_GROUPOracle索引
- oracle全文索引之SECTION GROUP_4_XML_SECTION_GROUPOracle索引XML
- oracle全文索引之SECTION GROUP_3_HTML_SECTION_GROUPOracle索引HTML
- oracle全文索引之SECTION GROUP_2_BASIC_SECTION_GROUPOracle索引
- oracle全文索引之SECTION GROUP_1_NULL_SECTION_GROUPOracle索引Null
- oracle全文索引之FILTER_2_INSO_FILTEROracle索引Filter
- oracle 全文索引的配置Oracle索引
- MongoDB之索引(全文索引)MongoDB索引
- oracle全文索引之同步和優化索引做了什麼Oracle索引優化