[20181026]12c Attribute Clustering特性.txt

lfree發表於2018-10-26

[20181026]12c Attribute Clustering特性.txt

--//作者很早的演示,連結:
--//我當時的測試環境12.1.0.1,還不支援這個特性,一直沒有機會測試:
--//我當時想到底有多少人會使用這個特性整理組織表資料.

--//首先簡單介紹:
Attribute Clustering實際上按照特定的欄位組織表,使相同的值聚集起來,總的來講就是減少索引的聚集因子.

One of the really cool new features introduced in 12.1.0.2 is Attribute Clustering. This new table based attribute
allows you to very easily cluster data in close physical proximity based on the content of specific columns.

--//直接拿作者的例子測試.

1.環境:
SCOTT@test01p> @ ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.2.0.1.0     Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0

2.測試:
SCOTT@test01p> create table ziggy (id number, code number, name varchar2(30));
Table created.

SCOTT@test01p> insert into ziggy select rownum, mod(rownum,100), 'DAVID BOWIE' from dual connect by level <= 200000;
200000 rows created.
--//我減少記錄數200000.
SCOTT@test01p> commit;
Commit complete.

SCOTT@test01p> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

SCOTT@test01p> create index ziggy_code_i on ziggy(code);
Index created.

SCOTT@test01p> select index_name, clustering_factor, num_rows from user_indexes where index_name='ZIGGY_CODE_I';
INDEX_NAME           CLUSTERING_FACTOR   NUM_ROWS
-------------------- ----------------- ----------
ZIGGY_CODE_I                     68878     200000

--//說明這樣建立的索引(欄位code)的clustering_factor會很高.

3.執行sql語句看看:
SCOTT@test01p> alter session set statistics_level=all;
Session altered.

SCOTT@test01p> set FEEDBACK only
SCOTT@test01p> select * from ziggy where code = 42;
        ID       CODE NAME
---------- ---------- --------------------

2000 rows selected.

SCOTT@test01p> set FEEDBACK 6
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  9svg0v4s134jh, child number 0
-------------------------------------
select * from ziggy where code = 42
Plan hash value: 2421001569
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name  | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |      1 |        |       |   206 (100)|          |   2000 |00:00:00.01 |     735 |
|*  1 |  TABLE ACCESS FULL| ZIGGY |      1 |   2000 | 40000 |   206   (1)| 00:00:01 |   2000 |00:00:00.01 |     735 |
---------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1 / ZIGGY@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CODE"=42)

--//可以發現群集因子太高了,oracle選擇全表掃描。

SCOTT@test01p> set FEEDBACK only
SCOTT@test01p> select /*+ index (ziggy, ziggy_code_i) */ * from ziggy where code = 42;
        ID       CODE NAME
---------- ---------- --------------------

2000 rows selected.

SCOTT@test01p> set FEEDBACK 6

SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  568j8hhfby3r4, child number 0
-------------------------------------
select /*+ index (ziggy, ziggy_code_i) */ * from ziggy where code = 42
Plan hash value: 3294205578
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |      1 |        |       |   693 (100)|          |   2000 |00:00:00.16 |     712 |      8 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY        |      1 |   2000 | 40000 |   693   (0)| 00:00:01 |   2000 |00:00:00.16 |     712 |      8 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_CODE_I |      1 |   2000 |       |     4   (0)| 00:00:01 |   2000 |00:00:00.16 |      16 |      8 |
-------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1 / ZIGGY@SEL$1
   2 - SEL$1 / ZIGGY@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CODE"=42)

--//我建立的資料集合偏小,buffers差別不大。但是cost=693比全表掃描206多。

4.建立新表具有Attribute Clustering.

SCOTT@test01p> create table ziggy2 (id number, code number, name varchar2(30)) clustering by linear order (code) without materialized zonemap;
Table created.

--//連結的說明:
The CLUSTERING BY LINEAR ORDER clause orders data in the table based on the specified columns, in this case the CODE
column. Up to 10 columns can be included using this particular technique (there are other attribute clustering options
which I'll again cover in later articles, yes I'll be writing quite a few new articles) :) WITHOUT MATERIALIZED ZONEMAP
means I don't want to create these new Zone Maps index structures at this stage which could potentially reduce the
amount of table storage needed to be accessed (again, I'll discuss these at another time).

--//實際上CLUSTERING BY LINEAR ORDER就是建立表按照code組織.欄位最多包括10個.
--//按照作著介紹必須採用直接路徑插入匯入資料才有attribute clustering或者選擇重新組織表.

SCOTT@test01p> insert /*+ append */ into ziggy2 select * from ziggy;
200000 rows created.

SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  0arqdyc9vznpg, child number 0
-------------------------------------
insert /*+ append */ into ziggy2 select * from ziggy
Plan hash value: 1975011999
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name   | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |        |      1 |        |       |       |  1425 (100)|          |      0 |00:00:00.62 |    3691 |     16 |    691 |       |       |          |
|   1 |  LOAD AS SELECT                  | ZIGGY2 |      1 |        |       |       |            |          |      0 |00:00:00.62 |    3691 |     16 |    691 |  2068K|  2068K| 2068K (0)|
|   2 |   OPTIMIZER STATISTICS GATHERING |        |      1 |    200K|  3906K|       |  1425   (1)| 00:00:01 |    200K|00:00:00.49 |    2395 |     16 |      0 |   256K|   256K|          |
|   3 |    SORT ORDER BY                 |        |      1 |    200K|  3906K|  6288K|  1425   (1)| 00:00:01 |    200K|00:00:00.08 |     725 |      0 |      0 |  8912K|  1165K| 7921K (0)|
|   4 |     TABLE ACCESS FULL            | ZIGGY  |      1 |    200K|  3906K|       |   206   (1)| 00:00:01 |    200K|00:00:00.01 |     725 |      0 |      0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   4 - SEL$1 / ZIGGY@SEL$1
--//注意看id=3的執行計劃SORT ORDER BY,隱含排序code欄位匯入表.
SCOTT@test01p> commit;
Commit complete.

SCOTT@test01p> create index ziggy2_code_i on ziggy2(code);
Index created.

SCOTT@test01p> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY2',estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

SCOTT@test01p> select index_name, clustering_factor, num_rows from user_indexes where index_name='ZIGGY2_CODE_I';
INDEX_NAME           CLUSTERING_FACTOR   NUM_ROWS
-------------------- ----------------- ----------
ZIGGY2_CODE_I                      691     200000

--//前面CLUSTERING_FACTOR=68878,而現在表ZIGGY2才691。

SCOTT@test01p> set FEEDBACK only
SCOTT@test01p> select * from ziggy2 where code=42;
        ID       CODE NAME
---------- ---------- --------------------

2000 rows selected.

SCOTT@test01p> set FEEDBACK 6
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  0vjk3yq2ks48x, child number 0
-------------------------------------
select * from ziggy2 where code=42
Plan hash value: 16801974
-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |      1 |        |       |    11 (100)|          |   2000 |00:00:00.01 |      34 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY2        |      1 |   2000 | 40000 |    11   (0)| 00:00:01 |   2000 |00:00:00.01 |      34 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY2_CODE_I |      1 |   2000 |       |     4   (0)| 00:00:01 |   2000 |00:00:00.01 |      16 |
-----------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1 / ZIGGY2@SEL$1
   2 - SEL$1 / ZIGGY2@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CODE"=42)

--//oracle執行計劃選擇索引範圍掃描。

5.對於ziggy表可以修改屬性,重新組織表:

SCOTT@test01p> alter table ziggy add clustering by linear order(code) without materialized zonemap;
Table altered.

SCOTT@test01p> alter table ziggy move;
Table altered.

SCOTT@test01p> alter index ziggy_code_i rebuild;
Index altered.

SCOTT@test01p> select index_name, clustering_factor, num_rows from user_indexes where index_name='ZIGGY_CODE_I';
INDEX_NAME           CLUSTERING_FACTOR   NUM_ROWS
-------------------- ----------------- ----------
ZIGGY_CODE_I                       691     200000

--//這樣執行select * from ziggy where code=42;必然選擇索引,不貼出執行計劃了。

6.總結:
--//我覺得應該很少有人使用這個特性,總覺得這樣僅僅減少手工操作的麻煩.比如你想手工按照code匯入.也許操作需要許多步驟.

--//補充測試:
SCOTT@test01p> select table_name,clustering from dba_tables where owner=user and table_name in ('ZIGGY','ZIGGY2','DEPT');
TABLE_NAME           CLU
-------------------- ---
DEPT                 NO
ZIGGY2               YES
ZIGGY                YES

SCOTT@test01p> @ddl scott.ZIGGY
C100
------------------------------------------------------------------------
  CREATE TABLE "SCOTT"."ZIGGY"
   (    "ID" NUMBER,
        "CODE" NUMBER,
        "NAME" VARCHAR2(30)
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"
 CLUSTERING
 BY LINEAR ORDER ("SCOTT"."ZIGGY"."CODE")
   YES ON LOAD  YES ON DATA MOVEMENT
 WITHOUT MATERIALIZED ZONEMAP ;

--//不知道那個表記錄 CLUSTERING BY LINEAR ORDER ("SCOTT"."ZIGGY"."CODE")資訊。先暫時放一下。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2217739/,如需轉載,請註明出處,否則將追究法律責任。

相關文章