MySQL index merge: how union and sort_union differ

Posted by gaopengtttt on 2016-11-20
Today I was reading the Index Merge Optimization section of the MySQL manual, and it prompted a few thoughts, which I record below.

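First, how the two methods differ.
Both work by using different secondary indexes of one table. Note that this applies to a single table.
merge union: for an OR condition, if the conditions cover all key parts of each secondary index, every index scan returns clustered index key values (ROWIDs) that are already sorted, so a simple union with duplicate removal is enough and no extra sort is needed.
             Source code interface: class QUICK_ROR_UNION_SELECT
merge sort_union: unlike the above, the conditions do not cover all key parts of the secondary index, so the clustered index key values (ROWIDs) must first be sorted before they can be unioned.
                  Source code interface: class QUICK_INDEX_MERGE_SELECT
Reference: MySQL manual, 9.2.1.4 Index Merge Optimization
In short, whenever MySQL cannot be sure that the primary keys arrive in sorted order, it needs an extra sort.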


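If you know the merge sort algorithm, you can see why this is necessary:
every run that takes part in a merge must already be sorted.

Think of 1 2 5 9 and 3 4 7 8 as two runs of primary key values: each has to be sorted before the final merge can be done.
The sort that produces those runs can itself be a merge sort or any other method, as long as the output is sorted. As an aside,
readers familiar with data structures will know that merge sort is also a good choice for external (on-disk) sorting.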
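
The merge step described above can be sketched in Python. This is only an illustration of merging sorted runs with duplicate removal, not MySQL's code; the function name union_sorted_rowids is made up for this example, and heapq.merge from the standard library stands in for the merge.

```python
import heapq


def union_sorted_rowids(*streams):
    """Merge already-sorted rowid streams and drop adjacent duplicates."""
    result = []
    for rowid in heapq.merge(*streams):
        # Duplicates are adjacent only because every input is sorted.
        if not result or result[-1] != rowid:
            result.append(rowid)
    return result


print(union_sorted_rowids([1, 2, 5, 9], [3, 4, 7, 8]))
# [1, 2, 3, 4, 5, 7, 8, 9]
```

If the inputs are not sorted, for example two streams that both arrive as 1 3 2, the same function returns a list that is neither sorted nor free of duplicates, which is why unsorted rowids have to be sorted first.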

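To follow this, you need to know how a composite index is laid out in the pages of an InnoDB B+ tree.
For example: seq int, id1 int, id2 int, where seq is the primary key and (id1, id2) is a composite B+ tree index.
Now insert these values: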
values(1,1,2)
values(2,1,3)
values(3,1,2)

The leaf entries of the composite index are then ordered as follows:

1       2       3
id1:1  id1:1  id1:1
id2:2  id2:2  id2:3
seq:1  seq:3  seq:2

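That is, entries are sorted by id1 first, then by id2, and finally by the primary key seq.
The resulting primary key order is 1 3 2, which is not sorted. Such a
result set clearly cannot be used as input to a merge, so it has to be sorted first. This is where
the "sort" in sort_union comes from.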
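
The leaf ordering can be reproduced with a few lines of Python. This is a simplified model that assumes a secondary index entry is just the tuple (id1, id2, seq) kept in sorted order; it says nothing about InnoDB's actual page format.

```python
# Rows as inserted in the example: (seq, id1, id2)
rows = [(1, 1, 2), (2, 1, 3), (3, 1, 2)]

# Model each secondary index entry as (id1, id2) followed by the primary key,
# and keep the entries sorted by that whole tuple.
leaf_entries = sorted((id1, id2, seq) for seq, id1, id2 in rows)

# Primary keys in the order an index scan on id1=1 would return them.
pk_order = [seq for _id1, _id2, seq in leaf_entries]

print(leaf_entries)  # [(1, 2, 1), (1, 2, 3), (1, 3, 2)]
print(pk_order)      # [1, 3, 2]
```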

The following demonstrates how the two execution plans differ.
Script:
create table testmer
(seq int,id1 int,id2 int,id3 int,id4 int,primary key(seq),key(id1,id2),key(id3,id4));

insert into testmer values(1,1,2,4,4);
insert into testmer values(2,1,3,4,5);
insert into testmer values(3,1,2,4,4);
insert into testmer values(4,2,4,5,6);
insert into testmer values(5,2,6,5,8);
insert into testmer values(6,2,10,5,3);
insert into testmer values(7,4,5,8,10);
insert into testmer values(8,0,1,3,4);

mysql> select * from testmer;
+-----+------+------+------+------+
| seq | id1  | id2  | id3  | id4  |
+-----+------+------+------+------+
|   1 |    1 |    2 |    4 |    4 |
|   2 |    1 |    3 |    4 |    5 |
|   3 |    1 |    2 |    4 |    4 |
|   4 |    2 |    4 |    5 |    6 |
|   5 |    2 |    6 |    5 |    8 |
|   6 |    2 |   10 |    5 |    3 |
|   7 |    4 |    5 |    8 |   10 |
|   8 |    0 |    1 |    3 |    4 |
+-----+------+------+------+------+

Using sort_union:
mysql> explain  select * from testmer force index(id1,id3) where id1=1 or id3=4;
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table   | partitions | type        | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | testmer | NULL       | index_merge | id1,id3       | id1,id3 | 5,5     | NULL |    6 |   100.00 | Using sort_union(id1,id3); Using where |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
1 row in set, 1 warning (5.07 sec)

Looking at key(id1,id2) alone is enough to see that a sort is needed here, because its entries are ordered as follows:
1       2       3
id1:1  id1:1   id1:1
id2:2  id2:2   id2:3
seq:1  seq:3  seq:2

If we supply all key parts of each secondary index:
mysql> explain  select * from testmer force index(id1,id3) where id1=1 and id2=2 or id3=4 and id4=1;
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+
| id | select_type | table   | partitions | type        | possible_keys | key     | key_len | ref  | rows | filtered | Extra                             |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+
|  1 | SIMPLE      | testmer | NULL       | index_merge | id1,id3       | id1,id3 | 10,10   | NULL |    2 |   100.00 | Using union(id1,id3); Using where |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+

No sort is needed here. Take id1=1 and id2=2 (the same holds for id3=4 and id4=1).
The entries are ordered as follows:
1         2      
id1:1   id1:1   
id2:2   id2:2 
seq:1  seq:3
In other words, when all key parts are supplied, the primary keys come out already sorted.
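
To tie the two plans together, here is a rough Python model of both queries over the testmer rows. It is a sketch under stated assumptions: index_scan and or_of_two_indexes are hypothetical helpers, and the rule "every key part has an equality condition" is a simplification of whatever check the optimizer really performs.

```python
import heapq

# Rows of testmer as (seq, id1, id2, id3, id4); seq is the primary key.
ROWS = [
    (1, 1, 2, 4, 4), (2, 1, 3, 4, 5), (3, 1, 2, 4, 4), (4, 2, 4, 5, 6),
    (5, 2, 6, 5, 8), (6, 2, 10, 5, 3), (7, 4, 5, 8, 10), (8, 0, 1, 3, 4),
]
COLS = {"seq": 0, "id1": 1, "id2": 2, "id3": 3, "id4": 4}


def index_scan(key_cols, conds):
    """Primary keys matching the equality conditions, in index order."""
    entries = sorted(
        tuple(row[COLS[c]] for c in key_cols) + (row[COLS["seq"]],)
        for row in ROWS
    )
    return [
        entry[-1]
        for entry in entries
        if all(entry[key_cols.index(c)] == v for c, v in conds.items())
    ]


def or_of_two_indexes(scans):
    """Hypothetical helper: choose union or sort_union, return (name, rowids)."""
    streams = [index_scan(key_cols, conds) for key_cols, conds in scans]
    # Simplified rule (assumption): rowid order is guaranteed only when every
    # key part of every index has an equality condition.
    if all(set(conds) == set(key_cols) for key_cols, conds in scans):
        name, ordered = "union", heapq.merge(*streams)
    else:
        name, ordered = "sort_union", sorted(r for s in streams for r in s)
    rowids = []
    for rowid in ordered:
        if not rowids or rowids[-1] != rowid:
            rowids.append(rowid)
    return name, rowids


ID1_ID2 = ("id1", "id2")
ID3_ID4 = ("id3", "id4")
print(or_of_two_indexes([(ID1_ID2, {"id1": 1}), (ID3_ID4, {"id3": 4})]))
# ('sort_union', [1, 2, 3])
print(or_of_two_indexes([(ID1_ID2, {"id1": 1, "id2": 2}),
                         (ID3_ID4, {"id3": 4, "id4": 1})]))
# ('union', [1, 3])
```

In the first query both scans return the primary keys as 1 3 2, so they must be sorted before duplicates can be removed. In the second, the scan on (id1, id2) returns 1 3 and the scan on (id3, id4) returns nothing, so a plain merge is enough.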

I actually ran this in a debug build, with a breakpoint set on Unique::unique_add:
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000ebd333 in main(int, char**) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/main.cc:25
        breakpoint already hit 1 time
6       breakpoint     keep y   0x000000000145de13 in Unique::unique_add(void*) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/uniques.h:52
        breakpoint already hit 2 times

Running select * from testmer force index(id1,id3) where id1=1 and id2=1 or id3=4 and id4=1;
did not hit Unique::unique_add, which means no sort was performed.

Finally, the source code interface for the sort_union sort:
QUICK_INDEX_MERGE_SELECT::read_keys_and_merge()
calls
Unique::unique_add
(which uses balanced binary trees; for how balanced binary trees and red-black trees differ, see:
http://blog.itpub.net/7728585/viewspace-2127419/
)

Below is the comment on read_keys_and_merge() in the source:
/*
  Perform key scans for all used indexes (except CPK), get rowids and merge 
  them into an ordered non-recurrent sequence of rowids.
  
  The merge/duplicate removal is performed using Unique class. We put all
  rowids into Unique, get the sorted sequence and destroy the Unique.
  
  If table has a clustered primary key that covers all rows (TRUE for bdb
  and innodb currently) and one of the index_merge scans is a scan on PK,
  then rows that will be retrieved by PK scan are not put into Unique and 
  primary key scan is not performed here, it is performed later separately.


  RETURN
    0     OK
    other error
*/
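
The behaviour that comment describes (put every rowid into a container, then read back a sorted sequence without duplicates) can be mimicked in Python. This is a toy stand-in, not the real Unique class: UniqueSketch and read_keys_and_merge_sketch are invented names, and a sorted Python list replaces the tree that the real code inserts into.

```python
import bisect


class UniqueSketch:
    """Toy stand-in for the role Unique plays: sorted, distinct rowids."""

    def __init__(self):
        self._rowids = []

    def unique_add(self, rowid):
        # Insert at the sorted position, skipping rowids already present.
        pos = bisect.bisect_left(self._rowids, rowid)
        if pos == len(self._rowids) or self._rowids[pos] != rowid:
            self._rowids.insert(pos, rowid)

    def sorted_rowids(self):
        return list(self._rowids)


def read_keys_and_merge_sketch(index_scans):
    """Feed the rowids of every index scan into one UniqueSketch."""
    unique = UniqueSketch()
    for scan in index_scans:
        for rowid in scan:
            unique.unique_add(rowid)
    return unique.sorted_rowids()


# Both scans of the sort_union query return the primary keys as 1 3 2.
print(read_keys_and_merge_sketch([[1, 3, 2], [1, 3, 2]]))  # [1, 2, 3]
```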


Below is the stack trace from my gdb session:
(gdb) bt
#0  tree_insert (tree=0x7fffd801c768, key=0x7fffd801ada0, key_size=0, custom_arg=0x7fffd80103d0) at /root/mysql5.7.14/percona-server-5.7.14-7/mysys/tree.c:207
#1  0x000000000145df19 in Unique::unique_add (this=0x7fffd801c260, ptr=0x7fffd801ada0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/uniques.h:56
#2  0x000000000178e6a8 in QUICK_INDEX_MERGE_SELECT::read_keys_and_merge (this=0x7fffd89083f0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:10700
#3  0x0000000001778c73 in QUICK_INDEX_MERGE_SELECT::reset (this=0x7fffd89083f0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:1601
#4  0x000000000155e529 in join_init_read_record (tab=0x7fffd8906e20) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:2471
#5  0x000000000155b6a1 in sub_select (join=0x7fffd8905b08, qep_tab=0x7fffd8906e20, end_of_records=false)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:1271
#6  0x000000000155b026 in do_select (join=0x7fffd8905b08) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:944
#7  0x0000000001558efc in JOIN::exec (this=0x7fffd8905b08) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:199
#8  0x00000000015f91c6 in handle_query (thd=0x7fffd8000df0, lex=0x7fffd80033d0, result=0x7fffd8007a60, added_options=0, removed_options=0)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:184
#9  0x00000000015ac025 in execute_sqlcom_select (thd=0x7fffd8000df0, all_tables=0x7fffd8006e98) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5391
#10 0x00000000015a4640 in mysql_execute_command (thd=0x7fffd8000df0, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:2889
#11 0x00000000015acff6 in mysql_parse (thd=0x7fffd8000df0, parser_state=0x7ffff0fd6600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836
#12 0x00000000015a0eb5 in dispatch_command (thd=0x7fffd8000df0, com_data=0x7ffff0fd6d70, command=COM_QUERY)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447
#13 0x000000000159fce6 in do_command (thd=0x7fffd8000df0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010
#14 0x00000000016e1c08 in handle_connection (arg=0x3c1c880) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/conn_handler/connection_handler_per_thread.cc:312
#15 0x0000000001d71ed0 in pfs_spawn_thread (arg=0x3bec1b0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/perfschema/pfs.cc:2188
#16 0x0000003ca62079d1 in start_thread () from /lib64/libpthread.so.0
#17 0x0000003ca5ee8b6d in clone () from /lib64/libc.so.6

Also attached are the function call traces for the two methods:
merge sort_union:

T@3: | | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT
T@3: | | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT 1589
T@3: | | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::init
T@3: | | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::init 1595
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::reset
T@3: | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::read_keys_and_merge
T@3: | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::read_keys_and_merge 10716
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::reset 1602
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | >QUICK_INDEX_MERGE_SELECT::~QUICK_INDEX_MERGE_SELECT
T@3: | | | | | | | <QUICK_INDEX_MERGE_SELECT::~QUICK_INDEX_MERGE_SELECT 1635


merge union:

T@3: | | | | | | | | | | >QUICK_ROR_UNION_SELECT::init
T@3: | | | | | | | | | | <QUICK_ROR_UNION_SELECT::init 1942
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::reset
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::reset 2004
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10948
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10948
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10913
T@3: | | | | | | | >QUICK_ROR_UNION_SELECT::~QUICK_ROR_UNION_SELECT
T@3: | | | | | | | <QUICK_ROR_UNION_SELECT::~QUICK_ROR_UNION_SELECT 2021



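The traces show the call paths. I looked at the source-level calls only to confirm that a sort really takes place and to see which sorting method is used.
This post reflects only my personal view. If anything is wrong, please point it out. Thanks!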

From "ITPUB Blog", link: http://blog.itpub.net/7728585/viewspace-2128759/. If you repost this, please credit the source; otherwise legal liability will be pursued.
