MYSQL INNODB主鍵使用varchar和int的區別

gaopengtttt發表於2017-07-11
原創,如果有誤請指出

今天同事討論關於主鍵使用varchar和int的區別。

我現在總結的3個問題:
1、tablespace中空間浪費
   當然我們知道使用varchar可能會導致輔助索引比較大,因為用到varchar可能儲存的字元較多,同時
   在行頭也存在一個可變欄位字元區域(1-2)位元組
   而輔助索引葉子結點畢竟都儲存了主鍵值,這樣至少會多varchar資料位元組數量+1(或者2) 位元組- 4(int)位元組空間。
   如果輔助索引比較多空間浪費是可想而知的。
2、輔助索引B+樹掃描效能
    由於輔助索引B+樹的空間要求更大,雖然在B+樹層次一般都是3層-4層,索引單值定位I/O消耗並不明顯,如果涉及到
    範圍查詢(比如PAGE_CUR_G),需要訪問的塊就更多,同時比如例如輔助索引的using index,需要訪問的塊自然
    更多
3、比較更加複雜
   innodb 在進行元組比較的時候,不管是DML,select都會涉及到元組的比較,同時回表的時候也涉及
   到比較操作。而varchar型別的比較比int型別更為複雜一些。
那麼我們就來分析第三個問題,第一個問題和第二個問題是顯而易見的。
我這裡資料庫字符集為latin1\latin1_swedish_ci

其實在innodb底層進行比較的時候都呼叫cmp_data這個函式
在innodb中有自己的定義的資料型別如下:

點選(此處)摺疊或開啟

  1. /*-------------------------------------------*/
  2. /* The 'MAIN TYPE' of a column */
  3. #define DATA_MISSING    0    /* missing column */
  4. #define    DATA_VARCHAR    1    /* character varying of the
  5. latin1_swedish_ci charset-collation; note
  6. that the MySQL format for this, DATA_BINARY,
  7. DATA_VARMYSQL, is also affected by whether the
  8. 'precise type' contains
  9. DATA_MYSQL_TRUE_VARCHAR */
  10. #define DATA_CHAR    2    /* fixed length character of the
  11. latin1_swedish_ci charset-collation */
  12. #define DATA_FIXBINARY    3    /* binary string of fixed length */
  13. #define DATA_BINARY    4    /* binary string */
  14. #define DATA_BLOB    5    /* binary large object, or a TEXT type;
  15. if prtype & DATA_BINARY_TYPE == 0, then this is
  16. actually a TEXT column (or a BLOB created
  17. with < 4.0.14; since column prefix indexes
  18. came only in 4.0.14, the missing flag in BLOBs
  19. created before that does not cause any harm) */
  20. #define    DATA_INT    6    /* integer: can be any size 1 - 8 bytes */
  21. #define    DATA_SYS_CHILD    7    /* address of the child page in node pointer */
  22. #define    DATA_SYS    8    /* system column */

我們熟悉的int型別屬於DATA_INT而varchar屬於DATA_VARCHAR,rowid屬於DATA_SYS
在函式cmp_data根據各種型別的不同進行了不同比較的方式,這裡就將int和varchar
判斷的方式進行說明:
1、innodb int型別比較
實際上是在cmp_data中進行了大概的方式如下

點選(此處)摺疊或開啟

  1. if (len) {
  2. #if defined __i386__ || defined __x86_64__ || defined _M_IX86 || defined _M_X64
  3. /* Compare the first bytes with a loop to avoid the call
  4. overhead of memcmp(). On x86 and x86-64, the GCC built-in
  5. (repz cmpsb) seems to be very slow, so we will be calling the
  6. libc version. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
  7. tracks the slowness of the GCC built-in memcmp().


  8. We compare up to the first 4..7 bytes with the loop.
  9. The (len & 3) is used for "normalizing" or
  10. "quantizing" the len parameter for the memcmp() call,
  11. in case the whole prefix is equal. On x86 and x86-64,
  12. the GNU libc memcmp() of equal strings is faster with
  13. len=4 than with len=3.


  14. On other architectures than the IA32 or AMD64, there could
  15. be a built-in memcmp() that is faster than the loop.
  16. We only use the loop where we know that it can improve
  17. the performance. */
  18. for (ulint i = 4 + (len & 3); i > 0; i--) {
  19. cmp = int(*data1++) - int(*data2++);
  20. if (cmp) {
  21. return(cmp);
  22. }


  23. if (!--len) {
  24. break;
  25. }
  26. }
  27. my_strnncollsp_simple


  28. if (len) {
  29. #endif /* IA32 or AMD64 */
  30. cmp = memcmp(data1, data2, len);


  31. if (cmp) {
  32. return(cmp);
  33. }


  34. data1 += len;
  35. data2 += len;
  36. #if defined __i386__ || defined __x86_64__ || defined _M_IX86 || defined _M_X64
  37. }
  38. #endif /* IA32 or AMD64 */
  39. }


  40. cmp = (int) (len1 - len2);


  41. if (!cmp || pad == ULINT_UNDEFINED) {
  42. return(cmp);
  43. }
可以看到整個方式比較簡潔,對於我們常用的x86_64模型並沒有直接使用memcpy進行而是
進行了最佳化在註釋中也有說明,才出現了for (ulint i = 4 + (len & 3); i > 0; i--)
部分,如果是IA32 or AMD64則直接使用memcpy進行比較。感興趣的可以仔細閱讀一下

2、innodb varchar型別比較
實際上這個比較會透過cmp_data->cmp_whole_field->my_strnncollsp_simple呼叫最終呼叫
my_strnncollsp_simple完成,而比如order by 會呼叫my_strnxfrm_simple他們都在一個
檔案中。
下面是整個my_strnncollsp_simple函式


點選(此處)摺疊或開啟

  1. /*
  2.   Compare strings, discarding end space


  3.   SYNOPSIS
  4.     my_strnncollsp_simple()
  5.     cs    character set handler
  6.     a    First string to compare
  7.     a_length    Length of 'a'
  8.     b    Second string to compare
  9.     b_length    Length of 'b'
  10.     diff_if_only_endspace_difference
  11.        Set to 1 if the strings should be regarded as different
  12.                         if they only difference in end space


  13.   IMPLEMENTATION
  14.     If one string is shorter as the other, then we space extend the other
  15.     so that the strings have equal length.


  16.     This will ensure that the following things hold:


  17.     "a" == "a "
  18.     "a\0" < "a"
  19.     "a\0" < "a "


  20.   RETURN
  21.     < 0    a < b
  22.     = 0    a == b
  23.     > 0    a > b
  24. */


  25. int my_strnncollsp_simple(const CHARSET_INFO *cs, const uchar *a,
  26.                           size_t a_length, const uchar *b, size_t b_length,
  27.                           my_bool diff_if_only_endspace_difference)
  28. {
  29.   const uchar *map= cs->sort_order, *end;
  30.   size_t length;
  31.   int res;


  32. #ifndef VARCHAR_WITH_DIFF_ENDSPACE_ARE_DIFFERENT_FOR_UNIQUE
  33.   diff_if_only_endspace_difference= 0;
  34. #endif


  35.   end= a + (length= MY_MIN(a_length, b_length));
  36.   while (a < end)
  37.   {
  38.     if (map[*a++] != map[*b++])
  39.       return ((int) map[a[-1]] - (int) map[b[-1]]);
  40.   }
  41.   res= 0;
  42.   if (a_length != b_length)
  43.   {
  44.     int swap= 1;
  45.     if (diff_if_only_endspace_difference)
  46.       res= 1; /* Assume 'a' is bigger */
  47.     /*
  48.       Check the next not space character of the longer key. If it's < ' ',
  49.       then it's smaller than the other key.
  50.     */
  51.     if (a_length < b_length)
  52.     {
  53.       /* put shorter key in s */
  54.       a_length= b_length;
  55.       a= b;
  56.       swap= -1; /* swap sign of result */
  57.       res= -res;
  58.     }
  59.     for (end= a + a_length-length; a < end ; a++)
  60.     {
  61.       if (map[*a] != map[' '])
  62. return (map[*a] < map[' ']) ? -swap : swap;
  63.     }
  64.   }
  65.   return res;
  66. }
其中*map= cs->sort_order比較關鍵這是記憶體中已經儲存好的字符集的順序,
迴圈進行
map[*a++] != map[*b++]
*a++和*b++ 會得到的字符集編碼,然後在整個排序好的字元陣列中找,
則得到了實際字符集編碼進行比較,不管是比較的複雜度還是需要比較的
長度 varchar很可能都遠遠大於int型別,下面是列印cs->sort_order這片
記憶體區域前128位元組得到的結果,
(gdb) x/128bx 0x258b000
0x258b000 :          0x00    0x01    0x02    0x03    0x04    0x05    0x06    0x07
0x258b008 :        0x08    0x09    0x0a    0x0b    0x0c    0x0d    0x0e    0x0f
0x258b010 :       0x10    0x11    0x12    0x13    0x14    0x15    0x16    0x17
0x258b018 :       0x18    0x19    0x1a    0x1b    0x1c    0x1d    0x1e    0x1f
0x258b020 :       0x20    0x21    0x22    0x23    0x24    0x25    0x26    0x27
0x258b028 :       0x28    0x29    0x2a    0x2b    0x2c    0x2d    0x2e    0x2f
0x258b030 :       0x30    0x31    0x32    0x33    0x34    0x35    0x36    0x37
0x258b038 :       0x38    0x39    0x3a    0x3b    0x3c    0x3d    0x3e    0x3f
0x258b040 :       0x40    0x41    0x42    0x43    0x44    0x45    0x46    0x47
0x258b048 :       0x48    0x49    0x4a    0x4b    0x4c    0x4d    0x4e    0x4f
0x258b050 :       0x50    0x51    0x52    0x53    0x54    0x55    0x56    0x57
0x258b058 :       0x58    0x59    0x5a    0x5b    0x5c    0x5d    0x5e    0x5f
0x258b060 :       0x60    0x41    0x42    0x43    0x44    0x45    0x46    0x47
0x258b068 :      0x48    0x49    0x4a    0x4b    0x4c    0x4d    0x4e    0x4f
0x258b070 :      0x50    0x51    0x52    0x53    0x54    0x55    0x56    0x57
0x258b078 :      0x58    0x59    0x5a    0x7b    0x7c    0x7d    0x7e    0x7f
而從記憶體的地址0x258b000我們也能看到他確實是存在於堆記憶體空間中,它是一片堆記憶體區域。

下面是varchar比較的呼叫棧幀以備後用

點選(此處)摺疊或開啟

  1. #0 my_strnncollsp_simple (cs=0x2d4b9c0, a=0x7fff57a71f93 "gaopeng", a_length=7, b=0x7fffbd7e807f "gaopeng", b_length=7, diff_if_only_endspace_difference=0 '\000')
  2.     at /root/mysql5.7.14/percona-server-5.7.14-7/strings/ctype-simple.c:165
  3. #1 0x0000000001ab8ec2 in cmp_whole_field (mtype=1, prtype=524303, a=0x7fff57a71f93 "gaopeng", a_length=7, b=0x7fffbd7e807f "gaopeng", b_length=7)
  4.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:374
  5. #2 0x0000000001aba827 in cmp_data (mtype=1, prtype=524303, data1=0x7fff57a71f93 "gaopeng", len1=7, data2=0x7fffbd7e807f "gaopeng", len2=7)
  6.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:468
  7. #3 0x0000000001ab9a05 in cmp_dtuple_rec_with_match_bytes (dtuple=0x7fff48ed3280, rec=0x7fffbd7e807f "gaopeng", index=0x7fff48ec78a0, offsets=0x7fff57a6bc50,
  8.     matched_fields=0x7fff57a6bf80, matched_bytes=0x7fff57a6bf78) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:880
  9. #4 0x0000000001a87fe2 in page_cur_search_with_match_bytes (block=0x7fffbcceafc0, index=0x7fff48ec78a0, tuple=0x7fff48ed3280, mode=PAGE_CUR_GE,
  10.     iup_matched_fields=0x7fff57a6cdf8, iup_matched_bytes=0x7fff57a6cdf0, ilow_matched_fields=0x7fff57a6cde8, ilow_matched_bytes=0x7fff57a6cde0, cursor=0x7fff57a713f8)
  11.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/page/page0cur.cc:850
  12. #5 0x0000000001c17a3e in btr_cur_search_to_nth_level (index=0x7fff48ec78a0, level=0, tuple=0x7fff48ed3280, mode=PAGE_CUR_GE, latch_mode=1, cursor=0x7fff57a713f0,
  13.     has_search_latch=0, file=0x2336938 "/root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc", line=5744, mtr=0x7fff57a70ee0)
  14.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:1478
  15. #6 0x0000000001c222bf in btr_estimate_n_rows_in_range_low (index=0x7fff48ec78a0, tuple1=0x7fff48ed3280, mode1=PAGE_CUR_GE, tuple2=0x7fff48ed32e0, mode2=PAGE_CUR_G,
  16.     nth_attempt=1) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:5744
  17. #7 0x0000000001c22a09 in btr_estimate_n_rows_in_range (index=0x7fff48ec78a0, tuple1=0x7fff48ed3280, mode1=PAGE_CUR_GE, tuple2=0x7fff48ed32e0, mode2=PAGE_CUR_G)
  18.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:6044
  19. #8 0x00000000019b3e0e in ha_innobase::records_in_range (this=0x7fff48e7e3b0, keynr=1, min_key=0x7fff57a71680, max_key=0x7fff57a716a0)
  20.     at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:13938
  21. #9 0x0000000000f6ed5b in handler::multi_range_read_info_const (this=0x7fff48e7e3b0, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges_arg=0,
  22.     bufsz=0x7fff57a71780, flags=0x7fff57a71784, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:6440
  23. #10 0x0000000000f70662 in DsMrr_impl::dsmrr_info_const (this=0x7fff48e7e820, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges=0,
  24.     bufsz=0x7fff57a71d70, flags=0x7fff57a71d74, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:7112
  25. #11 0x00000000019be22f in ha_innobase::multi_range_read_info_const (this=0x7fff48e7e3b0, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges=0,
  26.     bufsz=0x7fff57a71d70, flags=0x7fff57a71d74, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:21351
  27. #12 0x000000000178c9e4 in check_quick_select (param=0x7fff57a71e30, idx=0, index_only=false, tree=0x7fff48e700e0, update_tbl_stats=true, mrr_flags=0x7fff57a71d74,
  28.     bufsize=0x7fff57a71d70, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:10030
  29. #13 0x0000000001783305 in get_key_scans_params (param=0x7fff57a71e30, tree=0x7fff48e70058, index_read_must_be_used=false, update_tbl_stats=true,
  30.     cost_est=0x7fff57a74190) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:5812
  31. #14 0x000000000177ce43 in test_quick_select (thd=0x7fff4801f4d0, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false,
  32.     interesting_order=st_order::ORDER_NOT_RELEVANT, tab=0x7fff48eacf20, cond=0x7fff48eacd50, needed_reg=0x7fff48eacf60, quick=0x7fff57a744c8)
  33.     at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:3066
  34. #15 0x000000000158b9bc in get_quick_record_count (thd=0x7fff4801f4d0, tab=0x7fff48eacf20, limit=18446744073709551615)
  35.     at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5942
  36. #16 0x000000000158b073 in JOIN::estimate_rowcount (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5689
  37. #17 0x00000000015893b5 in JOIN::make_join_plan (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5046
  38. #18 0x000000000157d9b7 in JOIN::optimize (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:387
  39. #19 0x00000000015fab71 in st_select_lex::optimize (this=0x7fff48aa45c0, thd=0x7fff4801f4d0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:1009
  40. #20 0x00000000015f9284 in handle_query (thd=0x7fff4801f4d0, lex=0x7fff48021ab0, result=0x7fff48aa5dc8, added_options=0, removed_options=0)
  41.     at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:164
  42. #21 0x00000000015ac159 in execute_sqlcom_select (thd=0x7fff4801f4d0, all_tables=0x7fff48aa54b8) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5391
  43. #22 0x00000000015a4774 in mysql_execute_command (thd=0x7fff4801f4d0, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:2889
  44. #23 0x00000000015ad12a in mysql_parse (thd=0x7fff4801f4d0, parser_state=0x7fff57a76600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836
  45. #24 0x00000000015a0fe9 in dispatch_command (thd=0x7fff4801f4d0, com_data=0x7fff57a76d70, command=COM_QUERY)
  46.     at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447
  47. #25 0x000000000159fe1a in do_command (thd=0x7fff4801f4d0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010
  48. #26 0x00000000016e1d6c in handle_connection (arg=0x6320740) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/conn_handler/connection_handler_per_thread.cc:312
  49. ---Type <return> to continue, or q <return> to quit---
  50. #27 0x0000000001d723f4 in pfs_spawn_thread (arg=0x6320530) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/perfschema/pfs.cc:2188
  51. #28 0x0000003ca62079d1 in start_thread () from /lib64/libpthread.so.0
  52. #29 0x0000003ca5ee8b6d in clone () from /lib64/libc.so.6

作者微信:

MYSQL INNODB主鍵使用varchar和int的區別

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7728585/viewspace-2141914/,如需轉載,請註明出處,否則將追究法律責任。

相關文章