MySQL · 最佳實踐 · 如何索引 JSON 欄位 (阿里雲 RDS-資料庫核心組 )

cyd622發表於2019-04-11

概述

MySQL從5.7.8起開始支援JSON欄位,這極大的豐富了MySQL的資料型別。也方便了廣大開發人員。但MySQL並沒有提供對JSON物件中的欄位進行索引的功能,至少沒有直接對其欄位進行索引的方法。本文將介紹利用MySQL 5.7中的虛擬欄位的功能來對JSON物件中的欄位進行索引。

示例資料

我們將基於下面的JSON物件進行演示

{
    "id": 1,  
    "name": "Sally",  
    "games_played":{    
       "Battlefield": {
          "weapon": "sniper rifle",
          "rank": "Sergeant V",
          "level": 20
        },                                                                                                                          
       "Crazy Tennis": {
          "won": 4,
          "lost": 1
        },  
       "Puzzler": {
          "time": 7
        }
     }
 }

表的基本結構

CREATE TABLE `players` (  
    `id` INT UNSIGNED NOT NULL,
    `player_and_games` JSON NOT NULL,
    PRIMARY KEY (`id`)
);

如果只是基於上面的表的結構我們是無法對JSON欄位中的Key進行索引的。接下來我們演示如何藉助虛擬欄位對其進行索引

增加虛擬欄位

虛擬列語法如下

<type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ]
[ UNIQUE [KEY] ] [ [PRIMARY] KEY ] [ NOT NULL ] [ COMMENT <text> ]

在MySQL 5.7中,支援兩種Generated Column,即Virtual Generated ColumnStored Generated Column,前者只將Generated Column儲存在資料字典中(表的後設資料),並不會將這一列資料持久化到磁碟上;後者會將Generated Column持久化到磁碟上,而不是每次讀取的時候計算所得。很明顯,後者存放了可以透過已有資料計算而得的資料,需要更多的磁碟空間,與Virtual Column相比並沒有優勢,因此,MySQL 5.7中,不指定Generated Column的型別,預設是Virtual Column。

如果需要Stored Generated Golumn的話,可能在Virtual Generated Column上建立索引更加合適,一般情況下,都使用Virtual Generated Column,這也是MySQL預設的方式

加完虛擬列的建表語句如下:

CREATE TABLE `players` (  
   `id` INT UNSIGNED NOT NULL,
   `player_and_games` JSON NOT NULL,
   `names_virtual` VARCHAR(20) GENERATED ALWAYS AS (`player_and_games` ->> '$.name') NOT NULL, 
   PRIMARY KEY (`id`)
);

Note: 利用運算子-» 來引用JSON欄位中的KEY。在本例中欄位names_virtual為虛擬欄位,我把它定義成不可以為空。在實際的工作中,一定要集合具體的情況來定。因為JSON本身是一種弱結構的資料物件。也就是說的它的結構不是固定不變的。

我們插入資料

INSERT INTO `players` (`id`, `player_and_games`) VALUES (1, '{  
    "id": 1,  
    "name": "Sally",
    "games_played":{    
       "Battlefield": {
          "weapon": "sniper rifle",
          "rank": "Sergeant V",
          "level": 20
        },                                                                                                                          
       "Crazy Tennis": {
          "won": 4,
          "lost": 1
        },  
       "Puzzler": {
          "time": 7
        }
      }
   }'
);

檢視錶裡的資料

SELECT * FROM `players`;

+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
| id | player_and_games                                                                                                                                                                                           | names_virtual |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+
|  1 | {"id": 1, "name": "Sally", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Sergeant V", "level": 20, "weapon": "sniper rifle"}, "Crazy Tennis": {"won": 4, "lost": 1}}}                  | Sally         |
|  2 | {"id": 2, "name": "Thom", "games_played": {"Puzzler": {"time": 25}, "Battlefield": {"rank": "Major General VIII", "level": 127, "weapon": "carbine"}, "Crazy Tennis": {"won": 10, "lost": 30}}}            | Thom          |
|  3 | {"id": 3, "name": "Ali", "games_played": {"Puzzler": {"time": 12}, "Battlefield": {"rank": "First Sergeant II", "level": 37, "weapon": "machine gun"}, "Crazy Tennis": {"won": 30, "lost": 21}}}           | Ali           |
|  4 | {"id": 4, "name": "Alfred", "games_played": {"Puzzler": {"time": 10}, "Battlefield": {"rank": "Chief Warrant Officer Five III", "level": 73, "weapon": "pistol"}, "Crazy Tennis": {"won": 47, "lost": 2}}} | Alfred        |
|  5 | {"id": 5, "name": "Phil", "games_played": {"Puzzler": {"time": 7}, "Battlefield": {"rank": "Lt. Colonel III", "level": 98, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 130, "lost": 75}}}          | Phil          |
|  6 | {"id": 6, "name": "Henry", "games_played": {"Puzzler": {"time": 17}, "Battlefield": {"rank": "Captain II", "level": 87, "weapon": "assault rifle"}, "Crazy Tennis": {"won": 68, "lost": 149}}}             | Henry         |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+

檢視錶Players的欄位

SHOW COLUMNS FROM `players`;

+------------------+------------------+------+-----+---------+-------------------+
| Field            | Type             | Null | Key | Default | Extra             |
+------------------+------------------+------+-----+---------+-------------------+
| id               | int(10) unsigned | NO   | PRI | NULL    |                   |
| player_and_games | json             | NO   |     | NULL    |                   |
| names_virtual    | varchar(20)      | NO   |     | NULL    | VIRTUAL GENERATED |
+------------------+------------------+------+-----+---------+-------------------+

我們看到虛擬欄位names_virtual的型別是VIRTUAL GENERATED。MySQL只是在資料字典裡儲存該欄位後設資料,並沒有真正的儲存該欄位的值。這樣表的大小並沒有增加。我們可以利用索引把這個欄位上的值進行物理儲存。

在虛擬欄位上加索引

再新增索引之前,讓我們先看下面查詢的執行計劃

EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G  
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: players
   partitions: NULL
         type: ALL
possible_keys: NULL  
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6
     filtered: 16.67
        Extra: Using where

新增索引

CREATE INDEX `names_idx` ON `players`(`names_virtual`);  

再執行上面的查詢語句,我們將得到不一樣的執行計劃

EXPLAIN SELECT * FROM `players` WHERE `names_virtual` = "Sally"\G  
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: players
   partitions: NULL
         type: ref
possible_keys: names_idx  
          key: names_idx
      key_len: 22
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

如我們所見,最新的執行計劃走了新建的索引。

小結

本文介紹瞭如何在MySQL 5.7中儲存JSON文件。為了高效的檢索JSON中內容,我們可以利用5.7的虛擬欄位來對JSON的不同的KEY來建索引。極大的提高檢索的速度。


本文轉自 阿里雲RDS-資料庫核心組
再一次感謝您花費時間閱讀,祝您在這裡記錄、閱讀、分享愉快!

本作品採用《CC 協議》,轉載必須註明作者和本文連結

相關文章