Mysql: 利用強制索引去掉重資料

呆呆笨笨的魚發表於2015-01-21
Mysql: 利用強制索引去掉重資料


目的:把a b表的資料去掉重複的資料後放到c表中

資料庫版本:
[root@mysqltest ~]# mysql -u root -p123456
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 389805
Server version: 5.1.73-community MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| cacti              |
| centreon           |
| centreon_status    |
| centreon_storage   |
| mysql              |
| syslog             |
| test               |
+--------------------+
8 rows in set (0.01 sec)



1 .建立實驗使用的a b表
mysql> use test;
Database changed
mysql> show tables;
Empty set (0.00 sec)

mysql> create table a (a1 char(10), a2 char(10), a3 char(10));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into a values ('1', '2', '3');
Query OK, 1 row affected (0.00 sec)

mysql> insert into a select * from a;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0





mysql> insert into a select * from a;
Query OK, 4194304 rows affected (7.85 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

mysql> insert into a select * from a;
Query OK, 8388608 rows affected (27.81 sec)
Records: 8388608  Duplicates: 0  Warnings: 0

2  建立b表:

mysql> create table b (b1 char (10), b2 char(10), b3 char(10));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into b select * from a;
Query OK, 16777216 rows affected (1 min 6.18 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

mysql>
mysql>
mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>
mysql> select * from b limit 10;
+------+------+------+
| b1   | b2   | b3   |
+------+------+------+
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
+------+------+------+
10 rows in set (0.04 sec)

3  加入b表不同樣的資料

mysql> insert into b values ('4','5','6');
Query OK, 1 row affected (0.01 sec)

mysql> insert into b values ('4','5','6');
Query OK, 1 row affected (0.00 sec)

mysql> insert into b values ('4','5','6');
Query OK, 1 row affected (0.00 sec)

mysql> insert into b values ('4','5','6');
Query OK, 1 row affected (0.00 sec)

mysql>
mysql> insert into b values ('4','5','6');
Query OK, 1 row affected (0.00 sec)


mysql> commit
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>
mysql>


4 檢視a b表資料行數

mysql> select count(1) from b;
+----------+
| count(1) |
+----------+
| 16777224 |
+----------+
1 row in set (0.00 sec)

mysql> select count(1) from a;
+----------+
| count(1) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

5 建立c表

mysql> create table c (c1 char (10), c2 char(10), c3 char(10));
Query OK, 0 rows affected (0.31 sec)



6 建立臨時temp表

mysql> create table temp select * from c where 1=2;
Query OK, 0 rows affected (0.06 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select * from temp;
Empty set (0.00 sec)

mysql> desc temp;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1    | char(10) | YES  |     | NULL    |       |
| c2    | char(10) | YES  |     | NULL    |       |
| c3    | char(10) | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+
3 rows in set (0.02 sec)



7 為b表建立索引
mysql> create index ind_b_b1 on b(b1);
Query OK, 16777224 rows affected (2 min 9.14 sec)
Records: 16777224  Duplicates: 0  Warnings: 0

mysql> desc b;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| b1    | char(10) | YES  | MUL | NULL    |       |
| b2    | char(10) | YES  |     | NULL    |       |
| b3    | char(10) | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+
3 rows in set (0.02 sec)


8 把a b表資料插入temp表中

mysql> insert into temp select * from a;
Query OK, 16777216 rows affected (29.84 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

mysql> insert into temp select * from b;
Query OK, 16777224 rows affected (59.79 sec)
Records: 16777224  Duplicates: 0  Warnings: 0

mysql>
mysql>
mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> select count(1) from temp;
+----------+
| count(1) |
+----------+
| 33554440 |
+----------+
1 row in set (0.00 sec)



9 建立聯合索引      強制索引去掉重複資料

mysql> create index ind_temp_c123 on temp(c1, c2, c3);

Query OK, 33554440 rows affected (6 min 57.80 sec)
Records: 33554440  Duplicates: 0  Warnings: 0

mysql>
mysql>
mysql> explain select c1, c2, max(c3) from temp force index (ind_temp_c123) group by c1, c2;
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key           | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | temp  | range | NULL          | ind_temp_c123 | 22      | NULL |    3 | Using index for group-by |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> select count(*) from temp;
+----------+
| count(*) |
+----------+
| 33554440 |
+----------+
1 row in set (0.00 sec)

mysql> explain select c1, c2, c3 from temp force index (ind_temp_c123) group by c1, c2;
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key           | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | temp  | range | NULL          | ind_temp_c123 | 22      | NULL |    3 | Using index for group-by |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> insert into c select c1, c2, max(c3) from temp force index (ind_temp_c123) group by c1, c2;
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0


10 去重複後c表的資料

mysql> select * from c;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| 1    | 2    | 3    |
| 4    | 5    | 6    |
+------+------+------+
2 rows in set (0.00 sec)
 
mysql>
mysql>
mysql> select * from temp order by c1 desc limit 10;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 4    | 5    | 6    |
| 1    | 2    | 3    |
| 1    | 2    | 3    |
+------+------+------+
10 rows in set (0.00 sec)

mysql> Ctrl-C -- exit!
Aborted

11 刪除表 temp
 mysql> drop table temp;
Query OK, 0 rows affected (1.59 sec)

mysql> drop table a;
Query OK, 0 rows affected (0.55 sec)

mysql> drop table b;
Query OK, 0 rows affected (0.73 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| c              |
+----------------+
1 row in set (0.00 sec)

mysql>

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/28929558/viewspace-1409011/,如需轉載,請註明出處,否則將追究法律責任。

相關文章