30分鐘SQL指南

shanyue發表於2019-03-22

原文網址 : https://juejin.im/post/5c94a2e1f265da610762f713

本篇文章是 SQL 必知必會 的讀書筆記，SQL必知必會的英文名叫做 Sams Teach Yourself in 10 Minutes 。但是，我肯定是不能夠在10分鐘就能學會本書所有涉及到的sql，所以就起個名字叫30分鐘學會SQL語句（其實半個小時也沒有學會...）。

目前手邊的資料庫是 mysql，所以以下示例均是由 mysql 演示。由於現在大部分工具都支援語法高亮，所以以下關鍵字都使用小寫。

原文連結見 shanyue.tech/post/sql-gu…

準備

工具

mycli，一個使用python編寫的終端工具，支援語法高亮，自動補全，多行模式，並且如果你熟悉vi的話，可以使用vi-mode快速移動，編輯。總之，vi + mycli 簡直是神器！

同樣，postgreSQL 可以使用pgcli。

pip install -U mycli    # 預設你已經安裝了pip
複製程式碼

樣例表

示例中有兩個表，分為 student 學生表與 class 班級表。student 表中有 class_id 關聯 class 表。以下是兩個表資料的 sql。另外，最後有三道小練習題會用到樣例表。

create table class (
  id int(11) not null auto_increment comment '班級id',
  name varchar(50) not null comment '班級名',
  primary key (id)
) comment '班級表';

create table student (
  id int(11) not null auto_increment comment '學生id',
  name varchar(50) not null comment '學生姓名',
  age tinyint unsigned default 20 comment '學生年齡',
  sex enum('male', 'famale') comment '性別',
  score tinyint comment '入學成績',
  class_id int(11) comment '班級',
  createTime timestamp default current_timestamp comment '建立時間',
  primary key (id),
  foreign key (class_id) references class (id)
) comment '學生表';

insert into class (name) values ('軟體工程'), ('市場營銷');

insert into student (name, age, sex, score, class_id) values ('張三', 21, 'male', 100, 1);
insert into student (name, age, sex, score, class_id) values ('李四', 22, 'male', 98, 1);
insert into student (name, age, sex, score, class_id) values ('王五', 22, 'male', 99, 1);
insert into student (name, age, sex, score, class_id) values ('燕七', 21, 'famale', 34, 2);
insert into student (name, age, sex, score, class_id) values ('林仙兒', 23, 'famale', 78, 2);
複製程式碼

SQL 基礎

術語

Database

資料庫值一系列有關聯資料的集合，而操作和管理這些資料的是DBMS，包括MySQL，PostgreSQL，MongoDB，Oracle，SQLite等等。 RDBMS 是基於關係模型的資料庫，使用 SQL 管理和操縱資料。另外也有一些 NoSQL 資料庫，比如 MongoDB。因為NoSQL為非關係型資料庫，一般不支援join操作，因此會有一些非正則化(denormalization)的資料，查詢也比較快。
Table

具有特定屬性的結構化檔案。比如學生表，學生屬性有學號，年齡，性別等。schema (模式) 用來描述這些資訊。 NoSQL 不需要固定列，一般沒有 schema，同時也利於垂直擴充套件。
Column

表中的特定屬性，如學生的學號，年齡。每一列都具有資料型別。
Data Type

每一列都具有資料型別，如 char, varchar，int，text，blob, datetime，timestamp。根據資料的粒度為列選擇合適的資料型別，避免無意義的空間浪費。如下有一些型別對比
- char, varchar 需要儲存資料的長度方差小的時候適合儲存char，否則varchar。 varchar 會使用額外長度儲存字串長度，佔用儲存空間較大。兩者對字串末尾的空格處理的策略不同，不同的DBMS又有不同的策略，設計資料庫的時候應當注意到這個區別。
- datetime, timestamp datetime 儲存時間範圍從1001年到9999年。 timestamp 儲存了自1970年1月1日的秒數，因為儲存範圍比較小，自然儲存空間佔用也比較小。日期型別可以設定更新行時自動更新日期，建議日期時間型別根據精度儲存為這兩個型別。如今 DBMS 能夠儲存微秒級別的精度，比如 mysql 預設儲存精度為秒，但可以指定到微秒級別，即小數點後六位小數
- enum 對於一些固定，不易變動的狀態碼建議儲存為 enum 型別，具有更好的可讀性，更小的儲存空間，並且可以保證資料有效性。
插一個小問題：如何儲存IP地址
Row

資料表的每一行記錄。如學生張三。

檢索資料

-- 檢索單列
select name from student;

-- 檢索多列
select name, age, class from student;

-- 檢索所有列
select * from student;

-- 對某列去重
select distinct class from student;

-- 檢索列-選擇區間
-- offset 基數為0，所以 `offset 1` 代表從第2行開始
select * from student limit 1, 10;
select * from student limit 10 offset 1;
複製程式碼

排序

預設排序是 ASC，所以一般升序的時候不需指定，降序的關鍵字是 DESC。使用 B-Tree 索引可以提高排序效能，但只限最左匹配。關於索引可以檢視以下 FAQ。

-- 根據學號降序排列
select * from student order by number desc;

-- 新增索引 (score, name) 可以提高排序效能
-- 但是索引 (name, score) 對效能毫無幫助，此謂最左匹配，可以根據 B+Tree 進行理解
select * from student order by score desc, name;
複製程式碼

資料過濾

資料篩選，或者資料過濾在 sql 中使用頻率最高

-- 找到學號為1的學生
select * from student where number = 1;

-- 找到學號為在 [1, 10] 的學生(閉區間)
select * from student where number between 1 and 10;

-- 找到未設定電子郵箱的學生
-- 注意不能使用 =
select * from student where email is null;

-- 找到一班中大於23歲的學生
select * from student where class_id = 1 and age > 23;

-- 找到一班或者大於23歲的學生
select * from student where class_id = 1 or age > 22;

-- 找到一班與二班的學生
select * from student where class_id in (1, 2);

-- 找到不是一班二班的學生
select * from student where class_id not in (1, 2);
複製程式碼

計算欄位

CONCAT

select concat(name, '(', age, ')') as nameWithAge from student;

select concat('hello', 'world') as helloworld;
複製程式碼

Math

select age - 18 as relativeAge from student;

select 3 * 4 as n;
複製程式碼

更多函式可以檢視 API 手冊，同時也可以自定義函式(User Define Function)。

可以直接使用 select 呼叫函式

select now();
select concat('hello', 'world');
複製程式碼

資料聚合 (aggregation)

聚合函式，一些對資料進行彙總的函式，常見有 COUNT，MIN，MAX，AVG，SUM 五種。

-- 統計1班人數
select count(*) from student where class_id = 1;
複製程式碼

資料分組

使用 group by 進行資料分組，可以使用聚合函式對分組資料進行彙總，使用 having 對分組資料進行篩選。

-- 按照班級進行分組並統計各班人數
select class_id, count(*) from student group by class_id;

-- 列出大於三個學生的班級
select class_id, count(*) as cnt from student group by class_id having cnt > 3;
複製程式碼

子查詢

-- 列出軟體工程班級中的學生
select * from student where class_id in (
  select id from class where name = '軟體工程'
);
複製程式碼

聯接

雖然兩個表擁有公共欄位便可以建立聯接，但是使用外來鍵可以更好地保證資料完整性。比如當對一個學生插入一條不存在的班級的時候，便會插入失敗。一般來說，聯接比子查詢擁有更好的效能。

-- 列出軟體工程班級中的學生
select * from student, class
where student.class_id = class.id and class.name = '軟體工程';
複製程式碼

內聯接

內聯接又叫等值聯接。

-- 列出軟體工程班級中的學生
select * from student
inner join class on student.class_id = class.id
where class.name = '軟體工程';
複製程式碼

自聯接

自連線就是相同的表進行聯接

-- 列出與張三同一班級的學生
select * from student s1
inner join student s2 on s1.class_id = s2.class_id
where s1.name = '張三';
複製程式碼

外聯接

外聯接分為 left join 與 right join，left join 指左側永不會為 null，right join 指右側永不會為 null。

-- 列出每個學生的班級，若沒有班級則為null
select name, class.name from student
left join class on student.class_id = class.id;
複製程式碼

插入資料

使用 insert into 向表中插入資料，也可以插入多行。

插入時可以不指定列名，不過嚴重依賴表中列的順序關係，推薦指定列名插入資料，並且可以插入部分列。

-- 插入一條資料
insert into student values(8, '陸小鳳', 24, 1, 3);

insert into student(name, age, sex, class_id) values(9, '花無缺', 25, 1, 3);
複製程式碼

修改資料

在修改重要資料時，務必先 select 確認是否需要運算元據，然後 begin 方便及時 rollback

更新

-- 修改張三的班級
update student set class_id = 2 where name = '張三';
複製程式碼

刪除

-- 刪除張三的資料
delete from student where name = '張三';

-- 刪除表中所有資料
delete from student;

-- 更快地刪除表中所有資料
truncate table student;
複製程式碼

建立表與更新表

-- 建立學生表，注意新增必要的註釋
create table student (
  id int(11) not null auto_increment comment '學生id',
  name varchar(50) not null comment '學生姓名',
  age tinyint unsigned default 20 comment '學生年齡',
  sex enum('male', 'famale') comment '性別',
  score tinyint comment '入學成績',
  class_id int(11) comment '班級',
  createTime timestamp default current_timestamp comment '建立時間',
  primary key (id),
  foreign key (class_id) references class (id)
) comment '學生表';

-- 根據舊錶建立新表
create table student_copy as select * from student;

-- 刪除 age 列
alter table student drop column age;

-- 新增 age 列
alter table student add column age smallint;

-- 刪除學生表
drop table student;
複製程式碼

檢視

檢視是一種虛擬的表，便於更好地在多個表中檢索資料，檢視也可以作寫操作，不過最好作為只讀。在需要多個表聯接的時候可以使用檢視。

create view v_student_with_classname as
select student.name name, class.name class_name
from student left join class
where student.class_id = class.id;

select * from v_student_with_classname; 
複製程式碼

約束

primiry key

任意兩行絕對沒有相同的主鍵，且任一行不會有兩個主鍵且主鍵絕不為空。使用主鍵可以加快索引。
```
alter table student add constraint primary key (id);
複製程式碼
```
foreign key

外來鍵可以保證資料的完整性。有以下兩種情況。
- 插入張三丰5班到student表中會失敗，因為5班在class表中不存在。
- class表刪除3班會失敗，因為陸小鳳和楚留香還在3班。
```
alter table student add constraint foreign key (class_id) references class (id);
複製程式碼
```

unique key

唯一索引保證該列值是唯一的，但可以允許有null。

alter table student add constraint unique key (name);
複製程式碼

check

檢查約束可以使列滿足特定的條件，如果學生表中所有的人的年齡都應該大於0。

不過很可惜mysql不支援，可以使用觸發器代替
```
alter table student add constraint check (age > 0);
複製程式碼
```

index

索引可以更快地檢索資料，但是降低了更新操作的效能。

create index index_on_student_name on student (name);

alter table student add constraint key(name);
複製程式碼

觸發器

開發過程中從來沒有使用過，有可能是我經驗少

可以在插入，更新，刪除行的時候觸發事件。

場景:

資料約束，比如學生的年齡必須大於0
hook，提供資料庫級別的 hook

-- 建立觸發器
-- 比如mysql中沒有check約束，可以使用建立觸發器，當插入資料小於0時，置為0。
create trigger reset_age before insert on student for each row
begin
  if NEW.age < 0 then
    set NEW.age = 0;
  end if;
end;

-- 列印觸發器列表
show triggers;
複製程式碼

儲存過程

開發過程中從來沒有使用過，有可能是我經驗少

儲存過程可以視為一個函式，根據輸入執行一系列的 sql 語句。儲存過程也可以看做對一系列資料庫操作的封裝，一定程度上可以提高資料庫的安全性。

-- 建立儲存過程
create procedure create_student(name varchar(50))
begin
  insert into students(name) values (name);
end;

-- 呼叫儲存過程
call create_student('shanyue');
複製程式碼

SQL 實踐

更多練習可以檢視 leetcode

1. 根據班級學生的分數進行排名，如果分數相等則為同一名次

select id, name, score, (
  select count(distinct score) from student s2 where s2.score >= s1.score
) as rank
from student s1 order by s1.score desc;
複製程式碼

在where以及排序中經常用到的欄位需要新增Btree索引，因此 score 上可以新增索引。

Result:

id	name	score	rank
1	張三	100	1
3	王五	99	2
2	李四	98	3
5	林仙兒	78	4
4	燕七	34	5

參考 leetcode: rank-scores

2. 寫一個函式，獲取第 N 高的分數

create function getNthHighestScore(N int) return int
begin
  declare M int default N-1;
  return (
    select distinct score from student order by score desc limit M, 1;
  )
end;

select getNthHighestScore(2);
複製程式碼

Result:

getNthHighestScore(2)
99

參考 leetcode: nth highset salary

3. 檢索每個班級分數前兩名學生，並顯示排名

select class.id class_id, class.name class_name, s.name student_name, score, rank
from (
  select *, (
    select count(distinct score) from student s2 where s2.score >= s1.score and s2.class_id = s1.class_id
  ) as rank from student s1
) as s left join class on s.class_id = class.id where rank <= 2;

-- 如果不想在from中包含select子句，也可以像如下檢索，不過不顯示排名
select class.id class_id, class.name class_name, s1.name name, score
from student s1 left join class on s1.class_id = class.id
where (select count(*) from student s2 where s2.class_id = s1.class_id and s1.score <= s2.score) <= 2
order by s1.class_id, score desc;
複製程式碼

Result:

class_name	student_name	score	rank
軟體工程	張三	100	1
軟體工程	王五	99	2
市場營銷	燕七	34	2
市場營銷	林仙兒	78	1

FAQ

大多根據 stackoverflow 中瀏覽最多的問題整理而成。

`inner join` 與 `outer join` 的區別是什麼

參考 StackOverflow: what is the difference between inner join and outer join

如何根據一個表的資料更新另一個表

比如以上 student 表儲存著成績，另有一表 score_correct 記憶體因失誤而需修改的學生成績。

在mysql中，可以使用如下語法

update student, score_correct set student.score = score_correct.score where student.id = score_correct.uid;
複製程式碼

索引是如何工作的

簡單來說，索引分為 hash 和 B-Tree 兩種。 hash 查詢的時間複雜度為O(1)。 B-Tree 其實是 B+Tree，一種自平衡多叉搜尋數，自平衡代表每次插入和刪除資料都會需要動態調整樹高，以降低平衡因子。B+Tree 只有葉子節點會儲存資訊，並且會使用連結串列連結起來。因此適合範圍查詢以及排序，不過只能搜尋最左字首，如只能索引以a開頭的姓名，卻無法索引以a結尾的姓名。另外，Everything is trade off。B+Tree的自平衡特性保證能夠快速查詢的同時也降低了更新的效能，需要權衡利弊。

參考 StackOverflow: how dow database indexing work

如何聯接多個行的欄位

在mysql中，可以使用group_concat

select group_concat(name) from student;
複製程式碼

參考 StackOverflow: Concatenate many rows into a single text string

如何在一個sql語句中插入多行資料

values 使用逗號相隔，可以插入多行資料

insert into student(id, name) values (), (), ()
複製程式碼

參考 StackOverflow: Inserting multiple rows in a single SQL query

如何在`select`中使用條件表示式

示例，在student表中，查詢所有人成績，小於60則顯示為0

select id, name, if(score < 60, 0, score) score from student;
複製程式碼

如何找到重複項

姓名與班級唯一，找到姓名與班級的重複項，檢索重複次數與id

select name, class_id, group_concat(id), count(*) times from student
group by name, class_id
having times > 1;
複製程式碼

1:1 Relation 設計的必要性在哪裡

參考 stackoverflow.com/questions/5…

如何刪除重複項並只保留首項

姓名與班級唯一，刪除重複項，只保留首項

# mysql 就簡單很多
delete s1 from student s1, student s2
where s1.name = s2.name and s1.sex = s2.sex and s1.id > s2.id;
複製程式碼

參考 StackOverflow: how can i remove duplicate rows

什麼是SQL隱碼攻擊

如有一條查詢語句為

"select * from (" + table + ");"
複製程式碼

當table取值 student); drop table student; -- 時，語句變為了，會刪掉表，造成攻擊。

"select * from (student); drop table student; --);"
複製程式碼

mysql中單引號，雙引號，反引號有什麼區別

反引號(`) 表示table，column 識別符號。主要用在當表名或者列名為保留字的時候。在其它一些DBMS中，也用[]表示表名和列名。

單引號(') 表示字串。

雙引號(") 預設表示字串，但是當sql_mode為ANSI_QUOTES時，雙引號表示表名或者列名。

參考 StackOverflow: when to use single quotes, double quotes and backticks in mysql

SQL優化指南
2018-12-01
SQL優化
5分鐘 Pipenv 上手指南
2018-12-18
MySQL指南之SQL語句基礎
2019-03-17
MySql
大資料 SQL Boy 脫坑指南
2019-07-17
大資料SQL
教程：Apache Spark SQL入門及實踐指南！
2018-09-12
ApacheSparkSQL
[學習筆記]SQL server完全備份指南
2023-02-21
筆記SQLServer
SQLServer如何查詢近3分鐘最消耗CPU的SQL
2024-05-13
SQLServer
5分鐘教你搭建郵件伺服器的實用指南
2024-04-29
伺服器
兩分鐘搞定阿里SQL面試題：億級表合併
2019-12-18
阿里SQL面試題
AI實用指南：5分鐘搭建你自己的LLM聊天應用
2024-03-27
AI
新書上市 | 流行 SQL 引擎 Presto，官方實戰指南來了
2021-04-15
新書SQLREST
Flowable 6.6.0 BPMN使用者指南 - 17 高階用例 - 17.5 執行自定義SQL（Execute custom SQL）
2021-01-05
SQL
DataGear 變更部署資料庫為SQL Server填坑指南（含轉寫後的SQL server程式碼及SQL server配置檔案）
2021-05-24
資料庫SQLServer
最強最全面的Hive SQL開發指南，超四萬字全面解析
2021-11-30
HiveSQL
(大表小技巧)有時候 2 小時的 SQL 操作，可能只要 1 分鐘
2023-11-02
SQL
小米開源自研智慧SQL優化與改寫工具SOAR使用指南
2018-10-23
SQL優化
請查收這份開發者轉型AI指南 AI頂尖公司分分鐘pick你
2018-06-01
AI
5分鐘安裝Kubernetes+帶你輕鬆安裝istio服務網格指南
2023-12-14
十分鐘搞定 C/C++ 專案自動化構建 —— Xmake 入門指南
2020-10-26
C++
分步指南：教你在20分鐘內使用機器學習建立語言檢測模型！
2018-04-27
機器學習模型
[翻譯]SQL Server 2005 Analysis Services效能指南 Part 1 - 理解查詢構架
2019-01-08
SQLServer
女神節熱點海報設計指南！一分鐘製作女神節精美海報
2022-02-22
3分鐘短文：Laravel查詢構造器，告別手寫SQL的艱苦歲月
2020-10-30
LaravelSQL
sql時間段切分實現每隔x分鐘出一份高速門架車流量
2022-03-01
SQL
系統時鐘與硬體時鐘
2024-10-29
【SQL】17 SQL 檢視（Views）、SQL Date 函式、SQL NULL 值、
2018-07-27
SQLView函式Null
Clock——時鐘
2018-08-10
【SQL】19 SQL函式
2018-07-27
SQL函式
SQL注射/SQL Injection漏洞
2020-08-19
SQL
SQL------SQL效能分析
2024-07-27
SQL
高效資料移動指南 | 如何快速實現資料庫 SQL Server 到 Dameng 的資料同步？
2024-10-21
資料庫SQLServer
實時時鐘、系統時鐘和主機伺服器時鐘的區別
2022-03-03
伺服器
【SQL】13 SQL 別名、SQL 連線(JOIN)、SQL INNER JOIN 關鍵字、SQL LEFT JOIN 關鍵字、SQL RIGHT JOIN 關鍵字、SQL FULL OUTER J
2018-07-26
SQL
【SQL】9 SQL INSERT INTO 語句
2018-07-25
SQL
【SQL】10 SQL UPDATE 語句
2018-07-26
SQL
【SQL】11 SQL DELETE 語句
2018-07-26
SQLdelete
一分鐘sed入門（一分鐘系列）
2018-05-10
藍寶石時鐘與原子鐘有何區別
2022-11-29

30分鐘SQL指南

準備

工具

樣例表

SQL 基礎

術語

檢索資料

排序

資料過濾

計算欄位

資料聚合 (aggregation)

資料分組

子查詢

聯接

插入資料

修改資料

建立表與更新表

檢視

約束

觸發器

儲存過程

SQL 實踐

1. 根據班級學生的分數進行排名，如果分數相等則為同一名次

2. 寫一個函式，獲取第 N 高的分數

3. 檢索每個班級分數前兩名學生，並顯示排名

FAQ

inner join 與 outer join 的區別是什麼

如何根據一個表的資料更新另一個表

索引是如何工作的

如何聯接多個行的欄位

如何在一個sql語句中插入多行資料

如何在select中使用條件表示式

如何找到重複項

1:1 Relation 設計的必要性在哪裡

如何刪除重複項並只保留首項

什麼是SQL隱碼攻擊

mysql中單引號，雙引號，反引號有什麼區別

相關文章

`inner join` 與 `outer join` 的區別是什麼

如何在`select`中使用條件表示式