Regular expressions_9

Posted by zhengyshan on 2018-06-04

Original URL: https://blog.csdn.net/zhengyshan/article/details/80573884

Introduction to regular expressions

grep_1&2&3

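What is a regular expression

A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science.
Regular expressions are typically used to search for, and replace, text that matches a given pattern (rule). Many programming languages support string operations based on regular expressions.
grep (global search regular expression (RE) and print out the line) is a powerful text search tool: it searches text using a regular expression and prints the lines that match.

A regular expression is a string that follows a set of rules.
Mastering regular expressions is a great help when writing shell scripts.
Regular expressions exist in most programming languages, and the principle is the same everywhere.
This chapter covers grep/egrep, sed, and awk.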

grep

Options

grep [-cinvABC] 'word' filename

[root@zyshanlinux-01 grep]# grep 'nologin' passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin

-c or --count: print the number of matching lines

[root@zyshanlinux-01 grep]# grep -c 'nologin' passwd
16

-i or --ignore-case: match case-insensitively. It can be noticeably slower, so avoid it when you do not need it.

[root@zyshanlinux-01 grep]# grep -n 'nologin' passwd
3:daemon:x:2:2:daemon:/sbin:/sbin/nologin
4:adm:x:3:4:adm:/var/adm:/sbin/nologin
[root@zyshanlinux-01 grep]# grep -in 'nologin' passwd
2:bin:x:1:1:bin:/bin:/sbin/Nologin
3:daemon:x:2:2:daemon:/sbin:/sbin/nologin

-n or --line-number: show line numbers

[root@zyshanlinux-01 grep]# grep -n 'nologin' passwd
2:bin:x:1:1:bin:/bin:/sbin/nologin
3:daemon:x:2:2:daemon:/sbin:/sbin/nologin

-v or --invert-match: select the lines that do not match

[root@zyshanlinux-01 grep]# grep -ivn 'nologin' passwd
1:root:x:0:0:root:/root:/bin/bash
6:sync:x:5:0:sync:/sbin:/bin/sync

-r or --recursive: search all subdirectories

[root@zyshanlinux-01 ~]# grep -r 'root' /etc/
[root@zyshanlinux-01 ~]# grep 'root' /etc/
grep: /etc/: Is a directory

-A followed by a number n: print each matching line plus the n lines after it

[root@zyshanlinux-01 grep]# grep -nA2 'root' passwd
1:root:x:0:0:root:/root:/bin/bash
2-bin:x:1:1:bin:/bin:/sbin/Nologin
3-daemon:x:2:2:daemon:/sbin:/sbin/nologin
--
10:operator:x:11:0:operator:/root:/sbin/nologin
11-games:x:12:100:games:/usr/games:/sbin/nologin
12-ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

-B, likewise: print each matching line plus the n lines before it

[root@zyshanlinux-01 grep]# grep -nB2 'root' passwd
1:root:x:0:0:root:/root:/bin/bash
--
8-halt:x:7:0:halt:/sbin:/sbin/halt
9-mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10:operator:x:11:0:operator:/root:/sbin/nologin

-C, likewise: print each matching line plus n lines both before and after it

[root@zyshanlinux-01 grep]# grep -nC2 'root' passwd
1:root:x:0:0:root:/root:/bin/bash
2-bin:x:1:1:bin:/bin:/sbin/Nologin
3-daemon:x:2:2:daemon:/sbin:/sbin/nologin
--
8-halt:x:7:0:halt:/sbin:/sbin/halt
9-mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10:operator:x:11:0:operator:/root:/sbin/nologin
11-games:x:12:100:games:/usr/games:/sbin/nologin
12-ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
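
The options above can be tried on a small throwaway file. This is only a sketch: the four sample lines are invented for illustration (they imitate passwd entries, including one line with a capitalised "Nologin"), and the variable names are arbitrary.

```shell
# Build a small sample file; its contents are made up for this illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/Nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
EOF

count=$(grep -c 'nologin' "$sample")        # number of lines containing lowercase "nologin"
count_i=$(grep -ci 'nologin' "$sample")     # same count, ignoring case
inverted=$(grep -vin 'nologin' "$sample")   # numbered lines that do NOT match, ignoring case
after=$(grep -nA1 'root' "$sample")         # each match plus the one line after it

echo "$count"
echo "$count_i"
echo "$inverted"
echo "$after"
rm -f "$sample"
```

In the -A output, matching lines are marked with a colon after the line number and context lines with a hyphen, as in the transcripts above.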

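Note: grep is only a tool that implements regular expressions; it is not regular expressions themselves.

Regular expression examples with grep/egrep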

[root@zyshanlinux-01 grep]# grep '[0-9]' passwd

[root@zyshanlinux-01 grep]# grep -v '[0-9]' passwd

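[root@zyshanlinux-01 grep]# grep -nv '^#' inittab ## a ^ outside [] anchors the match to the start of the line

[root@zyshanlinux-01 grep]# grep -nv '^[^0-9]' inittab ## a ^ inside [] negates the set; [^0-9] means any non-digit

[root@zyshanlinux-01 grep]# grep 'r.o' passwd ## the dot in r.o matches any single character

[root@zyshanlinux-01 grep]# grep 'o*o' passwd ## the star repeats the preceding character zero or more times

[root@zyshanlinux-01 grep]# grep 'zyshan.*bash' passwd ## .* matches any run of characters, including the empty string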

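[root@zyshanlinux-01 grep]# grep 'o\{2\}' passwd ## the number in braces is how many times the preceding character repeats; in plain grep the braces must be escaped, otherwise they are just literal braces
[root@zyshanlinux-01 grep]# egrep 'o{2}' passwd ## egrep needs no escaping
[root@zyshanlinux-01 grep]# grep -E 'o{2}' passwd ## neither does grep -E
[root@zyshanlinux-01 grep]# grep -E '(oo){2}' passwd ## parentheses group characters into a unit, so this is equivalent to four o's

[root@zyshanlinux-01 grep]# grep 'o\+o' passwd ## must be escaped in plain grep; the plus repeats the preceding character one or more times, never zero

[root@zyshanlinux-01 grep]# grep -E 'o?t' passwd ## the question mark repeats the preceding character zero or one time; it needs -E (or \? in plain grep)

[root@zyshanlinux-01 grep]# grep -E 'root|nologin|997|Bus' passwd ## the vertical bar means "or"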
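
The difference between plain grep (basic regular expressions) and grep -E (extended regular expressions) can be checked with a few lines of input. This is a sketch assuming GNU grep; the four input strings are invented for illustration.

```shell
# Hypothetical input: three words plus one line that literally contains "o{2}".
input=$(printf '%s\n' 'rot' 'root' 'rooot' 'o{2}')

bre_escaped=$(printf '%s\n' "$input" | grep 'o\{2\}')    # basic syntax: escaped braces form a repeat count
bre_literal=$(printf '%s\n' "$input" | grep 'o{2}')      # basic syntax: unescaped braces are literal characters
ere=$(printf '%s\n' "$input" | grep -E 'o{2}')           # extended syntax: braces form a repeat count as written
ere_optional=$(printf '%s\n' "$input" | grep -E 'ro?t')  # extended syntax: ? makes the preceding o optional

echo "$bre_escaped"
echo "$bre_literal"
echo "$ere"
echo "$ere_optional"
```

The second command is the one that tends to surprise: without -E or the backslashes, it only finds the line that contains the characters o{2} themselves.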

sed_4&5

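sed is a stream editor and a very important tool in text processing. It works well with regular expressions and is remarkably capable. While processing, sed stores the current line in a temporary buffer called the "pattern space", applies the sed commands to the buffer, and then sends the buffer's contents to the screen. It then moves on to the next line and repeats until the end of the file. The file itself is not changed unless you save the output with redirection or edit in place with -i. sed is mainly used to edit one or more files automatically, to simplify repetitive operations on files, and to write conversion programs.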

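Options

-n, --quiet, or --silent: print only the lines the script explicitly outputs;
-e <script> or --expression=<script>: process the input text with the script given in the option;

Arguments

File: the list of text files to process.

sed commands

d: delete. Since it deletes, d usually takes nothing after it;
s: substitute. It performs the replacement directly and is usually combined with regular expressions, for example 1,20s/old/new/g.

sed substitution flags

g replaces every occurrence within a line.
p prints the line.

Matching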

[root@zyshanlinux-01 sed]# sed -n '/root/'p test.txt

[root@zyshanlinux-01 sed]# sed -n '/r.t/'p test.txt ## the dot matches any single character

[root@zyshanlinux-01 sed]# sed -n '/r*t/'p test.txt ## the star repeats the preceding character zero or more times

[root@zyshanlinux-01 sed]# sed -nr '/o+t/'p test.txt ## with -r the plus does not need escaping

[root@zyshanlinux-01 sed]# sed -nr '/o{2}/'p test.txt ## match two consecutive o's

[root@zyshanlinux-01 sed]# sed -nr '/root|bus/'p test.txt ## the vertical bar means "or"

Printing specific lines

[root@zyshanlinux-01 sed]# sed -n '2'p test.txt ## print line 2

[root@zyshanlinux-01 sed]# sed -n '2,5'p test.txt ## print lines 2 through 5

[root@zyshanlinux-01 sed]# sed -n '25,$'p test.txt ## print from line 25 to the last line

[root@zyshanlinux-01 sed]# sed -n '1,$'p test.txt ## print every line
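
Line addressing is easy to verify on input where each line is its own label. A minimal sketch, with five invented lines fed through a pipe instead of test.txt:

```shell
# Five made-up lines: "one" through "five".
lines=$(printf '%s\n' one two three four five)

second=$(printf '%s\n' "$lines" | sed -n '2p')       # a single address selects exactly that line
range=$(printf '%s\n' "$lines" | sed -n '2,4p')      # two addresses select an inclusive range
tail_part=$(printf '%s\n' "$lines" | sed -n '4,$p')  # $ stands for the last line

echo "$second"
echo "$range"
echo "$tail_part"
```

Note that a lone 2 is one line, not "the first two lines"; for that, the range 1,2 is needed.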

[root@zyshanlinux-01 sed]# sed -e '1'p -e '/bus/'p -n test.txt ## multiple expressions in one command
[root@zyshanlinux-01 sed]# sed -e '1'p -e '/root/'p -e '/oo*/'p -n test.txt ## more expressions can be chained

[root@zyshanlinux-01 sed]# sed -n '/bus/'Ip test.txt ## the I flag makes the match case-insensitive
Bus;djisj2;434
dbus:x:81:81:System message bus:/:/sbin/nologin

[root@zyshanlinux-01 sed]# sed '1,25'd test.txt ## d drops the first 25 lines and the remaining lines are printed
user5:x:1010:1011::/home/user5:/bin/bash
user9:x:1013:1013::/home/user9:/bin/bash
user19:x:1014:1015::/home/user19:/bin/bash
user4:x:1015:1016::/home/user4:/bin/bash
[root@zyshanlinux-01 sed]# wc -l test.txt ## the original file is untouched; the line count is the same
29 test.txt
[root@zyshanlinux-01 sed]# sed -i '1,25'd test.txt ## with -i, the first 25 lines are deleted from the file itself
[root@zyshanlinux-01 sed]# wc -l test.txt ## confirm that the line count has dropped
4 test.txt

[root@zyshanlinux-01 sed]# sed -i '/user5/'d test.txt ## delete the lines that contain the given string
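
The difference between deleting in the output and deleting in the file can be reproduced on a temporary file. This sketch assumes GNU sed, where -i takes no argument; the five one-letter lines are invented for illustration.

```shell
# A throwaway file with five made-up lines: a, b, c, d, e.
f=$(mktemp)
printf '%s\n' a b c d e > "$f"

preview=$(sed '1,3d' "$f")   # lines 1-3 are missing from the output only
before=$(wc -l < "$f")       # the file still has all five lines
sed -i '1,3d' "$f"           # GNU sed: -i rewrites the file itself
after=$(wc -l < "$f")        # now only two lines remain
sed -i '/e/d' "$f"           # delete every line matching the pattern /e/
remaining=$(cat "$f")

echo "$preview"
echo "$before $after"
echo "$remaining"
rm -f "$f"
```

Running the command without -i first, as the transcript above does, is a cheap way to preview what an in-place edit would remove.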

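Find and replace

[root@zyshanlinux-01 sed]# sed '1,10s/root/toor/g' test.txt ## s is the substitute command: replace root with toor in the first 10 lines; g replaces every occurrence in a line

[root@zyshanlinux-01 sed]# sed -r '1,10s/ro+/r/g' test.txt |head ## in the first 10 lines, replace r followed by one or more o's with r; the plus needs -r (or escaping as \+)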

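## the first 10 lines are piped into sed; to use parentheses for grouping, add -r or escape them; with no address before s, the substitution applies to every line
[root@zyshanlinux-01 sed]# head test.txt |sed -r 's/([^:]+):(.*):([^:]+)/\3:\2:\1/'
## the first group ([^:]+) is the run of non-colon characters before the first colon; the second group (.*) greedily matches everything in between; the third group ([^:]+) is the run of non-colon characters after the last colon. They are parenthesized so they can be referenced later: \1 is the first group, \2 the second, and \3 the third. The goal is to swap the first and third, hence \3:\2:\1. s/// is the substitution format.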
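
To see the swap on a single line, the same substitution can be applied to one passwd-style entry. A small sketch assuming GNU sed (for -r); the variable names are arbitrary.

```shell
# One passwd-style line, as used throughout the transcripts.
line='root:x:0:0:root:/root:/bin/bash'

# \1 = text before the first colon, \2 = everything in between (greedy),
# \3 = text after the last colon; the replacement emits them in reverse order.
swapped=$(printf '%s\n' "$line" | sed -r 's/([^:]+):(.*):([^:]+)/\3:\2:\1/')

echo "$swapped"   # /bin/bash:x:0:0:root:/root:root
```

Because (.*) is greedy, it absorbs every middle field, so only the first and last fields trade places.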

[root@zyshanlinux-01 sed]# head test.txt |sed -r 's/[a-zA-Z]//g' ## delete every letter by replacing it with nothing

## prefix every line with aaa: ; & stands for the whole matched text, so the added text goes before &
[root@zyshanlinux-01 sed]# head test.txt |sed -r 's/(.*)/aaa:&/'
aaa:root:x:0:0:root:/root:/bin/bash
## prefix every line with bbb: ; with parentheses and -r, \1 stands for the parenthesized group
[root@zyshanlinux-01 sed]# head test.txt |sed -r 's/(.*)/bbb:\1/'
bbb:root:x:0:0:root:/root:/bin/bash

sed 's/^.*$/123&/' test.txt ## similar to the above
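
When the pattern is (.*), & and \1 happen to hold the same text, which hides the difference between them. The sketch below (assuming GNU sed for -r, with an invented input line) adds a third substitution in which the group covers only part of the match.

```shell
line='root:x:0:0'

with_amp=$(printf '%s\n' "$line" | sed 's/^.*$/aaa:&/')              # & is the entire match
with_group=$(printf '%s\n' "$line" | sed -r 's/(.*)/bbb:\1/')        # \1 is the first group
partial=$(printf '%s\n' "$line" | sed -r 's/([a-z]+):x/[&](\1)/')    # match is "root:x", group is "root"

echo "$with_amp"
echo "$with_group"
echo "$partial"
```

In the third command the brackets receive the whole match and the parentheses receive only the captured group.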

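The awk tool_6&7

awk is a programming language for processing text and data on Linux/Unix. The data can come from standard input (stdin), one or more files, or the output of other commands. It supports advanced features such as user-defined functions and dynamic regular expressions, and is a powerful programming tool on Linux/Unix. It can be used on the command line, but is more often used as a script. awk has many built-in features, such as arrays and functions, which it has in common with C; flexibility is awk's greatest strength.

Format: awk 'BEGIN{} {} END{}' filename

awk syntax structure: awk -F ':' 'BEGIN{statement} {if(condition){statement1;statement2;statement3} } END{statement}' filename

awk built-in variables (predefined variables)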

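OFS: the output field separator (a single space by default).
NR: the number of records read so far; during execution it corresponds to the current line number.
NF: the number of fields in the current record.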

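[root@zyshanlinux-01 awk]# awk -F ':' '{print $1,$2,$3}' test.txt ## print fields 1, 2, and 3 of every line
[root@zyshanlinux-01 awk]# awk -F ':' '{print $1"#"$2"#"$3}' test.txt ## join the fields with #; literal strings go in double quotes
[root@zyshanlinux-01 awk]# awk -F ':' '{print $0}' test.txt ## print the whole line
[root@zyshanlinux-01 awk]# awk '{print $0}' test.txt ## with no separator given, fields are split on spaces and other whitespace by default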
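
The two ways of joining fields in print can be compared on one line. A minimal sketch with a single passwd-style line piped in; the variable names are arbitrary.

```shell
line='root:x:0:0:root:/root:/bin/bash'

spaced=$(printf '%s\n' "$line" | awk -F ':' '{print $1,$3,$7}')      # a comma inserts OFS, a space by default
joined=$(printf '%s\n' "$line" | awk -F ':' '{print $1"#"$3"#"$7}')  # adjacent values are concatenated as written

echo "$spaced"   # root 0 /bin/bash
echo "$joined"   # root#0#/bin/bash
```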

awk matching

[root@zyshanlinux-01 awk]# awk '/oo/' test.txt
[root@zyshanlinux-01 awk]# awk -F ':' '$1 ~ /oo/' test.txt ## ~ means "matches" and $1 is the first field: select lines whose first field contains oo
[root@zyshanlinux-01 awk]# awk -F ':' '$1 ~ /o+/' test.txt ## a regular expression
[root@zyshanlinux-01 awk]# awk -F ':' '$1 ~ /oo+/' test.txt

[root@zyshanlinux-01 awk]# awk -F ':' '/root/ {print $1,$3} /user/ {print $1,$3,$4}' test.txt ## several pattern-action pairs in one program: lines with root print fields 1 and 3, lines with user print fields 1, 3, and 4

awk comparison expressions (written here without spaces around the operators)

[root@zyshanlinux-01 awk]# awk -F ':' '$3==0' test.txt ## field 3 equals 0: print the whole line
[root@zyshanlinux-01 awk]# awk -F ':' '$3==0 {print $1}' test.txt ## field 3 equals 0: print field 1
[root@zyshanlinux-01 awk]# awk -F ':' '$3>=1000 {print $1}' test.txt ## field 3 is at least 1000: print field 1

[root@zyshanlinux-01 awk]# awk -F ':' '$3>=1000 {print $0}' test.txt ## to compare as a number, do not quote 1000
[root@zyshanlinux-01 awk]# awk -F ':' '$3>="1000" {print $0}' test.txt ## a quoted "1000" is compared as a string, character by character, not as a number

[root@zyshanlinux-01 awk]# awk -F ':' '$7!="/sbin/nologin" {print $0}' test.txt ## not equal

[root@zyshanlinux-01 awk]# awk -F ':' '$3<$4' test.txt
[root@zyshanlinux-01 awk]# awk -F ':' '$3==$4' test.txt
[root@zyshanlinux-01 awk]# awk -F ':' '$3>"5" && $3<"7"' test.txt ## quoted operands are compared as strings
[root@zyshanlinux-01 awk]# awk -F ':' '$3>1000 || $7=="/sbin/nologin"' test.txt ## == is an exact match
[root@zyshanlinux-01 awk]# awk -F ':' '$3>1000 || $7 ~ /bash/' test.txt ## ~ is a regex match
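
The effect of quoting the number is easiest to see on a tiny data set. In this sketch the four name:value lines are invented for illustration.

```shell
# Made-up records: a name and a number separated by a colon.
data=$(printf '%s\n' 'a:5' 'b:60' 'c:1000' 'd:7')

numeric=$(printf '%s\n' "$data" | awk -F ':' '$2>=1000 {print $1}')      # compared as numbers
as_string=$(printf '%s\n' "$data" | awk -F ':' '$2>="1000" {print $1}')  # compared as strings

echo "$numeric"
echo "$as_string"
```

Numerically only 1000 passes the test. As strings, 5, 60, and 7 all sort after "1000" because their first character is greater than "1", so every record is selected.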

## the built-in variable OFS sets the separator that print puts between fields
[root@zyshanlinux-01 awk]# awk -F ':' '{OFS="#"} $3>1000 || $7 ~ /bash/ {print $1,$3,$7}' test.txt
[root@zyshanlinux-01 awk]# awk -F ':' '{OFS="#"} {print $1,$3,$7}' test.txt ## with no condition, every line is processed
[root@zyshanlinux-01 awk]# awk -F ':' '{OFS="#"} {if ($3>1000) {print $1,$2,$3,$4}}' test.txt ## the same filter written as an if statement
## the built-in variables NR (line number) and NF (number of fields) are both numbers
[root@zyshanlinux-01 awk]# awk -F ':' '{print NR":"$0}' test.txt ## print the record number at the start of each line
[root@zyshanlinux-01 awk]# awk -F ':' '{print NF":"$0}' test.txt ## print the field count at the start of each line
[root@zyshanlinux-01 awk]# awk -F ':' 'NR<=10' test.txt ## print the first 10 lines; NR can be used as a condition
[root@zyshanlinux-01 awk]# awk -F ':' 'NR<=10 && $1 ~ /root|sync/' test.txt ## two conditions combined
[root@zyshanlinux-01 awk]# awk -F ':' 'NF==6 && $1 ~ /root|sync/' test.txt ## conditions can also test the field count

[root@zyshanlinux-01 awk]# awk -F ':' '{print $NR":"$NF}' test.txt ## compare the output with and without the $

[root@zyshanlinux-01 awk]# awk -F ':' '{print $NR":"$NF}' test.txt
root:/bin/bash
x:/sbin/nologin
2:/sbin/nologin
4:/sbin/nologin
lp:/sbin/nologin
/sbin:/bin/sync
/sbin/shutdown:/sbin/shutdown
:/sbin/halt
[root@zyshanlinux-01 awk]# awk -F ':' '{print NR":"NF}' test.txt
1:7
2:7
3:7
4:7
5:7
6:7
7:7
8:7
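
The pattern in the two outputs above follows from what the $ does: NR and NF are plain numbers, while $NR and $NF use those numbers as field indexes. A short sketch on two invented lines with different field counts:

```shell
# Two made-up records: the first has 3 fields, the second has 4.
data=$(printf '%s\n' 'a:b:c' 'd:e:f:g')

counts=$(printf '%s\n' "$data" | awk -F ':' '{print NR":"NF}')    # line number and field count
values=$(printf '%s\n' "$data" | awk -F ':' '{print $NR":"$NF}')  # field number NR and the last field

echo "$counts"
echo "$values"
```

On line 1, $NR is field 1 and $NF is field 3; on line 2, $NR is field 2 and $NF is field 4. This is why $NF is a common way to refer to the last field.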

[root@zyshanlinux-01 awk]# head -n 3 test.txt |awk -F ':' '$1="root"'  ## a single = is an assignment; the line is rebuilt and the colon separators are replaced by spaces
root x 0 0 root /root /bin/bash
root x 1 1 bin /bin /sbin/nologin
root x 2 2 daemon /sbin /sbin/nologin
[root@zyshanlinux-01 awk]# head -n 3 test.txt |awk -F ':' '{OFS=":"} $1="root"'  ## set OFS to restore the colon separator
root:x:0:0:root:/root:/bin/bash
root:x:1:1:bin:/bin:/sbin/nologin
root:x:2:2:daemon:/sbin:/sbin/nologin
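
Assigning to a field makes awk rebuild the line using OFS, which is why the colons disappear. The sketch below shows the same effect on one invented line. It sets OFS in a BEGIN block rather than in a per-line {OFS=":"} action as the transcript does; that is a substitution made for this example, chosen because BEGIN runs once before any input is read.

```shell
line='bin:x:1:1'

# Assigning to $1 rebuilds $0 with the default OFS (a space).
default_ofs=$(printf '%s\n' "$line" | awk -F ':' '{$1="root"; print}')

# With OFS set before the first line is processed, the rebuilt line keeps its colons.
begin_ofs=$(printf '%s\n' "$line" | awk -F ':' 'BEGIN{OFS=":"} {$1="root"; print}')

echo "$default_ofs"   # root x 1 1
echo "$begin_ofs"     # root:x:1:1
```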

[root@zyshanlinux-01 awk]# awk -F ':' '{(tot=tot+$3)}; END {print tot}' test.txt ## sum field 3; inside braces the assignment is an action, so only the total is printed
[root@zyshanlinux-01 awk]# awk -F ':' '(tot=tot+$3); END {print tot}' test.txt ## without braces the assignment is a pattern, so every line where the running total is non-zero is printed, followed by the total
[root@zyshanlinux-01 awk]# awk -F ':' '{tot=tot+$3} END {print tot}' test.txt ## simplified
[root@zyshanlinux-01 awk]# awk -F ':' '{tot+=$3} END {print tot}' test.txt ## simplified further
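
The difference between the braced and unbraced forms can be checked on three invented records. This is a sketch; the data and variable names are made up for illustration.

```shell
# Made-up records whose second field is summed.
data=$(printf '%s\n' 'a:0' 'b:10' 'c:20')

# Braced: the addition is an action, so nothing is printed until END.
total=$(printf '%s\n' "$data" | awk -F ':' '{tot+=$2} END {print tot}')

# Unbraced: the addition is a pattern; a line is printed whenever the
# running total is non-zero, and END then prints the total.
as_pattern=$(printf '%s\n' "$data" | awk -F ':' '(tot+=$2); END {print tot}')

echo "$total"
echo "$as_pattern"
```

In the second run the first record is not printed, because the running total is still 0 after it and awk treats 0 as false.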

Further reading:

Regular expressions: http://www.apelearn.com/study_v2/chapter14.html