Python Learning Handbook: Capture Groups and Special Matching Strings

Posted by 清潔工老闆 on 2018-11-30

Original URL: https://flycode.co/archives/233183

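In the previous article we introduced Python's character classes and took an in-depth look at metacharacters. This article covers capture groups and special matching strings in Python. To read the previous article, see: https://www.cnblogs.com/dustman/p/10036661.html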

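Capture groups
A group is created by enclosing part of a regular expression in parentheses. This means a whole group can be given as the argument to a metacharacter such as * or ?.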

import re

pattern = r"python(ice)*"
string1 = "python!"
string2 = "ice"
string3 = "pythonice"

match1 = re.match(pattern,string1)
match2 = re.match(pattern,string2)
match3 = re.match(pattern,string3)

if match1:
 print(match1.group())
 print("match 1")

if match2:
 print(match2.group())
 print("match 2")

if match3:
 print(match3.group())
 print("match 3")

Output:

>>>
python
match 1
pythonice
match 3
>>>

In the example above, (ice) is a capture group.
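
To make the effect of the parentheses concrete, here is a minimal sketch comparing the grouped pattern with an ungrouped variant. The ungrouped pattern and the test string are made up for illustration and are not from the original example.

```python
import re

text = "pythoniceice!"

# With parentheses, * repeats the whole group "ice".
grouped = re.match(r"python(ice)*", text)
# Without parentheses, * applies only to the final "e".
ungrouped = re.match(r"pythonice*", text)

grouped_text = grouped.group()      # 'pythoniceice'
ungrouped_text = ungrouped.group()  # 'pythonice'
print(grouped_text)
print(ungrouped_text)
```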

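When we introduced metacharacters and character classes earlier, we used the group function to access the matched content. group(0) or group() returns the whole match, group(n) with n greater than 0 returns the text matched by the nth group, and groups() returns a tuple containing all capture groups.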

import re

pattern = r"j(av)(ap)(yt(h)o)n"
string = "javapythonhtmlmysql"

match = re.match(pattern,string)

if match:
 print(match.group())
 print(match.group(0))
 print(match.group(1))
 print(match.group(2))
 print(match.groups())

Output:

>>>
javapython
javapython
av
ap
('av', 'ap', 'ytho', 'h')
>>>
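
Two further details of group and groups may be worth a quick sketch: group accepts several indices at once, and a group that did not take part in the match is reported as None. The pattern below, with its optional (cream) group, is a hypothetical illustration rather than part of the original article.

```python
import re

# Both groups are optional; only the first one takes part in this match.
match = re.match(r"python(ice)?(cream)?", "pythonice")

pair = match.group(1, 2)         # several groups at once, as a tuple
all_groups = match.groups()      # unmatched groups are None
with_default = match.groups("")  # or a default value of your choice

print(pair)          # ('ice', None)
print(all_groups)    # ('ice', None)
print(with_default)  # ('ice', '')
```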

Capture groups can also be nested, meaning one group can sit inside another group.
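
With nested groups, numbering follows the order of the opening parentheses, from left to right. The date pattern below is an assumed example chosen only to show that ordering.

```python
import re

# Group 1 wraps groups 2 and 3; group 4 stands on its own.
match = re.match(r"(([0-9]{4})-([0-9]{2}))-([0-9]{2})", "2018-11-30")

year_month = match.group(1)  # outer group: '2018-11'
year = match.group(2)        # first inner group: '2018'
month = match.group(3)       # second inner group: '11'
day = match.group(4)         # '30'

print(match.groups())  # ('2018-11', '2018', '11', '30')
```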

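There are also two special kinds of groups: named capture groups and non-capturing groups.
A named capture group has the form (?P<name>…), where name is the name of the group and … is the expression to match. It behaves exactly like a normal group, except that it can be accessed with group(name) in addition to its index.
A non-capturing group has the form (?:…). It only takes part in matching: its result is not captured and it is not assigned a group number, so it cannot be referred to later in the expression or in the program.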

import re

pattern = r"(?P<python>123)(?:456)(789)"
string = "123456789"

match = re.match(pattern,string)

if match:
 print(match.group("python"))
 print(match.groups())

Output:

>>>
123
('123', '789')
>>>
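
As a small extension of the example above, the sketch below uses two named groups (the names first and last are made up for illustration) to show that a named group is still reachable by index, that groupdict() collects the named groups into a dictionary, and that the non-capturing group does not use up a group number.

```python
import re

match = re.match(r"(?P<first>123)(?:456)(?P<last>789)", "123456789")

by_name = match.group("first")  # access by name
by_index = match.group(1)       # the same group, by index
second = match.group(2)         # (?:456) got no number, so group 2 is "last"
named = match.groupdict()       # named groups as a dictionary

print(by_name, by_index, second)
print(named)  # {'first': '123', 'last': '789'}
```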

The metacharacter | means "or": red|blue matches either red or blue.

import re

string1 = "python"
string2 = "pyihon"
string3 = "pylhon"
pattern = r"py(t|i)hon"

match1 = re.match(pattern,string1)
match2 = re.match(pattern,string2)
match3 = re.match(pattern,string3)

if match1:
 print(match1.group())
 print("match 1")

if match2:
 print(match2.group())
 print("match 2")

if match3:
 print(match3.group())
 print("match 3")

Output:

>>>
python
match 1
pyihon
match 2
>>>
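
One point the example relies on is that | has very low precedence, so parentheses are what limit its scope. The following sketch, with made-up patterns and words, contrasts a grouped alternation with an ungrouped one.

```python
import re

words = ("gray", "grey", "graph")

grouped = r"gr(a|e)y"  # "gray" or "grey"
ungrouped = r"gra|ey"  # "gra" or "ey", since | splits the whole pattern

grouped_hits = [w for w in words if re.match(grouped, w)]
ungrouped_hits = [w for w in words if re.match(ungrouped, w)]

print(grouped_hits)    # ['gray', 'grey']
print(ungrouped_hits)  # ['gray', 'graph']
```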

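Special matching strings
Special sequences
Regular expressions support a variety of special sequences. Each is written as a backslash followed by another character.
One useful special sequence is a backslash followed by a number between 1 and 99, for example \1. The number is the index of a capture group, so the sequence lets us refer back to an earlier capture group inside the regular expression.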

import re

string1 = "html python"
string2 = "python python"
string3 = "java java"
pattern = r"(.+) \1"  # \1 refers back to the text matched by group 1

match1 = re.match(pattern,string1)
match2 = re.match(pattern,string2)
match3 = re.match(pattern,string3)

if match1:
 print(match1.group())
 print("match 1")

if match2:
 print(match2.group())
 print("match 2")

if match3:
 print(match3.group())
 print("match 3")

Output:

>>>
python python
match 2
java java
match 3
>>>

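Note: (.+) \1 is not the same as (.+) (.+), because \1 refers to what the first group actually matched, that is, the matched text itself, not the group's regex pattern.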
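
The difference can be checked directly. The sketch below runs both patterns against "html python", and also shows the named form of a backreference, (?P=name), using a hypothetical group name word.

```python
import re

text = "html python"

# The backreference demands the same text twice, so this does not match.
backref_match = re.match(r"(.+) \1", text)
# Repeating the pattern allows two different texts.
repeated_match = re.match(r"(.+) (.+)", text)
repeated_groups = repeated_match.groups()

# A named group can be referenced with (?P=name) instead of a number.
named_match = re.match(r"(?P<word>.+) (?P=word)", "java java")
named_word = named_match.group("word")

print(backref_match)    # None
print(repeated_groups)  # ('html', 'python')
print(named_word)       # java
```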

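Regular expressions also have the special sequences \d, \s, and \w, which match digits, whitespace, and word characters respectively. In ASCII mode they are equivalent to [0-9], [ \t\n\r\f\v], and [a-zA-Z0-9_]; in Unicode mode \w also matches word characters from other scripts, such as accented letters and Chinese characters.
The uppercase versions \D, \S, and \W mean the opposite of their lowercase counterparts. For example, \D matches any character that is not a digit.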
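
To illustrate the two modes, here is a short sketch using the re.ASCII flag. For str patterns in Python 3, Unicode matching is the default. The sample text is made up for this example.

```python
import re

text = "caf\u00e9 42\tok_1"  # "café 42<TAB>ok_1"

unicode_words = re.findall(r"\w+", text)          # default Unicode mode
ascii_words = re.findall(r"\w+", text, re.ASCII)  # \w limited to [a-zA-Z0-9_]
digits = re.findall(r"\d+", text)
whitespace = re.findall(r"\s", text)
non_digits = re.findall(r"\D+", text)             # the opposite of \d

print(unicode_words)  # ['café', '42', 'ok_1']
print(ascii_words)    # ['caf', '42', 'ok_1']
print(digits)         # ['42', '1']
```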

import re

string1 = "python 2017!"
string2 = "1,00,867!"
string3 = "!@#?"
pattern = r"(\D+\d)"  # one or more non-digits followed by one digit

match1 = re.match(pattern,string1)
match2 = re.match(pattern,string2)
match3 = re.match(pattern,string3)

if match1:
 print(match1.group())
 print("match 1")

if match2:
 print(match2.group())
 print("match 2")

if match3:
 print(match3.group())
 print("match 3")

Output:

>>>
python 2
match 1
>>>

(\D+\d) matches one or more non-digit characters followed by a single digit.

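Special matches
There are a few more special sequences: \A, \Z, and \b. \A matches only at the start of the string; in most cases it behaves like ^ in a pattern. \Z matches only at the end of the string; in most cases it is equivalent to $.
\b matches a word boundary: a position where a word character is not preceded or not followed by another word character. In other words, it matches the empty string between a \w and a \W character, or between a \w character and the start or end of the string.
\B matches a non-word boundary: a position where the characters on both sides are of the same type, either both word characters or both non-word characters. The start and the end of a string count as non-word characters.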
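
One case where \A and \Z differ from ^ and $ is multiline mode, where ^ and $ also match at line breaks. The sketch below shows that difference and then contrasts \b with \B; the sample strings are assumptions made for illustration.

```python
import re

text = "first line\nsecond line"

# With re.MULTILINE, ^ and $ match at every line; \A and \Z do not.
caret_hits = re.findall(r"^\w+", text, re.MULTILINE)    # ['first', 'second']
anchor_hits = re.findall(r"\A\w+", text, re.MULTILINE)  # ['first']
dollar_hits = re.findall(r"\w+$", text, re.MULTILINE)   # ['line', 'line']
z_hits = re.findall(r"\w+\Z", text, re.MULTILINE)       # ['line']

# \b finds "dog" as a whole word; \B finds it inside a longer word.
sample = "hotdogs dog"
whole_start = re.search(r"\bdog\b", sample).start()   # 8
inside_start = re.search(r"\Bdog\B", sample).start()  # 3

print(caret_hits, anchor_hits)
print(whole_start, inside_start)
```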

import re

string1 = "The dog eat!"
string2 = "<dog>dog<>?"
string3 = "dogeatpython"
pattern = r"\b(dog)\b"  # "dog" as a whole word, with a word boundary on each side

search1 = re.search(pattern,string1)
search2 = re.search(pattern,string2)
search3 = re.search(pattern,string3)

if search1:
 print(search1.group())
 print("search 1")

if search2:
 print(search2.group())
 print("search 2")

if search3:
 print(search3.group())
 print("search 3")

Output:

>>>
dog
search 1
dog
search 2
>>>

Note: a word boundary is not included in the matched text. In other words, a word boundary matches zero characters, so \b(dog)\b matches just "dog".
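
The zero-width nature of a boundary can be observed through match positions. In this sketch, span() shows that only the three letters of "dog" are consumed, and re.finditer lists the boundary positions in a short made-up string.

```python
import re

match = re.search(r"\b(dog)\b", "The dog eat!")
span = match.span()  # (4, 7): exactly the three characters of "dog"

# Every \b match is empty: its start and end positions are equal.
boundary_matches = list(re.finditer(r"\b", "The dog"))
boundaries = [m.start() for m in boundary_matches]
all_empty = all(m.start() == m.end() for m in boundary_matches)

print(span)        # (4, 7)
print(boundaries)  # [0, 3, 4, 7]
print(all_empty)   # True
```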

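"A happy marriage is not a 'perfect couple' coming together, but an imperfect couple learning to appreciate each other's differences." – David Bowie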
