[LeetCode] Regular Expression Matching

Posted by Grandyang on 2015-04-27

Original URL: https://www.cnblogs.com/grandyang/p/4461713.html

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false
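
As a sanity check on the five examples above, one option is to compare against the C++ standard library's regex engine. This is only a reference sketch and not a solution to the problem itself: the helper name fullMatch is made up for illustration, and it assumes the pattern is well formed (std::regex throws std::regex_error on patterns such as a leading '*').

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: full-string match using std::regex.
// std::regex_match succeeds only when the whole string matches the pattern,
// which mirrors the "cover the entire input string" requirement.
// Assumes p is a well-formed pattern; malformed patterns throw std::regex_error.
bool fullMatch(const std::string& s, const std::string& p) {
    return std::regex_match(s, std::regex(p));
}

// Replays the five examples from the problem statement.
void checkExamples() {
    assert(!fullMatch("aa", "a"));
    assert(fullMatch("aa", "a*"));
    assert(fullMatch("ab", ".*"));
    assert(fullMatch("aab", "c*a*b"));
    assert(!fullMatch("mississippi", "mis*is*p*."));
}
```

Calling checkExamples() from a test program passes silently when every example agrees with the expected output.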

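This problem is very similar to Wildcard Matching; the difference lies in what * means. In Wildcard Matching, * can stand for any number of arbitrary characters, whereas here * means that the character before it may appear zero, one, or many times. For example, the pattern a*b can match b or aaab, that is, any number of a's. This problem is somewhat harder than Wildcard Matching and has more cases to handle. It can be solved with recursion, and the rough idea is as follows: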

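- If p is empty, return true if s is also empty, otherwise return false.

- If p has length 1, return true if s also has length 1 and the two characters are equal or p is '.', otherwise return false.

- If the second character of p is not *, return false if s is empty; otherwise check that the first characters match and recurse on both strings starting from their second characters.

- If the second character of p is *, run the following loop while s is non-empty and the first characters match (including the case where p[0] is a dot): recursively match s against p with its first two characters removed (this assumes the star makes the preceding character appear 0 times and tests whether that works). If it matches, return true; otherwise remove the first character of s (that character has been matched, so it can be dropped from s, while p is left unchanged because the star allows any number of copies of its first character) and continue the loop.

- Return the result of recursively matching s against p with its first two characters removed (this handles whatever the star could not match. For example, with s="ab", p="a*b", inside the while loop we find that "ab" does not match "b", so s becomes "b"; the loop then exits and the final return compares "b" with "b", giving true. As another example, with s="", p="a*", s is empty, so neither the if branches nor the while loop are entered, and only the final return does the comparison, correctly giving true).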

Solution 1:

class Solution {
public:
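    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.empty()) return s.empty();
        // A single pattern character must match exactly one character of s.
        if (p.size() == 1) {
            return (s.size() == 1 && (s[0] == p[0] || p[0] == '.'));
        }
        // No star after p[0]: match one character, then recurse on the rest.
        if (p[1] != '*') {
            if (s.empty()) return false;
            return (s[0] == p[0] || p[0] == '.') && isMatch(s.substr(1), p.substr(1));
        }
        // p[1] is '*': try skipping "x*"; if that fails, let the star consume s[0] and retry.
        while (!s.empty() && (s[0] == p[0] || p[0] == '.')) {
            if (isMatch(s, p.substr(2))) return true;
            s = s.substr(1);
        }
        // The star can consume nothing more; match what is left against the rest of p.
        return isMatch(s, p.substr(2));
    }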
};
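
Each recursive call in Solution 1 copies its arguments through substr. A possible variation, sketched here under the assumption that avoiding those copies is worthwhile, keeps both strings fixed and passes positions instead. The names matchFrom and isMatchIndexed are made up for this illustration; the case analysis is the same as above, except that the length-1 special case is folded into the general "no star" branch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical index-based rewrite of Solution 1: does s[i..] match p[j..]?
bool matchFrom(const std::string& s, const std::string& p,
               std::size_t i, std::size_t j) {
    // Pattern exhausted: succeed only if the string is exhausted too.
    if (j == p.size()) return i == s.size();
    const bool star = j + 1 < p.size() && p[j + 1] == '*';
    if (!star) {
        // No star: one character must match, then both positions advance.
        return i < s.size() && (s[i] == p[j] || p[j] == '.') &&
               matchFrom(s, p, i + 1, j + 1);
    }
    // p[j] is followed by '*': mirror the while loop of Solution 1.
    while (i < s.size() && (s[i] == p[j] || p[j] == '.')) {
        if (matchFrom(s, p, i, j + 2)) return true;  // star consumes nothing further
        ++i;                                          // star consumes s[i]
    }
    return matchFrom(s, p, i, j + 2);
}

bool isMatchIndexed(const std::string& s, const std::string& p) {
    return matchFrom(s, p, 0, 0);
}
```

For the two walkthrough cases above, isMatchIndexed("ab", "a*b") and isMatchIndexed("", "a*") both return true.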

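The method above can be written more concisely, though the overall idea is the same. First check whether p is empty; if it is, the result depends on whether s is empty. When the second character of p is *, the character before the * may appear any number of times, including 0, so first recurse on the zero-occurrence case by dropping those two characters from p and comparing again. Otherwise, if s is non-empty and its first character matches the first character of p (equal, or p starts with '.'), recurse on s with its first character removed and p unchanged; p must keep its first character because the character before * can repeat without limit. If the second character is not *, simply compare the first characters and then recurse on the rest of both strings. See the code below:

Solution 2: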

class Solution {
public:
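    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.empty()) return s.empty();
        if (p.size() > 1 && p[1] == '*') {
            // Either skip "x*" entirely, or consume one matching character of s and keep p.
            return isMatch(s, p.substr(2)) || (!s.empty() && (s[0] == p[0] || p[0] == '.') && isMatch(s.substr(1), p));
        } else {
            // No star: the first characters must match, then recurse on the rest.
            return !s.empty() && (s[0] == p[0] || p[0] == '.') && isMatch(s.substr(1), p.substr(1));
        }
    }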
};
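
The plain recursion in Solution 2 can reach the same pair of suffixes of s and p along many different paths. One common remedy, which is not part of the original post, is to cache each result in a table indexed by the two start positions, so every pair is solved at most once. The sketch below assumes that approach; solveMemo and isMatchMemo are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical memoized form of the Solution 2 recurrence.
// memo[i][j]: -1 = not computed yet, 0 = s[i..] does not match p[j..], 1 = it matches.
bool solveMemo(const std::string& s, const std::string& p,
               std::size_t i, std::size_t j,
               std::vector<std::vector<int>>& memo) {
    if (j == p.size()) return i == s.size();
    if (memo[i][j] != -1) return memo[i][j] == 1;
    const bool first = i < s.size() && (s[i] == p[j] || p[j] == '.');
    bool result;
    if (j + 1 < p.size() && p[j + 1] == '*') {
        // Skip "x*", or consume one matching character and stay on the same pattern position.
        result = solveMemo(s, p, i, j + 2, memo) ||
                 (first && solveMemo(s, p, i + 1, j, memo));
    } else {
        result = first && solveMemo(s, p, i + 1, j + 1, memo);
    }
    memo[i][j] = result ? 1 : 0;
    return result;
}

bool isMatchMemo(const std::string& s, const std::string& p) {
    std::vector<std::vector<int>> memo(s.size() + 1,
                                       std::vector<int>(p.size() + 1, -1));
    return solveMemo(s, p, 0, 0, memo);
}
```

With the table in place there are at most (|s| + 1) * |p| distinct subproblems, each doing constant work beyond its recursive calls. For instance, isMatchMemo("aab", "c*a*b") returns true.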

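We can also solve this with DP. Define a two-dimensional DP array in which dp[i][j] indicates whether s[0,i) matches p[0,j). There are then three cases (the part below is excerpted from another post):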

1. dp[i][j] = dp[i - 1][j - 1], if p[j - 1] != '*' && (s[i - 1] == p[j - 1] || p[j - 1] == '.');
2. dp[i][j] = dp[i][j - 2], if p[j - 1] == '*' and the pattern repeats 0 times;
3. dp[i][j] = dp[i - 1][j] && (s[i - 1] == p[j - 2] || p[j - 2] == '.'), if p[j - 1] == '*' and the pattern repeats at least once.

Solution 3:

class Solution {
public:
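    bool isMatch(string s, string p) {
        int m = s.size(), n = p.size();
        // dp[i][j] is true when s[0,i) matches p[0,j).
        vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
        dp[0][0] = true;  // the empty string matches the empty pattern
        for (int i = 0; i <= m; ++i) {
            for (int j = 1; j <= n; ++j) {
                if (j > 1 && p[j - 1] == '*') {
                    // Star: zero repetitions (drop "x*"), or one more repetition consuming s[i - 1].
                    dp[i][j] = dp[i][j - 2] || (i > 0 && (s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j]);
                } else {
                    // Plain character or '.': must match s[i - 1] and extend dp[i - 1][j - 1].
                    dp[i][j] = i > 0 && dp[i - 1][j - 1] && (s[i - 1] == p[j - 1] || p[j - 1] == '.');
                }
            }
        }
        return dp[m][n];
    }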
};
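
Row i of the table above depends only on row i and row i - 1, so the full (m + 1) x (n + 1) array is not strictly needed. The following is a minimal sketch of a two-row variant under that observation; it is an assumed optimization rather than something from the original post, and isMatchTwoRows is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-row version of the Solution 3 recurrence.
// prev holds row i - 1 and cur holds row i; char is used as a plain 0/1 flag.
bool isMatchTwoRows(const std::string& s, const std::string& p) {
    const std::size_t m = s.size(), n = p.size();
    std::vector<char> prev(n + 1, 0), cur(n + 1, 0);
    for (std::size_t i = 0; i <= m; ++i) {
        cur[0] = (i == 0);  // only the empty string matches the empty pattern
        for (std::size_t j = 1; j <= n; ++j) {
            if (j > 1 && p[j - 1] == '*') {
                const bool zero = cur[j - 2];  // drop "x*"
                const bool more = i > 0 &&
                                  (s[i - 1] == p[j - 2] || p[j - 2] == '.') &&
                                  prev[j];     // star consumes s[i - 1]
                cur[j] = zero || more;
            } else {
                cur[j] = i > 0 && prev[j - 1] &&
                         (s[i - 1] == p[j - 1] || p[j - 1] == '.');
            }
        }
        std::swap(prev, cur);  // the row just filled becomes the previous row
    }
    return prev[n];  // after the final swap, prev is row m
}
```

Memory drops from O(m * n) to O(n) while the running time stays O(m * n). Every entry of cur is overwritten before it is read in the next iteration, so the stale values left by the swap are harmless.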

Similar problems:

Wildcard Matching

References:

https://leetcode.com/problems/regular-expression-matching/

https://leetcode.com/problems/regular-expression-matching/discuss/5684/9-lines-16ms-c-dp-solutions-with-explanations

https://leetcode.com/problems/regular-expression-matching/discuss/5665/my-concise-recursive-and-dp-solutions-with-full-explanation-in-c

LeetCode All in One: collected problem explanations (continuously updated...)
