[Algorithm Framework Patterns] The Sliding Window Algorithm: Matching Substrings

Posted by 雪山飛豬 on 2021-07-30

Original article: https://www.cnblogs.com/chenqionghe/p/15076219.html

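Algorithm frameworks

The sliding window algorithm

Code Complete recommends writing the framework in pseudocode first. Thinking from the top level maximizes abstraction and keeps you from getting bogged down in the implementation details of any particular programming language. Put simply, you solve the problem at the blueprint level.

The sliding window algorithm is well suited to searching contiguous ranges of an array. Its core is:

a while loop nested inside a while loop
shrinking the window
matching the window

Below we write the framework out in pseudocode and then use it to solve the corresponding problems. The idea comes from labuladong's algorithm cheat sheet; I reworked it into a version I personally find more general, in which you only need to implement the is_need_shrink and is_match methods.

Note: get it working first and talk about optimization afterwards. Don't agonize at the start over whether something is being called repeatedly; once it works, that part is easy.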

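The framework

Minimum window (update the result set inside the shrink loop)

results = []
left = 0
right = 0
end = size of the array
while right < end:
    right++
    while the window needs to shrink:
        if the window satisfies the requirement:
            results.add([left, right])
        left++
return results

Maximum window (update the result set after the shrink loop)

results = []
left = 0
right = 0
end = size of the array
while right < end:
    right++
    while the window needs to shrink:
        left++
    if the window satisfies the requirement:
        results.add([left, right])
return results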

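The implementation looks much the same in any language, but the Python version is almost always the shortest, so everything below is written in Python.

The framework in Python

Minimum window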

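def min_window(array):
    left = 0
    right = 0
    end = len(array)
    res = []
    while right < end:
        right += 1
        while is_need_shrink(array, left, right):
            if is_match(array, left, right):
                res.append([left, right])  # record the window before left moves
            left += 1
    return res

# Does the window need to shrink? TODO: implement per problem
def is_need_shrink(array, left, right):
    return False  # placeholder; returning True unconditionally would never terminate

# Does the window satisfy the requirement? TODO: implement per problem
def is_match(array, left, right):
    return True  # placeholder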

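Maximum window

def max_window(array):
    left = 0
    right = 0
    end = len(array)
    res = []
    while right < end:
        right += 1
        while is_need_shrink(array, left, right):
            left += 1
        if is_match(array, left, right):
            res.append([left, right])  # record the window once it has grown and any shrinking is done
    return res

# Does the window need to shrink? TODO: implement per problem
def is_need_shrink(array, left, right):
    return False  # placeholder; returning True unconditionally would never terminate

# Does the window satisfy the requirement? TODO: implement per problem
def is_match(array, left, right):
    return True  # placeholder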

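res ends up holding every [left, right] that satisfies the requirement.
1. is_need_shrink says whether the window has to shrink.
2. is_match says whether the window satisfies the requirement.

Most of the time these two functions are all we need to change.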
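Since only the two hooks change from problem to problem, they can also be passed in as arguments. The following is a minimal sketch of that idea, not code from the original post: the name sliding_windows, the Hook alias, the record_while_shrinking flag, and the extra left < right guard are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical signature for the two hooks: (sequence, left, right) -> bool
Hook = Callable[[Sequence, int, int], bool]


def sliding_windows(seq: Sequence, need_shrink: Hook, is_match: Hook,
                    record_while_shrinking: bool) -> List[List[int]]:
    """Collect every matching half-open window [left, right) of seq."""
    res = []
    left = 0
    for right in range(1, len(seq) + 1):
        # left < right keeps an always-true need_shrink from looping forever
        while left < right and need_shrink(seq, left, right):
            if record_while_shrinking and is_match(seq, left, right):
                res.append([left, right])  # min-window style
            left += 1
        if not record_while_shrinking and is_match(seq, left, right):
            res.append([left, right])  # max-window style
    return res


def no_repeats(seq: Sequence, left: int, right: int) -> bool:
    """True when the window holds no repeated element."""
    window = seq[left:right]
    return len(set(window)) == len(window)


def has_repeats(seq: Sequence, left: int, right: int) -> bool:
    return not no_repeats(seq, left, right)


# Max-window usage: longest run without repeated characters
windows = sliding_windows("abcabcbb", has_repeats, no_repeats, False)
print(max(hi - lo for lo, hi in windows))  # 3
```

Passing True as the last argument gives the min-window behaviour, where matches are recorded inside the shrink loop.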

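Example problems

Minimum window substring

Problem

minimum-window-substring

Given a string s and a string t, return the shortest substring of s that covers every character of t. If no substring of s covers every character of t, return the empty string "".

Note:
For characters repeated in t, the substring we look for must contain at least as many of that character as t does.
If such a substring exists in s, it is guaranteed to be the unique answer.

Example 1:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"

Example 2:
Input: s = "a", t = "a"
Output: "a"

Example 3:
Input: s = "a", t = "aa"
Output: ""
Explanation: both 'a' characters of t must be contained in the substring of s,
so no substring qualifies and the empty string is returned.

Implementation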

# Minimum window substring, built on the min_window framework
def minimum_window_substring(s, t):
    left = 0
    right = 0
    end = len(s)
    res = []
    while right < end:
        right += 1
        while is_need_shrink(s, left, right, t):
            if is_match(s, left, right, t):
                res.append([left, right])
            left += 1
    return res


# The window needs to shrink while it still covers t, so this is the same test as is_match
def is_need_shrink(s, left, right, t):
    return is_match(s, left, right, t)


# The window matches once every character of t occurs in it at least as often as in t
def is_match(s, left, right, t):
    need_map = {}  # how many of each character t requires
    for c in t:
        need_map[c] = need_map.get(c, 0) + 1

    need_cnt = len(need_map)  # number of distinct characters to satisfy
    window_map = {}  # counts of the required characters inside the window
    match_cnt = 0  # number of distinct characters whose requirement is met

    for c in s[left:right]:
        if c not in need_map:
            continue
        window_map[c] = window_map.get(c, 0) + 1
        if window_map[c] == need_map[c]:  # the count has just reached the requirement
            match_cnt += 1
    return match_cnt == need_cnt


if __name__ == '__main__':
    s = "ADOBECODEBANC"
    t = "ABC"
    res = minimum_window_substring(s, t)

    # The shortest window in the result set is the minimum substring
    min_len = len(s) + 1  # longer than any window, so a window spanning all of s can still win
    answer = ""
    for left, right in res:
        if right - left < min_len:
            min_len = right - left
            answer = s[left:right]

    print(answer)  # stays "" when no window covers t

Running it prints BANC, the expected output of Example 1.
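The problem asks for the substring itself, so the selection step from the __main__ block can be folded into one function and checked against all three examples. This is a sketch under assumptions: the names covers and min_cover are invented here, and collections.Counter stands in for the hand-built need_map.

```python
from collections import Counter


def covers(window: str, need: Counter) -> bool:
    """True when window holds at least need[c] copies of every required c."""
    have = Counter(window)
    return all(have[c] >= n for c, n in need.items())


def min_cover(s: str, t: str) -> str:
    """Return the shortest substring of s covering t, or "" if there is none."""
    if not t:
        return ""  # assumption: an empty t is answered with the empty string
    need = Counter(t)
    best = ""
    left = 0
    for right in range(1, len(s) + 1):
        # Shrink while the window still covers t, keeping the shortest seen
        while covers(s[left:right], need):
            if not best or right - left < len(best):
                best = s[left:right]
            left += 1
    return best


print(min_cover("ADOBECODEBANC", "ABC"))  # BANC
print(min_cover("a", "a"))                # a
print(repr(min_cover("a", "aa")))         # ''
```

Like the version above, it recounts the window on every check, so it favours clarity over speed.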

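Permutation in string

Problem

permutation-in-string

Given two strings s1 and s2, write a function that determines whether s2 contains a permutation of s1.
In other words, one of the permutations of s1 is a substring of s2.

Example 1:
Input: s1 = "ab" s2 = "eidbaooo"
Output: true
Explanation: s2 contains one permutation of s1 ("ba").

Example 2:
Input: s1 = "ab" s2 = "eidboaoo"
Output: false

Implementation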

# Permutation in string, built on the min_window framework
# (s is the string being searched, t is the string whose permutation we look for)
def permutation_in_string(s, t):
    left = 0
    right = 0
    end = len(s)
    res = []
    while right < end:
        right += 1
        while is_need_shrink(s, left, right, t):
            if is_match(s, left, right, t):
                res.append([left, right])
            left += 1
    return res


# The window needs to shrink once it is at least as long as t
def is_need_shrink(s, left, right, t):
    return right - left >= len(t)


# The window matches when need_map and window_map hold the same count for every character
def is_match(s, left, right, t):
    need_map = {}  # how many of each character t requires
    for c in t:
        need_map[c] = need_map.get(c, 0) + 1

    need_cnt = len(need_map)  # number of distinct characters to satisfy

    window_map = {}  # counts of the characters inside the window
    match_cnt = 0  # number of distinct characters whose requirement is met

    for c in s[left:right]:
        if c not in need_map:
            return False
        window_map[c] = window_map.get(c, 0) + 1
        if window_map[c] == need_map[c]:  # the counts are equal, so character c is matched
            match_cnt += 1
    return match_cnt == need_cnt


if __name__ == '__main__':
    s = "eidbaooo"
    t = "ab"
    res = permutation_in_string(s, t)
    for v in res:
        print(f'{v[0]}~{v[1]} {s[v[0]:v[1]]}')

Running it prints 3~5 ba: the window s[3:5] is "ba", a permutation of "ab".
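The problem itself only asks for true or false, so a variant can stop at the first matching window. The sketch below is an assumption-laden illustration rather than the author's code: check_inclusion is a hypothetical name, and it relies on the fact that the shrink rule right - left >= len(t) amounts to a window of fixed length len(t).

```python
from collections import Counter


def check_inclusion(s1: str, s2: str) -> bool:
    """Return True if some substring of s2 is a permutation of s1."""
    n = len(s1)
    if n == 0:
        return True  # assumption: the empty string is contained in any string
    need = Counter(s1)
    # Slide a window of exactly n characters across s2
    for left in range(len(s2) - n + 1):
        if Counter(s2[left:left + n]) == need:
            return True  # one solution is enough, so return immediately
    return False


print(check_inclusion("ab", "eidbaooo"))  # True
print(check_inclusion("ab", "eidboaoo"))  # False
```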

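Find all anagrams in a string

Problem

find-all-anagrams-in-a-string

Given two strings s and p, find all substrings of s that are anagrams of p and return their start indices. The order of the output does not matter.
An anagram is a string made of the same letters, possibly in a different order.

Example 1:

Input: s = "cbaebabacd", p = "abc"
Output: [0,6]
Explanation:
The substring starting at index 0 is "cba", which is an anagram of "abc".
The substring starting at index 6 is "bac", which is an anagram of "abc".

Example 2:

Input: s = "abab", p = "ab"
Output: [0,1,2]
Explanation:
The substring starting at index 0 is "ab", which is an anagram of "ab".
The substring starting at index 1 is "ba", which is an anagram of "ab".
The substring starting at index 2 is "ab", which is an anagram of "ab".

This follows exactly the same pattern as the permutation-in-string problem above; the difference is that the one above needs only one solution, while this one needs all of them.

Implementation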

# Find all anagrams, built on the min_window framework
def find_all_anagrams_in_a_string(s, t):
    left = 0
    right = 0
    end = len(s)
    res = []
    while right < end:
        right += 1
        while is_need_shrink(s, left, right, t):
            if is_match(s, left, right, t):
                res.append([left, right])
            left += 1
    return res


# The window needs to shrink once it is at least as long as t
def is_need_shrink(s, left, right, t):
    return right - left >= len(t)


# The window matches when need_map and window_map hold the same count for every character
def is_match(s, left, right, t):
    need_map = {}  # how many of each character t requires
    for c in t:
        need_map[c] = need_map.get(c, 0) + 1

    need_cnt = len(need_map)  # number of distinct characters to satisfy

    window_map = {}  # counts of the characters inside the window
    match_cnt = 0  # number of distinct characters whose requirement is met

    for c in s[left:right]:
        if c not in need_map:
            return False
        window_map[c] = window_map.get(c, 0) + 1
        if window_map[c] == need_map[c]:  # the counts are equal, so character c is matched
            match_cnt += 1
    return match_cnt == need_cnt


if __name__ == '__main__':
    s = "cbaebabacd"
    t = "abc"
    res = find_all_anagrams_in_a_string(s, t)
    for v in res:
        print(f'{v[0]}~{v[1]} {s[v[0]:v[1]]}')

Running it prints 0~3 cba and 6~9 bac, i.e. the start indices 0 and 6 from Example 1.
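To return the start indices the problem asks for, and to avoid recounting the window at every step, the character counts can be updated as the window slides. This is a sketch under assumptions: find_anagram_starts is a hypothetical name, and collections.Counter replaces the hand-built dictionaries.

```python
from collections import Counter
from typing import List


def find_anagram_starts(s: str, p: str) -> List[int]:
    """Return the start index of every substring of s that is an anagram of p."""
    n = len(p)
    if n == 0 or n > len(s):
        return []
    need = Counter(p)
    window = Counter(s[:n])  # counts for the first window s[0:n]
    starts = [0] if window == need else []
    for right in range(n, len(s)):
        window[s[right]] += 1  # character entering on the right
        out = s[right - n]     # character leaving on the left
        window[out] -= 1
        if window[out] == 0:
            del window[out]    # drop zero counts so the comparison stays exact
        if window == need:
            starts.append(right - n + 1)
    return starts


print(find_anagram_starts("cbaebabacd", "abc"))  # [0, 6]
print(find_anagram_starts("abab", "ab"))         # [0, 1, 2]
```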

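Longest substring without repeating characters

Problem

longest-substring-without-repeating-characters

Given a string s, find the length of the longest substring that contains no repeated characters.

Example 1:
Input: s = "abcabcbb"
Output: 3
Explanation: the longest substring without repeated characters is "abc", so the length is 3.

Example 2:
Input: s = "bbbbb"
Output: 1
Explanation: the longest substring without repeated characters is "b", so the length is 1.

Example 3:
Input: s = "pwwkew"
Output: 3
Explanation: the longest substring without repeated characters is "wke", so the length is 3.
     Note that the answer must be the length of a substring; "pwke" is a subsequence, not a substring.

Example 4:
Input: s = ""
Output: 0

The name says longest, so what we need here is the max-window framework, i.e. the result set is updated after the window has finished shrinking.

Implementation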

# Longest substring without repeating characters, built on the max_window framework
def longest_substring_without_repeating_characters(s):
    res = []
    left = 0
    right = 0
    end = len(s)
    while right < end:
        right += 1

        while is_need_shrink(s, left, right):
            left += 1

        if is_match(s, left, right):
            res.append([left, right])

    return res


# The window needs to shrink while it contains a repeated character: the exact opposite of is_match
def is_need_shrink(s, left, right):
    return not is_match(s, left, right)


# The window matches when it contains no repeated character
def is_match(s, left, right):
    substr = s[left:right]
    # Count each character in the window
    window_map = {}
    for c in substr:
        window_map[c] = window_map.get(c, 0) + 1
        # A count above 1 means a repeat
        if window_map[c] > 1:
            return False
    return True


if __name__ == '__main__':
    s = "abcabcbb"
    res = longest_substring_without_repeating_characters(s)

    # Pick the longest window in the result set
    max_len = 0
    answer = ""
    for v in res:
        left, right = v[0], v[1]
        if right - left > max_len:
            max_len = right - left
            answer = s[left:right]
    print(answer)

Running it prints abc, whose length 3 is the expected output of Example 1.
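The problem asks for a length rather than the substring, and the repeat check does not have to rescan the window each time. A minimal sketch, assuming a set of the characters currently in the window (length_of_longest_substring is a name chosen here, not taken from the post):

```python
def length_of_longest_substring(s: str) -> int:
    """Length of the longest substring of s with no repeated character."""
    seen = set()  # characters currently inside the window s[left:right + 1]
    left = 0
    best = 0
    for right, ch in enumerate(s):
        # Shrink from the left until ch is no longer a duplicate
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        best = max(best, right - left + 1)  # update after shrinking
    return best


for example in ["abcabcbb", "bbbbb", "pwwkew", ""]:
    print(repr(example), length_of_longest_substring(example))
```

The structure is still the max-window pattern: grow on the right, shrink while the window is invalid, then record.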

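Optimization

Now that it works, we can optimize.

For example, is_match and is_need_shrink may be identical, in which case one of them is enough.
For example, the need_map dictionary is rebuilt on every loop iteration; to avoid the repeated work, build it once outside the function.
For example, sometimes we don't need every solution and can return directly as soon as is_match succeeds.

Once the code works, optimizations like these are comparatively easy. The pattern is what matters most. That's all, giao~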
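As one illustration of the first two optimizations applied to the minimum window substring problem, the sketch below builds the requirement counts once and maintains the window counts incrementally, so a single counter serves as both the shrink test and the match test. The function name min_window_fast and the satisfied counter are assumptions made for this example.

```python
from collections import Counter, defaultdict


def min_window_fast(s: str, t: str) -> str:
    """Shortest substring of s covering t, with counts maintained incrementally."""
    if not t:
        return ""  # assumption: an empty t is answered with the empty string
    need = Counter(t)          # built once, outside the loop
    window = defaultdict(int)  # counts of required characters in s[left:right]
    satisfied = 0              # distinct characters whose requirement is met
    best = None                # (left, right) of the shortest window so far
    left = 0
    for right, ch in enumerate(s, start=1):
        if ch in need:
            window[ch] += 1
            if window[ch] == need[ch]:
                satisfied += 1
        # One condition doubles as is_need_shrink and is_match
        while satisfied == len(need):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            out = s[left]
            if out in need:
                if window[out] == need[out]:
                    satisfied -= 1  # dropping out breaks this requirement
                window[out] -= 1
            left += 1
    return "" if best is None else s[best[0]:best[1]]


print(min_window_fast("ADOBECODEBANC", "ABC"))  # BANC
```

Each character enters and leaves the window at most once, so the work is linear in the length of s instead of recounting the window at every step.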
