[LeetCode] Substring with Concatenation of All Words

Posted by Grandyang on 2015-05-22

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]

You should return the indices: [0,9].
(order does not matter).
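To make the definition concrete, here is a minimal sketch of a check for a single start index. The helper name isConcatenationAt is hypothetical and not part of the original post; it assumes all words have the same length, as the problem guarantees. It cuts the window into word-sized chunks and compares the sorted chunks with the sorted words.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns true when the
// window of s starting at `start` is some ordering of all the words, each
// used exactly as many times as it appears in `words`.
bool isConcatenationAt(const std::string& s, std::size_t start,
                       std::vector<std::string> words) {
    if (words.empty() || words[0].empty()) return false;
    const std::size_t len = words[0].size();
    const std::size_t total = len * words.size();
    // The whole window must fit inside s.
    if (start > s.size() || s.size() - start < total) return false;

    // Cut the window into chunks of one word length each.
    std::vector<std::string> chunks;
    for (std::size_t k = 0; k < words.size(); ++k) {
        chunks.push_back(s.substr(start + k * len, len));
    }
    // Two multisets are equal exactly when their sorted sequences are equal.
    std::sort(chunks.begin(), chunks.end());
    std::sort(words.begin(), words.end());
    return chunks == words;
}
```

For s = "barfoothefoobarman" and words = ["foo", "bar"], this check succeeds at start 0 ("barfoo") and start 9 ("foobar") and fails at start 3 ("foothe"). Sorting at every index is slower than the hash-map solutions, so this is only useful as a reference for testing.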

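This problem asks for every substring that is a concatenation of all the given words: given a long string and several words of equal length, find the starting index of each substring formed by joining all the words in any order. It is a fairly hard problem. We use two hash maps. The first, m1, stores the count of every word. We then scan the start index i from the beginning of s, one character at a time, and stop once fewer characters remain than the total length of all the words. For each i we create a second hash map, m2, and repeatedly take the next substring of word length. If it is not in m1, we break. If it is, we increment its count in m2; a word may not appear more often than it does in words, so if its count in m2 exceeds its count in m1, we also break. If all the words are matched exactly, we add i to the result. See the code below:

Solution 1: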

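#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> res;
        if (s.empty() || words.empty()) return res;
        // n = number of words, m = length of each word
        int n = words.size(), m = words[0].size();
        // m1: required count of each word
        unordered_map<string, int> m1;
        for (auto &a : words) ++m1[a];
        // Try every start index whose window of n * m characters fits in s
        for (int i = 0; i <= (int)s.size() - n * m; ++i) {
            // m2: count of each word seen in the current window
            unordered_map<string, int> m2;
            int j = 0;
            for (j = 0; j < n; ++j) {
                string t = s.substr(i + j * m, m);
                if (m1.find(t) == m1.end()) break;  // not a listed word
                ++m2[t];
                if (m2[t] > m1[t]) break;           // word used too often
            }
            // All n chunks matched without breaking
            if (j == n) res.push_back(i);
        }
        return res;
    }
};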

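There is also a solution that runs in O(n) time, counting each word-level step as one operation. The design is very clever but hard to come up with; the author admits not being at that level yet. Instead of advancing one character at a time, this method advances one word at a time. Take the example from the problem: the string s has length n = 18, words contains two words (cnt = 2), and each word has length len = 3. The first pass visits the start positions 0, 3, 6, 9, 12, 15; shifting by one character, the second pass visits 1, 4, 7, 10, 13; shifting by one more, the third pass visits 2, 5, 8, 11, 14. Together these passes cover every case.

As before, a hash map m1 records the count of every word in words. In each pass, left marks the left boundary of the current window and count is the number of words matched so far. We read one word at a time. If the current word t exists in m1, we increment its count in a second hash map m2. If its count in m2 is less than or equal to its count in m1, we increment count. If it is greater, some handling is needed. Consider s = barfoofoo, words = {bar, foo, abc}, where abc is added only so that the scan does not stop after matching barfoo. When we reach the second foo, m2[foo] = 2 while m1[foo] = 1, so the window is no longer a valid run and the left boundary must move. We take the leftmost word t1 = bar and decrement m2[t1]; if m2[t1] < m1[t1] afterwards, a match has been lost, so count is decremented as well. Then left advances by len. This repeats until m2[t] no longer exceeds m1[t].

If at some point count equals cnt, we have found a match, so the current left is added to the result res. We then drop the leftmost word, decrement count, move left right by len, and keep matching. If we read a word that is not in m1, the run is broken: we clear m2, reset count to 0, and move left to j + len. See the code below:

Solution 2: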

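#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        if (s.empty() || words.empty()) return {};
        vector<int> res;
        int n = s.size(), cnt = words.size(), len = words[0].size();
        // m1: required count of each word
        unordered_map<string, int> m1;
        for (const string &w : words) ++m1[w];
        // One pass per offset i, each stepping through s one word at a time
        for (int i = 0; i < len; ++i) {
            // left: start of the current window; count: words matched in it
            int left = i, count = 0;
            unordered_map<string, int> m2;
            for (int j = i; j <= n - len; j += len) {
                string t = s.substr(j, len);
                if (m1.count(t)) {
                    ++m2[t];
                    if (m2[t] <= m1[t]) {
                        ++count;
                    } else {
                        // t appears too often: drop words from the left
                        // until its count is valid again
                        while (m2[t] > m1[t]) {
                            string t1 = s.substr(left, len);
                            --m2[t1];
                            if (m2[t1] < m1[t1]) --count;
                            left += len;
                        }
                    }
                    if (count == cnt) {
                        // Full match: record it, then slide past the leftmost word
                        res.push_back(left);
                        --m2[s.substr(left, len)];
                        --count;
                        left += len;
                    }
                } else {
                    // Unknown word breaks the run: restart after it
                    m2.clear();
                    count = 0;
                    left = j + len;
                }
            }
        }
        return res;
    }
};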
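
The word-by-word scan order used by Solution 2 can be sketched on its own. The helper below is hypothetical (scanStarts is not part of the original post); it only mirrors the two loop headers of Solution 2 and lists, for each offset i, the start positions j that the inner loop visits.

```cpp
#include <vector>

// Hypothetical helper: for each offset i in [0, len), collect the word start
// positions j = i, i + len, i + 2 * len, ... that satisfy j <= n - len.
// n is the length of s and len is the length of one word.
std::vector<std::vector<int>> scanStarts(int n, int len) {
    std::vector<std::vector<int>> passes;
    if (len <= 0) return passes;  // no word length, nothing to scan
    for (int i = 0; i < len; ++i) {
        std::vector<int> starts;
        for (int j = i; j <= n - len; j += len) {
            starts.push_back(j);
        }
        passes.push_back(starts);
    }
    return passes;
}
```

With n = 18 and len = 3 this gives three passes: {0, 3, 6, 9, 12, 15}, {1, 4, 7, 10, 13} and {2, 5, 8, 11, 14}. Every start position from 0 to 15 appears in exactly one pass, which is why the len passes together read about n words in total.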

References:

http://yucoding.blogspot.com/2013/09/leetcode-question-106-substring-with.html

https://discuss.leetcode.com/topic/6617/an-o-n-solution-with-detailed-explanation/2

http://blog.unieagle.net/2012/10/28/leetcode%E9%A2%98%E7%9B%AE%EF%BC%9Asubstring-with-concatenation-of-all-words/

