HDU 4668 Finding string (string parsing + KMP)

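Posted by acm_cxlove on 2013-08-29

Tags: strings, KMP

Please credit the source when reposting, thanks: http://blog.csdn.net/ACM_cxlove?viewmode=contents by---cxlove

Problem: given a compressed string and a pattern string, count how many times the pattern occurs in the expanded string.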

http://acm.hdu.edu.cn/showproblem.php?pid=4668

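This kind of compression appeared at last year's Jinhua invitational contest, but the limits in that problem were small.

There it was enough to expand the string and run multi-pattern matching with a brute-force Aho-Corasick automaton.

In this problem the compressed string itself is short, but it can become enormous once expanded.

Notice that the compressed string splits the text into a sequence of intervals, so the count can be done in two steps: matches that lie inside a single interval, and matches that cross interval boundaries.

Preprocessing parses the compressed string into intervals. To make the later matching easier, let p denote the length of the pattern.

While parsing, if two adjacent intervals both have an expanded length less than p, I merge them into one plain interval; the reason is explained later.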
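The parsing and merging rule above can be sketched roughly as follows. This is only an illustration, not the code from this post: the names Segment, pushSegment and parseCompressed are made up here, brackets are assumed not to nest, and a repeat count of zero is simply dropped.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// One interval of the compressed text: `unit` repeated `count` times.
// (Hypothetical type for this sketch.)
struct Segment {
    std::string unit;
    long long count;
    long long length() const { return (long long)unit.size() * count; }
};

// Fully expand an interval; only called on intervals shorter than the pattern.
static std::string expandShort(const Segment& seg) {
    std::string out;
    for (long long k = 0; k < seg.count; ++k) out += seg.unit;
    return out;
}

// Append an interval, merging it into the previous one when both are shorter than p.
static void pushSegment(std::vector<Segment>& segs, const std::string& unit,
                        long long count, long long p) {
    if (unit.empty() || count == 0) return;
    Segment cur{unit, count};
    if (!segs.empty() && cur.length() < p && segs.back().length() < p) {
        segs.back() = Segment{expandShort(segs.back()) + expandShort(cur), 1};
    } else {
        segs.push_back(cur);
    }
}

// Parse text such as "ab[cd]3e" into intervals; p is the pattern length.
std::vector<Segment> parseCompressed(const std::string& text, long long p) {
    assert(p >= 1);
    std::vector<Segment> segs;
    std::string buf;
    size_t i = 0;
    while (i < text.size()) {
        char c = text[i];
        if (c == '[') {                 // plain run ends, repeated unit begins
            pushSegment(segs, buf, 1, p);
            buf.clear();
            ++i;
        } else if (c == ']') {          // repeated unit ends, read its count
            ++i;
            long long count = 0;
            while (i < text.size() && std::isdigit((unsigned char)text[i]))
                count = count * 10 + (text[i++] - '0');
            pushSegment(segs, buf, count, p);
            buf.clear();
        } else {
            buf += c;
            ++i;
        }
    }
    pushSegment(segs, buf, 1, p);       // trailing plain run
    return segs;
}
```

With p = 10, parsing "ab[c]2d" under these assumptions collapses everything into one plain interval "abccd", while "[ab]1000000cd[ab]10000000" with p = 5 keeps three intervals because the short "cd" has long neighbours on both sides.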

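For each interval, count the matches inside it.

1. If the interval is not compressed, run KMP on it directly.

2. If the interval is compressed, expanding it and running KMP by brute force is out of the question. It is enough to count the matches whose starting position lies in the first period: append p - 1 characters after the first period, which guarantees that every match found in this window starts in the first period, and then scale the count.

The tricky part: to look for ab in [ab]10 we search for ab in aba and find 1 occurrence. What should the final count be? One might argue that aba spans two periods, so there are 9 copies of aba and the answer is 9, but the answer is clearly 10. Searching for ba in [ab]10 also uses aba and also finds 1 occurrence, yet here the answer really is 9. The two cases look alike but give different totals, because the first match fits inside a single period, while the second match uses every period the window covers.

So after appending p - 1 characters we still run KMP and first count as if every match used all the periods the window covers, then add the matches that start in the last periods, where a full window no longer fits. For ab in [ab]10: ab occurs once in aba, and [ab]10 contains 9 copies of aba, giving 9; then count the occurrences of ab in ab, which is 1, so the total is 10.
Some of this wording is loose, so it is best to simulate a longer pattern by hand, for example searching for abab in [ab]10 and in [ba]10.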
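A minimal sketch of this counting rule, for checking the examples above by hand. The function names are hypothetical, and std::string::find stands in for KMP to keep the sketch short; the arithmetic mirrors the description: count the window "first period plus p - 1 characters", multiply by the number of start periods that still have a full window, then add the matches inside the remaining trailing periods.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Count occurrences of pat in text, overlaps allowed.
// std::string::find is used here in place of KMP.
long long countOccurrences(const std::string& text, const std::string& pat) {
    long long hits = 0;
    for (size_t pos = text.find(pat); pos != std::string::npos;
         pos = text.find(pat, pos + 1))
        ++hits;
    return hits;
}

// Occurrences of pat inside `unit` repeated `count` times, without full expansion.
long long countInRepeated(const std::string& unit, long long count,
                          const std::string& pat) {
    assert(!unit.empty() && !pat.empty() && count >= 1);
    long long len = unit.size(), p = pat.size();
    // Copies needed after the first one to supply p - 1 more characters.
    long long extra = (p - 1 + len - 1) / len;
    long long use = std::min(count, 1 + extra);
    std::string tail;                       // the use - 1 copies after the first
    for (long long k = 1; k < use; ++k) tail += unit;
    // Window: first period followed by at most p - 1 characters.
    std::string window =
        unit + tail.substr(0, std::min<long long>(tail.size(), p - 1));
    long long perStart = countOccurrences(window, pat);
    // count - use + 1 start periods have a full window; the last use - 1
    // periods are counted on their own.
    return perStart * (count - use + 1) + countOccurrences(tail, pat);
}
```

Under these assumptions, countInRepeated("ab", 10, "ab") gives 10 and countInRepeated("ab", 10, "ba") gives 9, matching the two cases discussed above; the longer pattern abab gives 9 in [ab]10 and 8 in [ba]10.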

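That settles the first problem.

The second problem is matches that cross intervals, i.e. the pattern occurs across two or more intervals.

To make sure a match reaches into the next interval, take a suffix of length p - 1 from the current interval, so that any match must extend into the next interval; from the next interval take a prefix of length p - 1, so that any match must start in the current interval.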
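The boundary window can be built without expanding either interval. The sketch below is an illustration with hypothetical helper names (repeatedSuffix, repeatedPrefix, boundaryWindow); it assumes each interval is given as a unit string plus a repeat count. Because the window has at most 2(p - 1) characters, any occurrence of a length-p pattern in it necessarily crosses the boundary.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Last n characters of `unit` repeated `count` times.
std::string repeatedSuffix(const std::string& unit, long long count, long long n) {
    assert(!unit.empty() && n >= 0 && n <= (long long)unit.size() * count);
    std::string buf;
    while ((long long)buf.size() < n) buf += unit;   // only as many copies as needed
    return buf.substr(buf.size() - n);
}

// First n characters of `unit` repeated `count` times.
std::string repeatedPrefix(const std::string& unit, long long count, long long n) {
    assert(!unit.empty() && n >= 0 && n <= (long long)unit.size() * count);
    std::string buf;
    while ((long long)buf.size() < n) buf += unit;
    return buf.substr(0, n);
}

// Suffix of interval A (at most p - 1 chars) followed by prefix of interval B
// (at most p - 1 chars), where p is the pattern length.
std::string boundaryWindow(const std::string& unitA, long long countA,
                           const std::string& unitB, long long countB,
                           long long p) {
    long long lenA = (long long)unitA.size() * countA;
    long long lenB = (long long)unitB.size() * countB;
    return repeatedSuffix(unitA, countA, std::min(lenA, p - 1)) +
           repeatedPrefix(unitB, countB, std::min(lenB, p - 1));
}
```

For instance, with p = 4 the boundary between [ab]1000000 and [cd]1000000 yields the six-character window "babcdc", which is all that has to be searched for cross-boundary matches there.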

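During parsing I already merge adjacent intervals when both are shorter than p.

But there is still the case [ab]1000000cd[ab]10000000, where the middle interval remains shorter than p.

So a match may span three intervals, which needs special handling: take at most p - 1 characters from the first interval, all of the second interval, and from the third interval only as many characters as keep its part plus the second interval's length within p - 1.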
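As a small worked sketch of the three-interval rule (the function name and parameters are hypothetical): given the last p - 1 characters of the first interval, the whole short middle interval, and the first p - 1 characters of the third interval, the window keeps only p - 1 - |middle| characters of the third part. The window is then 2(p - 1) characters long at most, so every match found in it starts inside the first interval and is not counted again at the next boundary.

```cpp
#include <cassert>
#include <string>

// leftTail : last (up to) p - 1 characters of the first interval
// middle   : the entire second interval, which must be shorter than p - 1
// rightHead: first (up to) p - 1 characters of the third interval
std::string threeSegmentWindow(const std::string& leftTail,
                               const std::string& middle,
                               const std::string& rightHead, size_t p) {
    assert(middle.size() + 1 < p);
    size_t room = p - 1 - middle.size();   // characters still allowed from the third interval
    return leftTail + middle + rightHead.substr(0, room);
}
```

For [ab]1000000cd[ab]10000000 and a pattern of length 4 such as bcda, the inputs are "bab", "cd" and "aba", and the window is "babcda", which contains bcda once.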

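All in all it is this fiddly... most likely my implementation is more complicated than it needs to be.

The one saving grace is that the limits are small, so I hacked it all together with std::string.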

#include <iostream>
#include <cstdio>
#include <cstring>
#include <map>
#include <vector>
#include <string>
#include <queue>
#include <cmath>
#include <cctype>
#include <algorithm>
#define lson step << 1
#define rson step << 1 | 1
#pragma comment(linker,"/STACK:102400000,102400000")
using namespace std;
typedef long long LL;
const int N = 5005;
struct Node {
    string s;
    int cnt;
    Node () {}
    Node (string _s , int c) :s(_s) , cnt(c) {}
    string cat () {
        string t = s;
        for (int i = 1 ; i < cnt ; i ++)
            s = s + t;
        cnt = 1;
        return s;
    }
    LL len () {
        return (LL)s.size() * cnt;
    }
    // First / last l characters of the expanded interval, built from only as many copies as needed
    string prefix (int l) {
        string str = s;
        for (int i = 1 ; i < cnt && str.size() < l ; i ++) {
            str += s;
        }
        return str.substr (0 , l);
    }
    string suffix (int l) {
        string str = s;
        for (int i = 1 ; i < cnt && str.size() < l ; i ++) {
            str += s;
        }
        return str.substr (str.size() - l , l);
    }
}a[N];
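char str[N] , pat[N];
// nxt is the KMP failure table of pat (a global named `next` clashes with std::next on newer compilers)
int nxt[N] , idx , l , p;
// Build the failure table for s[0..l); nxt[l] is filled as well, for restarting after a full match
void get_next (char *s , int l) {
    nxt[0] = -1;
    int i = 0 , j = -1;
    while (i < l) {
        if (j == -1 || s[i] == s[j]) {
            i ++; j ++;
            nxt[i] = j;
        }
        else j = nxt[j];
    }
}
// Append the interval "s repeated tot times"; if it and the previous interval are both
// shorter than p, expand both and merge them into one plain interval
void gao (string s , int tot) {
    if (s == "") return ;
    if (idx == 0 || (LL)s.size() * tot >= p || a[idx - 1].len() >= p) {
        a[idx ++] = Node (s , tot);
    }
    else {
        a[idx - 1].cat ();
        a[idx - 1].s += Node (s , tot).cat();
    }
}
// Count the occurrences (overlaps allowed) of t[0..p) in s with KMP
int match (string s , char *t , int p) {
    int i = 0 , j = 0 , ans = 0;
    while (i < (int)s.size()) {
        if (j == - 1 || s[i] == t[j]) {
            i ++; j ++;
            if (j == p) {
                ans ++;
                j = nxt[j];
            }
        }
        else j = nxt[j];
    }
    return ans;
}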
int main () {
    #ifndef ONLINE_JUDGE
        freopen ("input.txt" , "r" , stdin);
        freopen ("output.txt" , "w" , stdout);
    #endif
    while (scanf ("%s %s" , str , pat) != EOF) {
        idx = 0;
        l = strlen (str);p = strlen (pat);
        get_next (pat , p);
        string s = "";
        int tot = 1;
        for (int i = 0 ; i < l ; i ++) {
            if (str[i] == '[') {
                if (s == "") continue;
                gao (s , tot);
                s = ""; tot = 1;
            }
            else if (str[i] == ']') {
                tot = 0;
                i ++;
                while (isdigit(str[i]))
                    tot = tot * 10 + str[i ++] - '0';
                i --;
                gao (s , tot);
                s = ""; tot = 1;
            }
            else s += str[i];
        }
        gao (s , tot);
        s = ""; tot = 1;
        LL ans = 0;
        // for (int i = 0 ; i < idx ; i ++) {
        //     cout << a[i].s << " " << a[i].cnt << endl; 
        // }
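        // Step 1: matches that lie entirely inside one interval
        for (int i = 0 ; i < idx ; i ++) {
            if (a[i].len() < p) continue;
            if (a[i].cnt == 1) ans += match (a[i].s , pat , p);
            else {
                // use = number of periods covered by "first period + p - 1 characters"
                int use = min(a[i].cnt , 1 + (p - 1 + (int)a[i].s.size() - 1) / (int)a[i].s.size());
                string s = "";
                for (int j = 1 ; j < use ; j ++) {
                    s += a[i].s;
                }
                // window = first period followed by the next p - 1 characters
                s = a[i].s + s.substr (0 , min ((int)s.size() , p - 1));
                int tmp = match (s , pat , p);
                // every start period that still has a full window contributes tmp matches
                ans += (LL)tmp * (a[i].cnt - use + 1);
                // the last use - 1 periods are matched on their own
                if (p) {
                    s = "";
                    for (int j = 1 ; j < use ; j ++)
                        s += a[i].s;
                    ans += match (s , pat , p);
                }
            }
        }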
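        // Step 2: matches that start inside a[i] and cross into the following interval(s)
        for (int i = 0 ; i < idx - 1 ; i ++) {
            s = a[i].suffix (min (a[i].len () , p - 1LL));
            if (a[i + 1].len () < p - 1) {
                // short middle interval: take all of it, then fill up to p - 1 characters from a[i + 2]
                s += a[i + 1].cat ();
                if (i + 2 < idx) {
                    s += a[i + 2].prefix (min (a[i + 2].len () , p - 1 - a[i + 1].len ()));
                }
            }
            else {
                s += a[i + 1].prefix (min (a[i + 1].len () , p - 1LL));
            }
            ans += match (s , pat , p);
        }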
        printf ("%I64d\n" , ans);
    }
    return 0;
}
