Remember the Word
Neal is very curious about combinatorial problems, and now here comes a problem about words. Knowing that Ray has a photographic memory and this may not trouble him, Neal gives it to Jiejie.
Since Jiejie can't remember numbers clearly, he uses sticks to help himself. Because Jiejie has only 20071027 sticks, he can only record the remainders of the numbers divided by the total number of sticks.
The problem is as follows: a word needs to be divided into small pieces in such a way that each piece is from some given set of words. Given a word and the set of words, Jiejie should calculate the number of ways the given word can be divided, using the words in the set.
Input
The input file contains multiple test cases. For each test case: the first line contains the given word whose length is no more than 300 000.
The second line contains an integer S, 1 ≤ S ≤ 4000.
Each of the following S lines contains one word from the set. Each word will be at most 100 characters long. There will be no two identical words and all letters in the words will be lowercase.
There is a blank line between consecutive test cases.
You should proceed to the end of file.
Output
For each test case, output the number, as described above, from the task description modulo 20071027.
Sample Input
abcd
4
a
b
cd
ab
Sample Output
Case 1: 2
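Problem summary: memorizing words. Given a dictionary of S distinct words and one long string, split the string into a concatenation of dictionary words (a word may be used more than once). How many ways are there to do this? For example, with the 4 words a, b, cd, ab, the string abcd has two decompositions: a+b+cd and ab+cd.
Analysis: a natural recurrence is the following. Let d(i) be the number of decompositions of the substring starting at character i (that is, the suffix S[i..L]). Then d(i) = sum{ d(i+len(x)) | word x is a prefix of S[i..L] }. The empty suffix counts as one decomposition, which is the base case.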
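The recurrence can be checked on small inputs by applying it directly, before any Trie is involved. The following is only a minimal sketch: the function name countSplitsNaive and its signature are made up for illustration and do not appear in the original solution, and it assumes 0-based indexing with d[L] = 1 for the empty suffix.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solution): counts the
// decompositions of `text` into words from `dict` by applying the recurrence
// d(i) = sum of d(i + len(x)) over all words x that are a prefix of text[i..L).
// It tries every word at every position, so it is only meant for small inputs.
int countSplitsNaive(const std::string& text, const std::vector<std::string>& dict) {
    const int MOD = 20071027;
    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;  // base case: the empty suffix has exactly one decomposition
    for (int i = L - 1; i >= 0; --i) {
        for (const std::string& x : dict) {
            const int n = static_cast<int>(x.size());
            // x is a prefix of text[i..L) if it fits and the characters match
            if (n <= L - i && text.compare(i, n, x) == 0) {
                d[i] = (d[i] + d[i + n]) % MOD;
            }
        }
    }
    return d[0];
}
```

Calling countSplitsNaive("abcd", {"a", "b", "cd", "ab"}) returns 2, matching the sample. Every position tries every word, which is too much work at the full input limits.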
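Enumerating every x first and then testing whether it is a prefix of S[i..L] is too slow (there can be up to 4000 words, and each test takes some time as well). A different approach works: first organize all the words into a Trie, then try to "look up" S[i..L] in the Trie. Each word node passed during the lookup yields one x from the recurrence above, so at most 100 character comparisons are needed to find all such x.
The code is as follows: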
#include<cstdio>
#include<cstring>
#include<vector>
using namespace std;

const int maxnode = 4000 * 100 + 10;
const int sigma_size = 26;

// Trie over the alphabet of all lowercase letters
struct Trie {
  int ch[maxnode][sigma_size];
  int val[maxnode];
  int sz; // total number of nodes
  void clear() { sz = 1; memset(ch[0], 0, sizeof(ch[0])); } // initially there is only the root node
  int idx(char c) { return c - 'a'; } // index of character c

  // Insert string s with attached value v. Note that v must be nonzero,
  // because 0 means "this node is not a word node"
  void insert(const char *s, int v) {
    int u = 0, n = strlen(s);
    for(int i = 0; i < n; i++) {
      int c = idx(s[i]);
      if(!ch[u][c]) { // node does not exist
        memset(ch[sz], 0, sizeof(ch[sz]));
        val[sz] = 0;  // intermediate nodes carry the value 0
        ch[u][c] = sz++; // create the new node
      }
      u = ch[u][c]; // move down
    }
    val[u] = v; // the node for the last character of the string carries v
  }

  // Find the words that are prefixes of s with length at most len
  void find_prefixes(const char *s, int len, vector<int>& ans) {
    int u = 0;
    for(int i = 0; i < len; i++) {
      if(s[i] == '\0') break;
      int c = idx(s[i]);
      if(!ch[u][c]) break;
      u = ch[u][c];
      if(val[u] != 0) ans.push_back(val[u]); // found a prefix
    }
  }
};

const int maxl = 300000 + 10; // maximum length of the text
const int maxw = 4000 + 10;   // maximum number of words
const int maxwl = 100 + 10;   // maximum length of each word
const int MOD = 20071027;

int d[maxl], len[maxw], S;
char text[maxl], word[maxwl];
Trie trie;

int main() {
  int kase = 1;
  while(scanf("%s%d", text, &S) == 2) {
    trie.clear();
    for(int i = 1; i <= S; i++) {
      scanf("%s", word);
      len[i] = strlen(word);
      trie.insert(word, i); // word ids start at 1, so the stored value is never 0
    }
    memset(d, 0, sizeof(d));
    int L = strlen(text);
    d[L] = 1; // the empty suffix has exactly one decomposition
    for(int i = L-1; i >= 0; i--) {
      vector<int> p; // ids of the words that are prefixes of text[i..L)
      trie.find_prefixes(text+i, L-i, p);
      for(int j = 0; j < (int)p.size(); j++)
        d[i] = (d[i] + d[i+len[p[j]]]) % MOD;
    }
    printf("Case %d: %d\n", kase++, d[0]);
  }
  return 0;
}
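As a possible variation on the code above, d[i] can be accumulated during the Trie walk itself, so no temporary vector of matches is needed and the word lengths do not have to be stored. This is a sketch under assumptions rather than the author's code: the function name countSplitsTrie, the vector-backed node layout, and the isWord flag are all invented here for illustration.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical variant (not from the original post): builds a trie of the
// dictionary, then fills d from right to left, adding d[j + 1] whenever the
// walk from position i reaches a word node after consuming text[i..j].
int countSplitsTrie(const std::string& text, const std::vector<std::string>& dict) {
    const int MOD = 20071027;
    // next[u][c] is the child of node u on letter c, or 0 if there is none
    std::vector<std::array<int, 26>> next(1, std::array<int, 26>{});
    std::vector<char> isWord(1, 0);  // isWord[u] != 0 marks the end of a word
    for (const std::string& w : dict) {
        int u = 0;
        for (char ch : w) {
            const int c = ch - 'a';  // assumes lowercase letters only
            if (!next[u][c]) {
                const int id = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});  // zero-filled node
                isWord.push_back(0);
                next[u][c] = id;
            }
            u = next[u][c];
        }
        isWord[u] = 1;
    }
    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;  // base case: the empty suffix
    for (int i = L - 1; i >= 0; --i) {
        int u = 0;
        for (int j = i; j < L; ++j) {
            u = next[u][text[j] - 'a'];
            if (!u) break;  // no dictionary word continues along this path
            if (isWord[u]) d[i] = (d[i] + d[j + 1]) % MOD;
        }
    }
    return d[0];
}
```

The inner loop stops as soon as the walk falls off the Trie, so with words of at most 100 characters each position costs at most 100 steps, the same bound as in the analysis.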