POJ 2778 - DNA Sequence (Aho-Corasick automaton + adjacency matrix + fast matrix exponentiation)

Posted by kewlgrl on 2016-08-10
DNA Sequence
Time Limit: 1000MS   Memory Limit: 65536K
Total Submissions: 15118   Accepted: 5826

Description

It's well known that a DNA sequence is a sequence that contains only A, C, T and G, and it's very useful to analyze a segment of a DNA sequence. For example, if an animal's DNA sequence contains the segment ATC, it may mean that the animal has a genetic disease. So far scientists have found several such segments; the problem is how many kinds of DNA sequences of a species don't contain those segments.

Suppose that the DNA sequences of a species are sequences that consist of A, C, T and G, and that the length of the sequences is a given integer n.

Input

The first line contains two integers m (0 <= m <= 10) and n (1 <= n <= 2000000000). Here, m is the number of genetic disease segments, and n is the length of the sequences.

Each of the next m lines contains one DNA genetic disease segment; the length of each segment is not larger than 10.

Output

An integer, the number of DNA sequences, mod 100000.

Sample Input

4 3
AT
AC
AG
AA

Sample Output

36


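Problem summary:

There are M disease segments. Count the DNA sequences of length N over the four characters A, G, C, T that contain none of the disease segments.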

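Approach:

When I first read this problem I had no idea at all, and I could not see what it had to do with matrices. I only got it after searching through a lot of editorials... Orz, I really am a noob...
I recommend this person's write-up! It is especially detailed and clear!

Treat the AC automaton as a directed graph and build its adjacency matrix. matrix[i][j] is then the number of one-step edges from i to j (0 if j cannot be reached from i in one step), and the n-th power of the matrix, matrix^n[i][j], is the number of paths that go from i to j in exactly n steps.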
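The path-counting claim follows from a short induction. The sketch below writes M for the adjacency matrix (called matrix above) and lets k range over all nodes; this notation is an assumption made for illustration.

```latex
(M^{1})_{ij} = M_{ij}, \qquad
(M^{n})_{ij} = \sum_{k} (M^{n-1})_{ik} \, M_{kj} \quad (n \ge 2)
```

Every n-step path from i to j is an (n-1)-step path from i to some node k followed by one edge from k to j, so the sum counts each n-step path exactly once.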

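Quoting from that write-up:

So what does this have to do with matrices?
① Insert the strings and build the trie graph.

• Take the example {"ACG", "C"}. The trie graph has nodes 0 (root), 1 ("A"), 2 ("AC"), 3 ("ACG") and 4 ("C"), and every node has 4 outgoing edges (A, T, C, G).
• Starting from state 0, one step can be taken in 4 ways:
  - A leads to state 1 (safe);
  - C leads to state 4 (dangerous);
  - T leads to state 0 (safe);
  - G leads to state 0 (safe);
• So when n=1, the answer is 3.
• When n=2, we take 2 steps from state 0, which spells out a string of length 2. The answer is the number of such walks that never pass through a dangerous node. In the same way, n steps spell out a string of length n.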
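For tiny n the counts above can be checked by brute force. The sketch below simply enumerates all 4^n strings; the function name `countSafeBrute` is made up for illustration and is not part of the original solution. It is exponential in n, so it only serves as a reference check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Brute-force reference: count the strings of length n over {A,C,G,T}
// that contain none of the given segments. Only usable for tiny n.
long long countSafeBrute(const std::vector<std::string>& segments, int n) {
    assert(n >= 0 && n <= 12);  // 4^12 candidates is about the practical limit
    const std::string alphabet = "ACGT";

    long long total = 1;
    for (int i = 0; i < n; ++i) total *= 4;

    long long safe = 0;
    std::string s(n, 'A');
    for (long long code = 0; code < total; ++code) {
        // Decode `code` as a base-4 number into a candidate string.
        long long c = code;
        for (int i = 0; i < n; ++i) {
            s[i] = alphabet[c % 4];
            c /= 4;
        }
        // The candidate is safe if no segment occurs as a substring.
        bool ok = true;
        for (const std::string& seg : segments) {
            if (s.find(seg) != std::string::npos) {
                ok = false;
                break;
            }
        }
        if (ok) ++safe;
    }
    return safe;
}
```

With the segments {"ACG", "C"} this gives 3 for n=1 and 9 for n=2, and with the sample input (AT, AC, AG, AA, n=3) it gives 36, matching the sample output.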
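② Build the adjacency matrix M of the trie graph:

2 1 0 0 1
2 1 1 0 0
1 1 0 1 1
2 1 0 0 1
2 1 0 0 1

M[i,j] is the number of ways to go from node i to node j in exactly one step.

The n-th power of M then gives the number of ways to go from node i to node j in exactly n steps.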

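③ Remove the dangerous nodes, that is, delete their rows and columns. Nodes 3 and 4 are word endings, so they are dangerous. The fail pointer of node 2 points to node 4: once "AC" has been matched, "C" has been matched as well, so node 2 is dangerous too.

The matrix M becomes:

2 1
2 1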
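Steps ① to ③ can be reproduced for the {"ACG", "C"} example with a small automaton builder. This is a minimal sketch, not the solution code further down: the `TrieGraph` type and its member names are assumptions made for illustration, letters are indexed in the order A, C, G, T, and nodes are numbered in insertion order ("ACG" first, then "C") so that they match the node numbers used above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

struct TrieGraph {
    std::vector<std::array<int, 4>> next;  // transitions on A, C, G, T
    std::vector<int> fail;                 // fail pointers
    std::vector<bool> danger;              // true if reaching the node completes a segment

    TrieGraph() { newNode(); }             // node 0 is the root

    int newNode() {
        std::array<int, 4> row;
        row.fill(-1);
        next.push_back(row);
        fail.push_back(0);
        danger.push_back(false);
        return (int)next.size() - 1;
    }
    static int code(char ch) {
        const std::string alphabet = "ACGT";
        std::size_t pos = alphabet.find(ch);
        assert(pos != std::string::npos);  // only A, C, G, T are valid
        return (int)pos;
    }
    void insert(const std::string& s) {
        int now = 0;
        for (char ch : s) {
            int c = code(ch);
            if (next[now][c] == -1) {
                int id = newNode();        // create first: push_back may reallocate
                next[now][c] = id;
            }
            now = next[now][c];
        }
        danger[now] = true;
    }
    // BFS that fills in fail pointers, missing transitions and danger flags.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 4; ++c) {
            if (next[0][c] == -1) next[0][c] = 0;
            else q.push(next[0][c]);       // depth-1 nodes keep fail = 0
        }
        while (!q.empty()) {
            int now = q.front();
            q.pop();
            if (danger[fail[now]]) danger[now] = true;
            for (int c = 0; c < 4; ++c) {
                int to = next[now][c];
                if (to == -1) {
                    next[now][c] = next[fail[now]][c];
                } else {
                    fail[to] = next[fail[now]][c];
                    q.push(to);
                }
            }
        }
    }
    // M[i][j] = number of letters that move node i to node j in one step.
    std::vector<std::vector<int>> adjacency() const {
        std::size_t n = next.size();
        std::vector<std::vector<int>> m(n, std::vector<int>(n, 0));
        for (std::size_t i = 0; i < n; ++i)
            for (int c = 0; c < 4; ++c) ++m[i][next[i][c]];
        return m;
    }
    // The same matrix restricted to the rows and columns of safe nodes.
    std::vector<std::vector<int>> safeAdjacency() const {
        std::vector<int> keep;
        for (std::size_t i = 0; i < next.size(); ++i)
            if (!danger[i]) keep.push_back((int)i);
        std::vector<std::vector<int>> full = adjacency();
        std::vector<std::vector<int>> m(keep.size(), std::vector<int>(keep.size(), 0));
        for (std::size_t a = 0; a < keep.size(); ++a)
            for (std::size_t b = 0; b < keep.size(); ++b)
                m[a][b] = full[keep[a]][keep[b]];
        return m;
    }
};
```

After inserting "ACG" and "C" and calling `build()`, `adjacency()` returns the 5x5 matrix from step ②, nodes 2, 3 and 4 are flagged as dangerous, and `safeAdjacency()` returns the 2x2 matrix above.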

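④ Compute the n-th power of M; the answer is Σ(M^n[0,i]) mod 100000, the sum of row 0 of the power.

Since n is very large, compute the matrix power by repeated squaring (binary exponentiation).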
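Step ④ can be sketched on its own as follows. The names `Mat`, `multiply`, `power` and `countFromRoot` are assumptions made for illustration, and the sketch uses `std::vector` rather than the fixed-size arrays of the solution code. Repeated squaring needs only about log2(n) rounds of matrix multiplication, which is what makes n up to 2000000000 feasible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<long long>>;
const long long MOD = 100000;

// Product of two square matrices of the same size, entries taken mod MOD.
Mat multiply(const Mat& a, const Mat& b) {
    std::size_t n = a.size();
    assert(b.size() == n);
    Mat c(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % MOD;
    return c;
}

// base^exp by repeated squaring.
Mat power(Mat base, long long exp) {
    std::size_t n = base.size();
    Mat result(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i) result[i][i] = 1;  // identity matrix
    while (exp > 0) {
        if (exp & 1) result = multiply(result, base);
        base = multiply(base, base);
        exp >>= 1;
    }
    return result;
}

// Sum of row 0 of m^n mod MOD: the number of safe walks of n steps from the root.
long long countFromRoot(const Mat& m, long long n) {
    Mat p = power(m, n);
    long long sum = 0;
    for (long long v : p[0]) sum = (sum + v) % MOD;
    return sum;
}
```

For the reduced matrix {{2,1},{2,1}} of the example, `countFromRoot` gives 3 for n=1 and 9 for n=2, in line with the hand count in step ①.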


The code below is by kuangbin:

#include <iostream>
#include <stdio.h>
#include <algorithm>
#include <string.h>
#include <queue>
using namespace std;

const int MOD=100000;
struct Matrix
{
    int mat[110][110],n;
    Matrix() {}
    Matrix(int _n)
    {
        n = _n;
        for(int i=0; i<n; i++)
            for(int j=0; j<n; j++)
                mat[i][j]=0;
    }
    Matrix operator *(const Matrix &b)const
    {
        Matrix ret=Matrix(n);
        for(int i=0; i<n; i++)
            for(int j=0; j<n; j++)
                for(int k=0; k<n; k++)
                {
                    int tmp=(long long)mat[i][k]*b.mat[k][j]%MOD;
                    ret.mat[i][j]=(ret.mat[i][j]+tmp)%MOD;
                }
        return ret;
    }
};
struct Trie
{
    int next[110][4],fail[110];
    bool end[110];
    int root,L;
    int newnode()
    {
        for(int i=0; i<4; i++)
            next[L][i]=-1;
        end[L++]=false;
        return L-1;
    }
    void init()
    {
        L=0;
        root=newnode();
    }
    int getch(char ch)
    {
        switch(ch)
        {
        case 'A':
            return 0;
        case 'C':
            return 1;
        case 'G':
            return 2;
        case 'T':
            return 3;
        }
        return -1; // not reached for valid input (only A, C, G, T)
    }
    void insert(char s[])
    {
        int len=strlen(s);
        int now=root;
        for(int i = 0; i < len; i++)
        {
            if(next[now][getch(s[i])] == -1)
                next[now][getch(s[i])] = newnode();
            now = next[now][getch(s[i])];
        }
        end[now]=true;
    }
    void build()
    {
        queue<int>Q;
        for(int i = 0; i < 4; i++)
            if(next[root][i] == -1)
                next[root][i] = root;
            else
            {
                fail[next[root][i]] = root;
                Q.push(next[root][i]);
            }
        while(!Q.empty())
        {
            int now = Q.front();
            Q.pop();
            if(end[fail[now]]==true)
                end[now]=true;
            for(int i = 0; i < 4; i++)
            {
                if(next[now][i] == -1)
                    next[now][i] = next[fail[now]][i];
                else
                {
                    fail[next[now][i]] = next[fail[now]][i];
                    Q.push(next[now][i]);
                }
            }
        }
    }
    Matrix getMatrix()
    {
        Matrix res = Matrix(L);
        for(int i=0; i<L; i++)
            for(int j=0; j<4; j++)
                if(end[next[i][j]]==false)
                    res.mat[i][next[i][j]]++;
        return res;
    }
};

Trie ac;
char buf[20];

Matrix pow_M(Matrix a,int n)
{
    Matrix ret = Matrix(a.n);
    for(int i = 0; i < ret.n; i++)
        ret.mat[i][i]=1;
    Matrix tmp=a;
    while(n)
    {
        if(n&1)ret=ret*tmp;
        tmp=tmp*tmp;
        n>>=1;
    }
    return ret;
}

int main()
{
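    int m,n; // m: number of disease segments, n: length of the sequences
    while(scanf("%d%d",&m,&n) != EOF)
    {
        ac.init();
        for(int i=0; i<m; i++)
        {
            scanf("%s",buf);
            ac.insert(buf);
        }
        ac.build();// build the AC automaton (trie graph) from the inserted strings
        Matrix a=ac.getMatrix();// adjacency matrix without any transition into a dangerous (disease) node
        a=pow_M(a,n);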
        int ans=0;
        for(int i=0; i<a.n; i++)
        {
            ans=(ans+a.mat[0][i])%MOD;
        }
        printf("%d\n",ans);
    }
    return 0;
}

