[CareerCup] 1.5 Compress String

Posted by Grandyang on 2015-07-17

 

1.5 Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2b1c5a3. If the "compressed" string would not become smaller than the original string, your method should return the original string.

 

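This problem asks us to compress a given string by replacing each run of repeated characters with the character followed by the length of the run. The scheme compresses well when the string contains long runs, but for a string with few repeats the compressed form is actually longer than the original. So we first compute the length the compressed string would have and compare it with the original length: if it is not smaller, we return the original string directly; otherwise we go ahead and compress. We need a new string to hold the compressed result, and the book makes a point of noting that building it by repeated string concatenation is very inefficient. The following English passage is taken from page 72 of Cracking the Coding Interview, 5th edition: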

 

Imagine you were concatenating a list of strings, as shown below. What would the running time of this code be? For simplicity, assume that the strings are all the same length (call this x) and that there are n strings.

public String joinWords(String[] words) {
    String sentence = "";
    for (String w : words) {
        sentence = sentence + w;
    }
    return sentence;
}

On each concatenation, a new copy of the string is created, and the two strings are copied over, character by character. The first iteration requires us to copy x characters. The second iteration requires copying 2x characters. The third iteration requires 3x, and so on. The total time therefore is O(x + 2x + ... + nx). This reduces to O(xn^2). (Why isn't it O(xn^n)? Because 1 + 2 + ... + n equals n(n+1)/2, or O(n^2).)
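For comparison, here is a minimal C++ sketch of a join that avoids the repeated copying described in the passage. It is not taken from the book; the function name joinWords is borrowed from the Java example, and the reserve-then-append approach is just one assumed way to get each character copied only once, for O(xn) total work.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical C++ counterpart of the joinWords example above.
// Reserving the final length up front lets every append write into
// the same buffer, so the total work is O(xn) rather than O(xn^2).
std::string joinWords(const std::vector<std::string>& words) {
    std::size_t total = 0;
    for (const std::string& w : words) total += w.size();

    std::string sentence;
    sentence.reserve(total);  // one allocation sized for the whole result
    for (const std::string& w : words) sentence += w;  // append in place
    return sentence;
}
```

The key difference from `sentence = sentence + w` is that `+=` extends the existing string instead of building a fresh copy on every iteration.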

 

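As explained above, concatenating strings requires copying both operands, which becomes very inefficient when the strings are long, so we want to avoid concatenation. The alternative is to allocate a string of fixed length up front and then assign each position. We already computed the compressed length at the start, so all that remains is to fill in each position, much like the earlier problem 1.4 Replace Spaces. See the code below: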

 

#include <string>
using namespace std;

class Solution {
public:
    string compress(string s) {
        int newLen = countCompression(s);
        if (newLen >= s.size()) return s;
        string res(newLen, ' ');
        char c = s[0];
        int cnt = 1, idx = 0;
        for (int i = 1; i < s.size(); ++i) {
            if (s[i] == c) ++cnt;
            else {
                res[idx++] = c;
                for (auto a : to_string(cnt)) res[idx++] = a;
                c = s[i];
                cnt = 1;
            }
        }
        res[idx++] = c;
        for (auto a : to_string(cnt)) res[idx++] = a;
        return res;
    }
    int countCompression(string s) {
        if (s.empty()) return 0;
        int res = 0, cnt = 1;
        char c = s[0];
        for (int i = 1; i < s.size(); ++i) {
            if (s[i] == c) ++cnt;
            else {
                c = s[i];
                res += 1 + to_string(cnt).size();
                cnt = 1;
            }
        }
        res += 1 + to_string(cnt).size();
        return res;
    }
};
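As a point of comparison, the same result can be sketched in a single pass. This is a hypothetical variant, not the approach used above: the name compressRuns is made up for illustration, and it appends into one growing buffer (amortized constant time per append, so no O(n^2) copying) and only compares lengths at the end.

```cpp
#include <cstddef>
#include <string>

// Hypothetical single-pass variant (not the code from the post above):
// append each run as <char><count>, then keep the result only if it is
// strictly shorter than the input.
std::string compressRuns(const std::string& s) {
    std::string res;
    res.reserve(s.size());  // a hint only; res may still grow past this
    std::size_t i = 0;
    while (i < s.size()) {
        std::size_t j = i;
        while (j < s.size() && s[j] == s[i]) ++j;  // j stops one past the run
        res += s[i];
        res += std::to_string(j - i);              // run length
        i = j;
    }
    return res.size() < s.size() ? res : s;
}
```

For example, compressRuns("aabcccccaaa") yields "a2b1c5a3", while compressRuns("abc") returns "abc" unchanged because "a1b1c1" is longer. The trade-off is that this version may build a compressed string it ends up discarding, whereas the two-pass version above never allocates more than it needs.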

 
