Saturday, August 17, 2019

Leetcode 395: Longest Substring with At Least K Repeating Characters

https://leetcode.com/problems/longest-substring-with-at-least-k-repeating-characters/description/


Solution 1: Brute Force

If we fix the left and gradually extend the right, then it is easy to prove that there are N + (N-1) + ... + 2 + 1 = N(N+1)/2 substrings. For each substring, we can scan them first and then determine whether it is valid or not. If yes, record the length for comparison, and save it if it is longer.

For determination whether each substring is valid or not, we can use a num_unique variable to record the counting situations. When a new letter is introduced, we update the num_unique by adding 1; when the count of one letter reaches to k, we update the num_unique by reducing 1. Thus, when num_unique = 0, it means all the letters satisfy the requirements: the counting is no less than k.

See the code below:

class Solution {
public:
    int longestSubstring(string s, int k) {
        int res = 0, len = s.size();
        if(s.size() < k) return res;
        if(k==1) return len;
        for(int left = 0; left < len; ++left) {
            vector ct(26, 0);
            int num_unique = 0;
            for(int right = left; right < len; ++right) {
                int idx = s[right] - 'a';
                ++ct[idx];
                if(ct[idx] == 1) ++num_unique;//add a new unique letter;
                if(ct[idx] == k) --num_unique;//reach k counts;
                if(num_unique == 0) res = max(res, right - left + 1);//when all letters reach or more than k counts;
            }
        }
        return res;
    }
};

So the time complexity is O(N^2) and space complexity is O(26) or O(1), are there better solutions in time complexity?

Solution 2: Recursion

Another way to think about this question is also straightforward: if we know the total number of one kind of letter is less than k for the whole string, then we can break the string into smaller strings at the positions of that kind of letter. This is one typical method in programming: break a question into a smaller similar questions, which is also called recursion.

See the code below:

class Solution {
public:
    int longestSubstring(string s, int k) {
        if(s.size() < k) return 0;
        vector<int> ct(26, 0);
        for(auto &a : s) ++ct[a-'a'];
        int idx = 0;
        while(idx < s.size() && ct[s[idx]-'a'] && ct[s[idx]-'a'] >= k) ++idx;
        if(idx == s.size()) return s.size();
        int left = longestSubstring(s.substr(0, idx), k);
        int right = longestSubstring(s.substr(idx+1), k);
        return max(left, right);
    }
};

The time complexity of the above code is still O(N^2). And there is a O(N) solution, which can be found below.

Solution 3: Categorization and Counting

There is one additional implicit constrain to the questions: there are only at most 26 kinds of letter in total. So for the valid substrings, they must satisfy:

(1) the number of each kind of letter is larger or equal to k;

(2) the total kinds of letter is between 1 and 26.

So we can scan the string with the above two constrains using two-pointer method. Basically, we find all the substrings that fit the above two constrains, then then find the one with the longest length.

See the code below:

class Solution {
public:
    int longestSubstring(string s, int k) {
        if(s.size() < k) return 0;
        int res = 0;
        for(int h=1; h<=26; ++h) {
            int i=0, j=0, unique = 0, noLessK = 0;
            vector<int> cts(26, 0);
            while(j<s.size()) {
             //   cout<<j<<endl;
                int b = s[j] - 'a';//move right idx and count
                if(cts[b] == 0) ++unique;
                ++cts[b];
                if(cts[b] == k) ++noLessK;
                if(unique > h) {
                    while(unique > h) {//move left idx to decrease total kinds of letter
                        int a = s[i] - 'a';   
                        if(cts[a] == k) --noLessK;
                        --cts[a];
                        if(cts[a] == 0) --unique;
                        ++i;
                    }
                }
                if(unique == noLessK) res = max(res, j-i+1);//satisfy two constrains
                ++j;
            }
        }
        return res;
    }
};

The time complexity is O(26*N) or O(N).

Follow ups or similar questions:

(1): what is the longest substring that fits the question?

(2): what is the total number of such substring?

No comments:

Post a Comment