Sunday, August 25, 2019

Leetcode 828: Unique Letter String

https://leetcode.com/problems/unique-letter-string/description/

A character is unique in string S if it occurs exactly once in it.
For example, in string S = "LETTER", the only unique characters are "L" and "R".
Let's define UNIQ(S) as the number of unique characters in string S.
For example, UNIQ("LETTER") =  2.
Given a string S with only uppercases, calculate the sum of UNIQ(substring) over all non-empty substrings of S.
If there are two or more equal substrings at different positions in S, we consider them different.
Since the answer can be very large, return the answer modulo 10 ^ 9 + 7.

Example 1:
Input: "ABC"
Output: 10
Explanation: All possible substrings are: "A","B","C","AB","BC" and "ABC".
Evey substring is composed with only unique letters.
Sum of lengths of all substring is 1 + 1 + 1 + 2 + 2 + 3 = 10
Example 2:
Input: "ABA"
Output: 8
Explanation: The same as example 1, except uni("ABA") = 1.

Note: 0 <= S.length <= 10000.
Notes:

The key is to find rules. This is another question about substring or subarray. So dp will be the first choice.

But this question asking for the number of unique letter in all the substrings. The native way maybe check all the substrings which are N(N+1)/2 in total, and we need to run through each substring as well. So the total time complexity is O(N3).

But actually we do not need to check every substrings. For each new letter  with index i adding in, we will have i more substrings ending with the ith element. For all these substrings, if we know the previous position of the ith element, let say k, then the ith element will be unique element for substrings starting with elements after the kth and ending with the ith. And the total number is (i-k).

Since we only add the ith letter into all the previous substrings ending with the (i-1)th letter, all the rest letters are still the same as the previous case, which means we just can re-use the previous results without calculated again. Meanwhile, we need to remove all the count for the ith letter before the kth position. Since the ith letter for all the previous ones are no longer unique: we will have at least two if we add the ith letter in (the ith letter is the same as the kth letter, and i > k). So we need to memorize the results for the kth letter too.

There are some details for initialization.

(It is interesting that no modulo is used, maybe the test cases are too small...)

See the code below:

class Solution {
public:
    int uniqueLetterString(string S) {
        int res = 0, preSum = 0, curV = 0, curSum = 0;
        vector<int> preP(26, -1), preV(26, 0);
                            //preP: the previous position of each kind of letter
                            //preV: the previous unique counting for each kind of letter
        for(int i=0; i<S.size(); ++i) {
            int id = S[i] - 'A';
            curV = i - preP[id];
            curSum = preSum + curV - preV[id];
            res += curSum;
            preP[id] = i;
            preV[id] = curV;
            preSum = curSum;
        }
        return res;
    }
};

No comments:

Post a Comment