LeetCode Problem 3: Longest Substring Without Repeating Characters

Problem

Given a string, find the length of the longest substring without repeating characters.

Example 1

Input: "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. Note that the answer must be a substring; "pwke" is a subsequence, not a substring.

Solution

Approach 1: Brute Force
Intuition

Check every substring one by one to see whether it contains a duplicate character.

Algorithm

Suppose we have a function $bool\ allUnique(string\ substring)$ which returns true if the characters in the substring are all unique, and false otherwise. We can iterate through all the possible substrings of the given string $s$ and call $allUnique$ on each of them. Whenever it returns true, we update our answer, the maximum length of a substring without duplicate characters.

Now let’s fill in the missing parts:

  1. To enumerate all substrings of a given string, we enumerate their start and end indices. Suppose the start and end indices are $i$ and $j$, respectively. Then we have $0 \le i < j \le n$ (the end index $j$ is exclusive by convention). Thus, using two nested loops with $i$ from $0$ to $n - 1$ and $j$ from $i + 1$ to $n$, we can enumerate all the substrings of $s$.
  2. To check whether a string has duplicate characters, we can use a set. We iterate through the characters of the string and put them into the set one by one. Before inserting a character, we check whether the set already contains it. If so, we return false. After the loop, we return true.
Code: C++
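#include <algorithm>
#include <string>
#include <unordered_set>
using namespace std;

class Solution {
public:
    int lengthOfLongestSubstring(string s)
    {
        int n = s.length();
        int ans = 0;
        // Enumerate every substring s[i, j), with the end index j exclusive.
        for (int i = 0; i < n; i++)
        {
            for (int j = i + 1; j <= n; j++)
            {
                if (allUnique(s, i, j))
                {
                    ans = max(ans, j - i);
                }
            }
        }
        return ans;
    }
private:
    // Returns true if s[start, end) contains no repeated character.
    // s is taken by const reference so it is not copied on every call.
    bool allUnique(const string& s, int start, int end)
    {
        unordered_set<char> seen;
        for (int i = start; i < end; i++)
        {
            char key = s[i];
            if (seen.find(key) != seen.end())
            {
                return false;  // key is already in the set: duplicate found
            }
            seen.insert(key);
        }
        return true;
    }
};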
Complexity Analysis
  • Time complexity: $O(n^3)$.
    To verify that the characters in the index range $[i, j)$ are all unique, we need to scan all of them, which costs $O(j - i)$ time.
    For a given $i$, the total time spent over all $j \in [i + 1, n]$ is $\sum_{j=i+1}^{n} O(j - i)$.
    Thus, the total time over all $i$ is:
    $O\left(\sum_{i=0}^{n-1}\left(\sum_{j=i+1}^{n}(j - i)\right)\right) = O\left(\sum_{i=0}^{n-1}\frac{(1 + n - i)(n - i)}{2}\right) = O(n^3)$.

  • Space complexity: $O(\min(n, m))$.
    We need $O(k)$ space to check that a substring has no duplicate characters, where $k$ is the size of the set. The size of the set is bounded above by the size of the string $n$ and the size of the charset/alphabet $m$.

Approach 2: Sliding Window
Algorithm

The naive approach is straightforward, but it is too slow. So how can we optimize it?

In the naive approach, we repeatedly check a substring for duplicate characters, but this is unnecessary. If the substring $s_{ij}$ from index $i$ to $j - 1$ has already been checked and has no duplicate characters, we only need to check whether $s[j]$ is already in $s_{ij}$.

To check whether a character is already in the substring, we can scan the substring, which leads to an $O(n^2)$ algorithm. But we can do better.

By using a HashSet (unordered_set in C++) as a sliding window, checking whether a character is in the current window can be done in $O(1)$.

A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string, usually defined by its start and end indices, i.e. $[i, j)$ (left-closed, right-open). A sliding window is a window that “slides” its two boundaries in a certain direction. For example, if we slide $[i, j)$ to the right by 1 element, it becomes $[i + 1, j + 1)$ (left-closed, right-open).
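
As a small illustration of the left-closed, right-open convention, the sketch below slides a fixed-width window across a string and records what it covers at each step. The helper name slideWindow and its width parameter are made up for this example and are not part of the solution.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the contents of the window [i, j) at each
// step as a fixed-width window slides right across s, one element at a time.
std::vector<std::string> slideWindow(const std::string& s, std::size_t width)
{
    assert(width > 0);  // a zero-width window would cover nothing
    std::vector<std::string> windows;
    // i is the inclusive start of the window, j the exclusive end.
    for (std::size_t i = 0, j = width; j <= s.size(); i++, j++)
    {
        windows.push_back(s.substr(i, j - i));
    }
    return windows;
}
```

For the input "abcd" with width 2, the window covers "ab", then "bc", then "cd". In the problem itself the window does not keep a fixed width: its two boundaries move independently.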

In our problem, we use the HashSet to store the characters in the current window $[i, j)$ ($j = i$ initially). Then we slide the index $j$ to the right. If $s[j]$ is not in the HashSet, we add it and slide $j$ further, and we keep doing so until $s[j]$ is already in the HashSet. At this point, we have found the longest substring without duplicate characters starting at index $i$. To move on to the next start index, we remove $s[i]$ from the HashSet and advance $i$; $j$ does not need to be reset. If we do this for all $i$, we get our answer.

Code: C++
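#include <algorithm>
#include <string>
#include <unordered_set>
using namespace std;

class Solution {
public:
    int lengthOfLongestSubstring(string s)
    {
        int n = s.length();
        int ans = 0, i = 0, j = 0;
        unordered_set<char> window;  // characters of the current window s[i, j)
        while (i < n && j < n)
        {
            char key = s[j];
            if (window.find(key) == window.end())
            {
                // s[j] is new: extend the window to the right.
                window.insert(key);
                j += 1;
                ans = max(ans, j - i);
            }
            else
            {
                // s[j] repeats a character in the window: shrink from the left.
                window.erase(s[i]);
                i += 1;
            }
        }
        return ans;
    }
};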
Complexity Analysis
  • Time complexity: $O(2n) = O(n)$. In the worst case, each character is visited twice, once by $i$ and once by $j$.

  • Space complexity: $O(\min(m, n))$. Same as the previous approach. We need $O(k)$ space for the sliding window, where $k$ is the size of the set. The size of the set is bounded above by the size of the string $n$ and the size of the charset/alphabet $m$.

Approach 3: Sliding Window Optimized

The above solution requires at most $2n$ steps. In fact, it can be optimized to require only $n$ steps. Instead of using a set to tell whether a character exists, we can define a mapping from each character to its index. Then we can skip characters immediately when we find a repeated character.

The reason is that if $s[j]$ has a duplicate in the range $[i, j)$ at index $j'$, we don’t need to increase $i$ little by little. We can skip all the elements in the range $[i, j']$ and set $i$ to $j' + 1$ directly.
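
To make the jump concrete, here is a minimal tracing sketch that records the window start $i$ after each character is processed. The helper name windowStarts is hypothetical, and unlike the solution it stores the last index $j'$ itself rather than $j' + 1$. The earlier occurrence may lie to the left of the current window, so the sketch only ever moves $i$ forward.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical tracing helper: returns the window start i after each
// character s[j] has been processed.
std::vector<int> windowStarts(const std::string& s)
{
    std::unordered_map<char, int> lastSeen;  // character -> most recent index
    std::vector<int> starts;
    int i = 0;
    for (int j = 0; j < static_cast<int>(s.size()); j++)
    {
        auto it = lastSeen.find(s[j]);
        if (it != lastSeen.end())
        {
            // Jump past the earlier occurrence j', but never move i backwards:
            // j' may be outside (to the left of) the current window.
            i = std::max(i, it->second + 1);
        }
        assert(i <= j);  // the window [i, j] always contains s[j]
        lastSeen[s[j]] = j;
        starts.push_back(i);
    }
    return starts;
}
```

For "abba" the starts are 0, 0, 2, 2. At the final 'a' the earlier occurrence is at index 0, which is already outside the window, so $i$ stays at 2 instead of moving back to 1.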

Code: C++
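#include <algorithm>
#include <string>
#include <unordered_map>
using namespace std;

class Solution
{
public:
    int lengthOfLongestSubstring(string s)
    {
        int n = s.length(), ans = 0;
        // Maps each character to the index just past its last occurrence,
        // i.e. the smallest window start that excludes that occurrence.
        unordered_map<char, int> nextStart;
        for (int j = 0, i = 0; j < n; j++)
        {
            char key = s[j];
            auto keyit = nextStart.find(key);
            if (keyit != nextStart.end())
            {
                // Jump i past the previous occurrence, never moving it backwards.
                i = max(keyit->second, i);
            }
            ans = max(ans, j - i + 1);
            nextStart[key] = j + 1;  // insert or overwrite in one step
        }
        return ans;
    }
};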
Complexity Analysis
  • Time complexity: $O(n)$. Index $j$ will iterate $n$ times.

  • Space complexity: $O(\min(m, n))$. Same as the previous approach.
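
When the charset is small and known in advance, $m$ is a constant and the map can be replaced by a fixed-size table. The sketch below assumes that every character of the input is 7-bit ASCII ($m = 128$). That is an assumption of this example, not something the problem statement guarantees, and the function name lengthOfLongestSubstringAscii is made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>

// Sketch assuming every character of s is 7-bit ASCII (code below 128).
int lengthOfLongestSubstringAscii(const std::string& s)
{
    // nextStart[c] is the index just past the last occurrence of c,
    // or 0 if c has not been seen yet.
    std::array<int, 128> nextStart{};
    int ans = 0;
    for (int j = 0, i = 0; j < static_cast<int>(s.size()); j++)
    {
        unsigned char c = static_cast<unsigned char>(s[j]);
        assert(c < 128);  // assumption of this sketch, not of the problem
        i = std::max(i, nextStart[c]);   // skip past the previous occurrence
        ans = std::max(ans, j - i + 1);  // window is s[i..j], inclusive
        nextStart[c] = j + 1;
    }
    return ans;
}
```

On the three examples from the problem statement this returns 3, 1, and 3. The table lookup avoids hashing, at the cost of rejecting inputs outside the assumed charset.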
