group_001 - 优快云 Blog

Article link: https://blog.youkuaiyun.com/weixin_45595437/article/details/112554101

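This post covers two LeetCode problems. For "Two Sum" it presents the brute-force solution and an optimized solution that trades space for time, along with their time and space complexity. For "Longest Substring Without Repeating Characters" it covers the relevant Python functions, a code simplification, and the "offset" involved when updating the key variable.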


Contents

- 1. Two Sum
- 3. Longest Substring Without Repeating Characters

1. Two Sum

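The most intuitive solution to this problem is brute force: starting from the first element of nums, for each element we search the remaining elements after it for one that sums with it to target. If we find one, the array contains two elements that add up to target; if we finish the traversal without finding one, no such pair exists.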

class Solution(object):
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        n = len(nums)
        for i in range(n):
            for j in range(i+1, n): # an element cannot be used twice, so start right after i
                if nums[i] + nums[j] == target:
                    return [i, j]

Surprisingly, even this brute-force solution is accepted by LeetCode!


This solution is clearly not optimal, though. To make it faster, we need to trade space for time.

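To do this, we allocate an extra dict that records each element already visited together with its index in the original nums array [1], i.e.:
$table=\{nums[i]:i\}$
where the key is nums[i] and the value is i. For the current element, the value we look up in table is its complement, target-nums[i].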

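With this in place, for each element we visit, we check whether its complement is already in table. If it is, we have found the two numbers; if not, we add the current element to table as a key-value pair. Since the problem guarantees that exactly one solution exists, we are certain to find it by the time we finish traversing nums.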

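The worst-case time complexity of this solution is $O(n)$ (assuming dict lookups take $O(1)$ on average), which corresponds to traversing from the first element all the way to the last. The worst-case space complexity is also $O(n)$, for the same case, because almost every element ends up stored in table.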

The code is as follows:

class Solution(object):
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        table = {}
        for i, num in enumerate(nums):
            comp = target - num
            if comp in table:
                return [table[comp], i]
            else:
                table[num] = i # store the index of num here, not the index of comp
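
To make the lookup order concrete, here is a minimal sketch that follows the same dict-based idea but also records what table contains before each lookup. The function name `two_sum_trace` and the `steps` list are illustrative assumptions, not part of the original solution.

```python
def two_sum_trace(nums, target):
    """Dict-based Two Sum that also records the table before each lookup."""
    table = {}   # maps a value already visited -> its index in nums
    steps = []   # (i, num, comp, snapshot of table), kept for illustration only
    for i, num in enumerate(nums):
        comp = target - num
        steps.append((i, num, comp, dict(table)))
        if comp in table:
            # comp was seen earlier, so its index comes first in the answer
            return [table[comp], i], steps
        table[num] = i
    return None, steps


result, steps = two_sum_trace([2, 7, 11, 15], 9)
for i, num, comp, snapshot in steps:
    print(f"i={i}, num={num}, comp={comp}, table={snapshot}")
print(result)  # [0, 1]
```

On this input the loop stops at i = 1: the complement of 7 is 2, which was stored at index 0 in the previous step.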

This version is also accepted by LeetCode.

3. Longest Substring Without Repeating Characters

In Python, list.index(value) returns the position of the first occurrence of value in the list. For example:

x = ['a', 'b', 'c', 'a', 'b', 'c', 'b', 'b']

ind1 = x.index('c')
print(f'first index = {ind1}') # position of the first occurrence of 'c'

ind2 = x[::-1].index('c')
n = len(x)
print(f'last index = {n - ind2 - 1}') # position of the last occurrence of 'c'; note the extra -1

"""Results:
first index = 2
last index = 5
"""
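
The reverse-and-subtract trick above can be wrapped in a small helper. This is only a sketch: `last_index_of` is a hypothetical name introduced here, and for strings (rather than lists) the built-in `str.rindex` already does the same job.

```python
def last_index_of(seq, value):
    """Index of the last occurrence of value in a list (hypothetical helper)."""
    # Search a reversed copy, then map the position back to the original order.
    pos_from_end = seq[::-1].index(value)  # raises ValueError if value is absent
    return len(seq) - 1 - pos_from_end


x = ['a', 'b', 'c', 'a', 'b', 'c', 'b', 'b']
print(last_index_of(x, 'c'))    # 5
print(''.join(x).rindex('c'))   # 5, using the built-in str.rindex
```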

A version that is accepted (V1.0):

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        last_index = 0
        res = 0
        for i, c in enumerate(s):
            if c in s[last_index : i]:
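                # If the current character already appears in the current window,
                # find its position inside that window. The window has no repeats,
                # so its first occurrence there is also its last.
                new_index = s[last_index : i].index(c)
                # Once the position is found, update last_index:
                last_index += new_index + 1 # new_index is relative to the window, so it is added to the
                                            # old last_index as an "offset" (see the analysis below)
                # Then recompute the longest substring length:
                res = max(res, len(s[last_index : i+1])) # starts at the new last_index and includes i
            else:
                # If the current character does not appear in the current window,
                # simply include the character at position i
                # and update the longest substring length.
                res = max(res, len(s[last_index : i+1])) # i+1 so that the character at position i is included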
        
        return res

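Further simplification (V2.0):

Both branches above end with the same update of res, so all we really need to do is keep updating last_index. The code can therefore be simplified: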

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        last_index = 0
        res = 0
        for i, c in enumerate(s):
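            # We only need to keep updating last_index
            if c in s[last_index : i]:
                # If the current character already appears in the current window,
                # find its position inside that window.
                # The window has no repeats, so the first occurrence of c there is also its last.
                new_index = s[last_index : i].index(c)
                # Once the position is found, update last_index:
                last_index += new_index + 1 # new_index is relative to the window, so it is added to the
                                            # old last_index as an "offset" (see the analysis below)
            # Then recompute the longest substring length:
            res = max(res, len(s[last_index : i+1])) # starts at the (possibly updated) last_index;
                                                     # i+1 so that the character at position i is included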
            
        return res

The key step here is updating last_index. Take the string s = "abcabcbb": if we merely set last_index = new_index + 1, we get the following (the debugging code is in Supplement A):

c not in cur_sub, res = 1, cur_sub = a
c not in cur_sub, res = 2, cur_sub = ab
c not in cur_sub, res = 3, cur_sub = abc
c in cur_sub, res = 3, cur_sub = bca, last_index = 1
c in cur_sub, res = 4, cur_sub = bcab, last_index = 1
c in cur_sub, res = 4, cur_sub = cabc, last_index = 2
c in cur_sub, res = 4, cur_sub = abcb, last_index = 3
c in cur_sub, res = 6, cur_sub = cabcbb, last_index = 2

res = 6

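This happens because s[last_index : i].index(c) always searches for the first occurrence of c inside a substring of s, such as "abc" or "bca". The search has no way of knowing whether other characters precede "bca". So the index it returns is counted from the start of the substring "abc" or "bca", not from the start of the whole string s. When updating last_index, we therefore have to account for the characters that come before the substring, which is the "offset" mentioned earlier. That is: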

last_index += new_index + 1
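
As a small illustration of the offset, the sketch below takes one step from the trace (the window "bca" with the next character "b") and compares the slice-relative index with the absolute index. The variable names `relative`, `absolute`, and `new_last_index` are introduced here for illustration. It also relies on the fact that `str.index` accepts optional start and end bounds and returns a position in the whole string.

```python
s = "abcabcbb"
last_index, i = 1, 4   # current window is s[1:4] == "bca", next character is s[4] == "b"
c = s[i]

relative = s[last_index:i].index(c)   # position counted from the start of the slice
absolute = last_index + relative      # position counted from the start of s

# str.index with start/end bounds returns the absolute position directly
assert absolute == s.index(c, last_index, i)

new_last_index = absolute + 1         # same value as: last_index += relative + 1
print(relative, absolute, new_last_index)  # 0 1 2
```

Dropping the offset would set last_index to relative + 1 = 1 here, which leaves the repeated "b" inside the window.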

With the offset added, debugging again gives the following output:

c not in cur_sub, res = 1, cur_sub = a
c not in cur_sub, res = 2, cur_sub = ab
c not in cur_sub, res = 3, cur_sub = abc
c in cur_sub, res = 3, cur_sub = bca, last_index = 1
c in cur_sub, res = 3, cur_sub = cab, last_index = 2
c in cur_sub, res = 3, cur_sub = abc, last_index = 3
c in cur_sub, res = 3, cur_sub = cb, last_index = 5
c in cur_sub, res = 3, cur_sub = b, last_index = 7

res = 3

The code now outputs the correct result and is accepted by LeetCode.
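
One way to gain some confidence in the offset fix locally is to compare it against an exhaustive reference on a few inputs. The sketch below is an assumption-laden check, not part of the original post: `longest_unique_window` restates the V2 logic as a plain function (using i + 1 - last_index in place of len of the slice, which gives the same number), and `longest_unique_brute` is a hypothetical brute-force reference.

```python
def longest_unique_window(s):
    """Standalone restatement of the V2 logic."""
    last_index = 0
    res = 0
    for i, c in enumerate(s):
        if c in s[last_index:i]:
            # add the slice-relative position to the old last_index (the offset)
            last_index += s[last_index:i].index(c) + 1
        res = max(res, i + 1 - last_index)  # equals len(s[last_index:i+1])
    return res


def longest_unique_brute(s):
    """Reference answer: try every substring, keep the longest without repeats."""
    best = 0
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            piece = s[start:end]
            if len(set(piece)) == len(piece):
                best = max(best, len(piece))
    return best


for sample in ["abcabcbb", "bbbbb", "pwwkew", "", "abba"]:
    assert longest_unique_window(sample) == longest_unique_brute(sample)
print(longest_unique_window("abcabcbb"))  # 3
```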

References:

[1] Nathan_Fegard and joeg. LeetCode Discuss. https://leetcode.com/problems/two-sum/discuss/17/Here-is-a-Python-solution-in-O(n)-time

Supplements:

A. Debugging code for Problem 3:

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        last_index = 0
        res = 0
        for i, c in enumerate(s):
            if c in s[last_index : i]:
                # If the current character already appears in the current window,
                # find its position inside that window.
                new_index = s[last_index : i].index(c)
                # Once the position is found, update last_index:
                # last_index = new_index + 1 # wrong: ignores the offset
                last_index += new_index + 1 # correct
                # Then recompute the longest substring length:
                res = max(res, len(s[last_index : i+1])) # starts at the new last_index and includes i
                print(f'c in cur_sub, res = {res}, cur_sub = {s[last_index : i+1]}, '
                      f'last_index = {last_index}')
            else:
                # If the current character does not appear in the current window,
                # simply include the character at position i
                # and update the longest substring length.
                res = max(res, len(s[last_index : i+1])) # i+1 so that the character at position i is included
                print(f'c not in cur_sub, res = {res}, cur_sub = {s[last_index : i+1]}')
        
        return res

if __name__ == '__main__':
    sol = Solution()
    s = "abcabcbb"
    res = sol.lengthOfLongestSubstring(s)
    print(f'\nres = {res}')