java - 不重复字符的最长公共子串的长度

Question

给定“abcabcbb”，答案是“abc”，长度为 3。

给定“bbbbb”，答案是“b”，长度为 1。

给定“pwwkew”，答案是“wke”，长度为3。注意答案必须是子串，“pwke”是子序列而不是子串。

我想出了一个可行的解决方案，但在几个测试用例中都失败了。然后我找到了一个更好的解决方案，我重写了它以尝试理解它。下面的解决方案完美无缺，但经过大约 2 个小时的与这件事的斗争，我仍然不明白为什么这行特定的代码行得通。

import java.util.*;
import java.math.*;

public class Solution {

  public int lengthOfLongestSubstring(String str) {

  if(str.length() == 0)
    return 0;

  HashMap<Character,Integer> map = new HashMap<>();
  int startingIndexOfLongestSubstring = 0;
  int max = 0;

  for(int i = 0; i < str.length(); i++){
      char currentChar = str.charAt(i); 
      if(map.containsKey(currentChar))
         startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);

      map.put(currentChar, i);
      max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

      }//End of loop

    return max;

   }
}

有问题的行是

max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

我不明白为什么会这样。我们取上一个最大值之间的最大值，以及我们当前索引和当前最长子字符串的起始索引之间的差异，然后加 1。我知道代码正在获取我们当前索引和startingIndexOfSubstring，但我无法概念化为什么它可以为我们提供预期的结果；有人可以向我解释这一步，特别是为什么它有效吗？

score 3 · Accepted Answer

我通常不善于解释，让我通过考虑一个例子来试一试。

字符串是“wcabcdeghi”。

暂时忘记代码，假设我们正在尝试提出一个逻辑。

我们从 w 开始并继续前进，直到我们到达 c -> a -> b -> c。
我们需要在这一点停下来，因为“c”在重复。所以我们需要一个地图来存储一个字符是否重复。（在代码中 map.put(currentChar, i);：）
既然我们知道一个字符是否重复，我们需要知道最大值是多少。到目前为止的长度。（在代码中 -）max
现在我们知道跟踪前 2 个变量 w->c 的计数没有意义。这是因为包括这个在内，我们已经得到了 Max。价值。所以从下一次迭代开始，我们只需要从 a -> b -> 很快检查长度。
让我们有一个变量（在代码中 -）startingIndexOfLongestSubstring来跟踪这一点。（这应该被命名为startingIndexOfNonRepetativeCharacter，然后我也不擅长命名）。
现在我们再次继续，但是等等，我们还没有最终确定如何跟踪我们当前正在解析的子字符串。（即，来自 abcd ...）
想起来，我所需要的只是“a”所在的位置（即startingIndexOfNonRepetativeCharacter）所以要知道当前子字符串的长度，我需要做的就是（在代码 - ) i - startingIndexOfLongestSubstring + 1（当前字符位置 - 非重复字符长度 + （减法不包括两边，所以加 1）。让我们称之为currentLength
但是等等，我们将如何处理这个计数。每次我们找到一个新变量时，我们都需要检查这是否currentLength会打破我们的最大值。所以（在代码中 -）max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
现在我们已经涵盖了我们需要的大部分语句，并且根据我们的逻辑，每次遇到一个已经存在的变量时，我们所需要的只是startingIndexOfLongestSubstring = map.get(currentChar). 那么我们为什么要做一个Max？
考虑一个字符串是“wcabcdewghi”的场景。当我们开始将我们的新计数器处理为 a -> b -> c -> d -> e -> w此时，我们的逻辑检查此字符之前是否存在。从现在开始，它从索引“1”开始计数。这完全搞砸了整个计数。所以我们需要确保，我们从 map 中获取的下一个索引总是大于我们计数的起点（即，只有当字符出现在之前时，才从 map 中选择一个字符startingIndexOfLongestSubstring）。

希望我已经回答了代码中的所有行，主要是如果解释可以理解的话。

score 1 · Accepted Answer

因为

i - startingIndexOfLongestSubstring + 1

i是和startingIndexOfLongestSubstring索引之间的字符数量。例如位置 2 和 3 之间有多少个字符？3-2=1但我们有 2 个字符：在位置 2 和位置 3。

我已经描述了代码中的每一个动作：

public class Solution {

    public int lengthOfLongestSubstring(String str) {

        if(str.length() == 0)
            return 0;

        HashMap<Character,Integer> map = new HashMap<>();
        int startingIndexOfLongestSubstring = 0;
        int max = 0;

        // loop over all characters in the string
        for(int i = 0; i < str.length(); i++){
            // get character at position i
            char currentChar = str.charAt(i);
            // if we already met this character
            if(map.containsKey(currentChar))
                // then get maximum of previous 'startingIndexOfLongestSubstring' and 
                // map.get(currentChar) + 1 (it is last occurrence of the current character in our word before plus 1)
                // "plus 1" - it is because we should start count from the next character because our current character 
                // is the same
                startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);

            // save position of the current character in the map. If map already has some value for current character 
            // then it will override (we don't want to know previous positions of the character)
            map.put(currentChar, i);
            // get maximum between 'max' (candidate for return value) and such value for current character
            max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

        }//End of loop

        return max;

    }
}

java - 不重复字符的最长公共子串的长度

2 回答 2

Related

Reference