5

Suppose we have strings like these:

"abcdaaaaefghaaaaaaaaa"
"012003400000000"

I want to remove the repeated characters at the end, to get this:

"abcdaaaaefgh"
"0120034"

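Is there a simple way to do this with a regular expression? I'm struggling a bit with this, and my code is starting to look like a giant monster...

Some clarifications:

  • What is considered a repetition?

    A sequence of at least 2 identical chars at the end. A single char is not considered a repetition. For example: in "aaaa", 'a' is not considered a repetition, but in "baaaa" it is. So in the case of "aaaa", we don't have to make any change to the String. Another example: "baa" must give "b".

  • What about strings of only one character?

    A string like "a", where we only have the char 'a', must be returned without any changes, i.e. we must return "a".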

4

5 Answers

10

You can use replaceAll() with a back reference:

str = str.replaceAll("(.)\\1+$", "");
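
To see what this pattern matches, here is a minimal sketch that runs it through java.util.regex directly. The class name BackrefDemo and the helper trailingRun are made up for illustration and are not part of the answer. (.) captures one character, \1+ requires one or more further copies of that same character, and $ anchors the run to the end of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named here only for illustration.
final class BackrefDemo {
    // (.) captures one char, \1+ needs at least one more copy, $ anchors at the end.
    private static final Pattern TRAILING_RUN = Pattern.compile("(.)\\1+$");

    // Returns the trailing run of repeated characters, or "" if there is none.
    static String trailingRun(String s) {
        Matcher m = TRAILING_RUN.matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String[] inputs = {"abcdaaaaefghaaaaaaaaa", "012003400000000", "abcd", "aaaa"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> trailing run \"" + trailingRun(s) + "\"");
        }
    }
}
```

Note that for "aaaa" the trailing run is the whole string, which is why a plain replaceAll() with this pattern empties it.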

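EDIT

To meet the requirement that the whole string must not be removed, I would just add a check afterwards rather than making the regex overly complicated: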

public String replaceLastRepeated(String str) {
    String replaced = str.replaceAll("(.)\\1+$", "");
    if (replaced.equals("")) {
        return str;
    }
    return replaced;
}
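
For comparison, here is a rough sketch of what folding that check into the pattern itself might look like. This is an assumption on my part, not the method proposed above: the class name TrailingRepeats, the method strip and the pattern are all illustrative. The idea is to also capture the character just before the trailing run, require the run to start with a different character, and put the captured character back in the replacement.

```java
// Hypothetical helper; the names and the pattern are illustrative assumptions.
final class TrailingRepeats {
    // (.)      group 1: the character just before the trailing run
    // (?!\1)   the run must start with a different character than group 1
    // (.)\2+$  group 2 plus one or more copies of it, up to the end of the input
    // The replacement "$1" restores the character captured in group 1.
    static String strip(String s) {
        return s.replaceAll("(.)(?!\\1)(.)\\2+$", "$1");
    }

    public static void main(String[] args) {
        String[] inputs = {"", "x", "xx", "aaaa", "ax", "abcd", "abcdddd", "baa"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" => \"" + strip(s) + "\"");
        }
    }
}
```

Because the pattern needs a different character in front of the run, inputs such as "aaaa" or "xx" never match and come back unchanged, while "baa" gives "b". As with the original pattern, . does not match line terminators by default, so this sketch assumes single-line input.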
answered 2013-04-02T13:24:44.390
3

I don't think I'd use a regex for this:

public static String removeRepeatedLastCharacter(String text) {
    if (text.length() == 0) {
        return text;
    }
    char lastCharacter = text.charAt(text.length() - 1);
    // Look backwards through the string until you find anything which isn't
    // the final character
    for (int i = text.length() - 2; i >= 0; i--) {
        if (text.charAt(i) != lastCharacter) {
            // Add one to *include* index i
            return text.substring(0, i + 1);
        }
    }
    // Looks like we had a string such as "1111111111111".
    return "";
}

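Personally I find this easier to understand than a regex. It may or may not be faster - I wouldn't like to make a prediction.

Note that this will always remove the final character, whether or not it is repeated. That means single-character strings will always end up as the empty string: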

"" => ""
"x" => ""
"xx" => ""
"ax" => "a"
"abcd" => "abc"
"abcdddd" => "abc"
answered 2013-04-02T13:26:59.200
3

I would not use a regex:

public class Test {
  public void test() {
    System.out.println(removeTrailingDupes("abcdaaaaefghaaaaaaaaa"));
    System.out.println(removeTrailingDupes("012003400000000"));
    System.out.println(removeTrailingDupes("0120034000000001"));
    System.out.println(removeTrailingDupes("cc"));
    System.out.println(removeTrailingDupes("c"));
  }

  private String removeTrailingDupes(String s) {
    // Is there a dupe?
    int l = s.length();
    if (l > 1 && s.charAt(l - 1) == s.charAt(l - 2)) {
      // Where to cut.
      int cut = l - 2;
      // What to cut.
      char c = s.charAt(cut);
      while (cut > 0 && s.charAt(cut - 1) == c) {
        // Cut that one too.
        cut -= 1;
      }
      // Cut off the repeats.
      return s.substring(0, cut);
    }
    // Return it untouched.
    return s;
  }

  public static void main(String args[]) {
    new Test().test();
  }
}

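To match @JonSkeet's "spec":

Note that this only removes the final character if it is repeated. That means single-character strings are left untouched, but two-character strings may become empty if both characters are the same: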

"" => ""
"x" => "x"
"xx" => ""
"aaaa" => ""
"ax" => "ax"
"abcd" => "abcd"
"abcdddd" => "abc"

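I wonder whether it is possible to achieve this level of control in a regex?

Added due to the comment "... but if we use this regex with aaaa, it doesn't return anything. It should return aaaa.":

Instead, use: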

  private String removeTrailingDupes(String s) {
    // Is there a dupe?
    int l = s.length();
    if (l > 1 && s.charAt(l - 1) == s.charAt(l - 2)) {
      // Where to cut.
      int cut = l - 2;
      // What to cut.
      char c = s.charAt(cut);
      while (cut > 0 && s.charAt(cut - 1) == c) {
        // Cut that one too.
        cut -= 1;
      }
      // Cut off the repeats.
      return cut > 0 ? s.substring(0, cut): s;
    }
    // Return it untouched.
    return s;
  }

Which has the contract:

"" => ""
"x" => "x"
"xx" => "xx"
"aaaa" => "aaaa"
"ax" => "ax"
"abcd" => "abcd"
"abcdddd" => "abc"
answered 2013-04-02T13:32:23.623
0

Replace (.)\1+$ with the empty string:

"abcddddd".replaceFirst("(.)\\1+$", ""); // returns abc
answered 2013-04-02T13:25:58.390
0

This should do the trick:

public class Remover {
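     public static String removeTrailing(String toProcess)
     {
        // Guard against the empty string, where charAt() would throw
        if (toProcess.isEmpty()) {
            return toProcess;
        }
        char lastOne = toProcess.charAt(toProcess.length() - 1);
        // Quote the character so that metacharacters such as '.' or '*' are matched literally
        String quoted = java.util.regex.Pattern.quote(String.valueOf(lastOne));
        // Strips every trailing copy of the last character, even a single unrepeated one
        return toProcess.replaceAll(quoted + "+$", "");
     }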

     public static void main(String[] args)
     {
        String test1 = "abcdaaaaefghaaaaaaaaa";
        String test2 = "012003400000000";

        System.out.println("Test1 without trail : " + removeTrailing(test1));
        System.out.println("Test2 without trail : " + removeTrailing(test2));
     }
}
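
When a character taken from the input is spliced into a pattern like this, it matters whether the regex engine reads it literally. The following minimal sketch compares raw concatenation with Pattern.quote(); the class and method names are hypothetical, and both helpers assume a non-empty input.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, named here only for illustration.
final class QuoteDemo {
    // Raw concatenation: a last character such as '.' is read as a metacharacter.
    // Assumes s is non-empty.
    static String stripRaw(String s) {
        String last = s.substring(s.length() - 1);
        return s.replaceAll(last + "+$", "");
    }

    // Quoted: the last character is always matched literally.
    // Assumes s is non-empty.
    static String stripQuoted(String s) {
        String last = s.substring(s.length() - 1);
        return s.replaceAll(Pattern.quote(last) + "+$", "");
    }

    public static void main(String[] args) {
        String input = "abc..";
        System.out.println("raw:    \"" + stripRaw(input) + "\"");
        System.out.println("quoted: \"" + stripQuoted(input) + "\"");
    }
}
```

With the input "abc..", the raw version builds the pattern .+$ and removes the whole string, while the quoted version only removes the two trailing dots. For inputs made of letters and digits, such as the two samples in the question, both behave the same.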
answered 2013-04-02T13:36:07.920