
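OK, I have a problem that looks very obvious but is apparently nontrivial to solve.

Suppose I have a simple string ab.
Now I want to replace a with b and b with a, so that I end up with ba.

The obvious solution is to do two replacements in succession. But then the result is either aa or bb, depending on the order.

Obviously the production situation will have to deal with far more complex strings and more than two replacements, but the problem still applies.

One idea I had was to save the positions where I replaced something. But that threw me off as soon as the replacement has a different length than the original needle.

This is a general problem, but I'm working with C#. Here is some code I came up with: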

string original = "abc";

Regex[] expressions = new Regex[]
{
    new Regex("a"), //replaced by ab
    new Regex("b") //replaced by c
};

string[] replacements = new string[]
{
    "ab",
    "c"
};

for (int i = 0; i < expressions.Length; i++)
    original = expressions[i].Replace(original, replacements[i]);

//Expected result: abcc
//Actual result: accc <- the b is replaced by c in the second pass.

So is there a simple way to solve this?

3 Answers

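Here is one solution. Try every regex against the string, perform the replacement for the earliest match, then recurse on the remainder of the string. If you need it to be faster (at the cost of more complexity), you could ask each regex for all of its Matches() up front and process them from left to right, adjusting their Indexes as you replace expressions with longer or shorter strings, and discarding any overlaps.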

using System;
using System.IO;
using System.Text.RegularExpressions;

class MultiRegex {

    static String Replace(String text, Regex[] expressions,
            String[] replacements, int start=0)
    {
        // Try matching each regex; save the first match
        Match firstMatch = null;
        int firstMatchingExpressionIndex = -1;
        for (int i = 0; i < expressions.Length; i++) {
            Regex r = expressions[i];
            Match m = r.Match(text, start);
            if (m.Success
                    && (firstMatch == null || m.Index < firstMatch.Index))
            {
                firstMatch = m;
                firstMatchingExpressionIndex = i;
            }
        }

        if (firstMatch == null) {
            /* No matches anywhere */
            return text;
        }

        // Replace text, then recurse
        String newText = text.Substring(0, firstMatch.Index)
            + replacements[firstMatchingExpressionIndex]
            + text.Substring(firstMatch.Index + firstMatch.Length);
        // Resume just past the inserted replacement, so that it is never re-matched
        return Replace(newText, expressions, replacements,
                firstMatch.Index + replacements[firstMatchingExpressionIndex].Length);
    }

    public static void Main() {

        Regex[] expressions = new Regex[]
        {
            new Regex("a"), //replaced by ab
            new Regex("b") //replaced by c
        };

        string[] replacements = new string[]
        {
            "ab",
            "c"
        };

        string original = "a b c";
        Console.WriteLine(
                Replace(original, expressions, replacements));

        // Should be "baz foo bar"
        Console.WriteLine(Replace("foo bar baz",
                    new Regex[] { new Regex("bar"), new Regex("baz"),
                        new Regex("foo") },
                    new String[] { "foo", "bar", "baz" }));
    }
}

This prints:

ab c c
baz foo bar
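
The faster variant mentioned above (collect all matches first, then walk them left to right) could be sketched roughly as follows. This is only a sketch under stated assumptions: `SinglePassReplacer` and `ReplaceAll` are hypothetical names, zero-length matches are ignored, and the output is built in a `StringBuilder` so that no indexes have to be adjusted. A match that is discarded because of an overlap is not retried, so the result can differ from the recursive version for patterns that overlap each other.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer above.
public static class SinglePassReplacer
{
    public static string ReplaceAll(string text, Regex[] expressions, string[] replacements)
    {
        // Collect every non-empty match of every expression against the ORIGINAL text.
        var hits = new List<(Match Match, int Rule)>();
        for (int i = 0; i < expressions.Length; i++)
            foreach (Match m in expressions[i].Matches(text))
                if (m.Length > 0)
                    hits.Add((m, i));

        // Leftmost match first; on a tie, the rule listed first wins.
        hits.Sort((x, y) => x.Match.Index != y.Match.Index
            ? x.Match.Index.CompareTo(y.Match.Index)
            : x.Rule.CompareTo(y.Rule));

        var result = new StringBuilder();
        int position = 0; // first character of the original text not yet copied
        foreach (var hit in hits)
        {
            // Discard matches that overlap text already consumed by an earlier match.
            if (hit.Match.Index < position)
                continue;

            // Copy the untouched gap, then the replacement for this match.
            result.Append(text, position, hit.Match.Index - position);
            result.Append(replacements[hit.Rule]);
            position = hit.Match.Index + hit.Match.Length;
        }

        // Copy whatever follows the last match.
        result.Append(text, position, text.Length - position);
        return result.ToString();
    }
}
```

Because replacements are only ever written to the output and never scanned again, a replacement such as ab for a cannot be picked up by the b rule.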
answered 2013-02-05T22:03:38.137

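If you are talking about simple one-to-one transformations, converting to a char array and doing a switch is probably ideal; however, you seem to be looking for more complex replacements.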
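
For the simple one-to-one case, a minimal sketch of the char-array-and-switch idea might look like this. `CharSwap` and `SwapAB` are hypothetical names, and the a/b mapping is taken from the question's example.

```csharp
using System;

// Hypothetical helper illustrating a one-to-one character swap.
public static class CharSwap
{
    public static string SwapAB(string input)
    {
        char[] chars = input.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            // Each character is inspected exactly once, so a freshly
            // written 'b' can never be turned back into an 'a'.
            switch (chars[i])
            {
                case 'a': chars[i] = 'b'; break;
                case 'b': chars[i] = 'a'; break;
            }
        }
        return new string(chars);
    }
}
```

This only works when every needle and every replacement is a single character, which is why it does not cover the a to ab case.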

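Basically, the trick is to introduce an intermediate marker character for temporary placeholders. Rather than show actual code, here is what the string looks like as it is transformed: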

ab
%1b
%1%2
b%2
ba

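So basically: replace % with %%, then replace the first rule's matches with %1, and so on. Once all of them are done, replace %1 with its output, and so on, and finally replace %% with %.

Be careful, though. If you can guarantee that your intermediate syntax won't collide with your input, you are fine. If you can't, you will need tricks to make sure a match is not preceded by an odd number of %. (So %%a would match, but %%%a would not, because that would mean the special value %a.)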
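
The placeholder steps above could be sketched in C# along these lines. This is a sketch under assumptions, not the answerer's code: `PlaceholderReplacer` is a hypothetical name, the patterns are assumed never to match `%`, digits, or `;`, and a `;` terminator is added to each placeholder (so `%1;` rather than `%1`) so that a placeholder followed by a literal digit is not misread.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the placeholder technique.
public static class PlaceholderReplacer
{
    public static string ReplaceAll(string text, Regex[] expressions, string[] replacements)
    {
        // 1. Escape literal percent signs so that every "%N;" left in the
        //    string afterwards is guaranteed to be one of our placeholders.
        string working = text.Replace("%", "%%");

        // 2. Replace each rule's matches with a numbered placeholder.
        //    Assumption: no pattern matches '%', a digit, or ';'.
        for (int i = 0; i < expressions.Length; i++)
        {
            string placeholder = "%" + (i + 1) + ";";
            working = expressions[i].Replace(working, _ => placeholder);
        }

        // 3. Resolve placeholders and unescape "%%" in a single left-to-right
        //    pass, so an escaped "%%" followed by "1;" is not mistaken for "%1;".
        return Regex.Replace(working, @"%(?:%|(\d+);)", m =>
            m.Groups[1].Success
                ? replacements[int.Parse(m.Groups[1].Value) - 1]
                : "%");
    }
}
```

With the rules a to b and b to a, the string ab passes through `%1;b` and `%1;%2;` before being resolved to ba.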

answered 2013-02-05T22:06:06.230

If you use (\ba\b) to match the letter a and only the letter a, then ab will not be matched. Similarly, for b it would be (\bb\b).

 string original = "a b c";
 Regex[] expressions = new Regex[] {
      // @ sign used to signify a literal string
      new Regex(@"(\ba\b)"), // \b represents a word boundary, between a word and a space
      new Regex(@"(\bb\b)"),
 };
 string[] replacements = new string[] {
      "ab",
      "c"
 };
 for(int i = 0; i < expressions.Length; i++)
      original = expressions[i].Replace(original, replacements[i]);

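EDIT 1: The question was changed so that there are no spaces between the letters to match, still wanting abcc from abc, so I simply reversed the order in which the regular expressions are checked.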

 Regex[] expressions = new Regex[] {
      new Regex(@"b"), //replaced by c
      new Regex(@"a"), //replaced by ab
 };
 string[] replacements = new string[] {
      "c",
      "ab",
 };

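EDIT 2: Answer changed to handle matches of variable length. This version matches according to the order of the patterns: it checks each pattern against the current piece of the string, appends the result to a new string, and then moves on.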

 string original = "a bc";

 Regex[] expressions = new Regex[] {
      new Regex(@"a"), //replaced by ab
      new Regex(@"b"), //replaced by c
 };

 string[] replacements = new string[] {
      "ab",
      "c",
 };
 string newString = string.Empty;
 string workingString = string.Empty;
 // Position of start point in string
 int index = 0;
 // Length to retrieve
 int length = 1;
 while(index < original.Length) {
      // Retrieve a piece of the string
      workingString = original.Substring(index, length);
      // Whether the expression has been matched
      bool found = false;
      for(int i = 0; i < expressions.Length && !found; i++) {
           if(expressions[i].Match(workingString).Success) {
                // If expression matched, add the replacement value to the new string
                newString += expressions[i].Replace(workingString, replacements[i]);
                // Mark expression as found
                found = true;
           }
      }
      if(!found) {
           // If not found, increase length (check for more than one character patterns)
           length++;
           // Once the window has grown past the end of the string, nothing starting at **index** matched: move that character into the new string
           if(length > (original.Length - index)) {
                newString += original.Substring(index, 1);
                index++;
                length = 1;
           }
      }
      // If a match was found, start over at next position in string
      else {
           index += length;
           length = 1;
      }
 }
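
A more compact way to get a similar scan-once behavior is to join the patterns into a single alternation and pick the replacement in a `MatchEvaluator`. This is a different technique from the loop above, sketched under assumptions: `AlternationReplacer` is a hypothetical name, the patterns are passed as strings rather than `Regex` objects, and they must not rely on numbered backreferences or reuse the group names `r0`, `r1`, and so on.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; not the code from the answer above.
public static class AlternationReplacer
{
    public static string ReplaceAll(string text, string[] patterns, string[] replacements)
    {
        // Wrap each pattern in its own named group, e.g. (?<r0>a)|(?<r1>b).
        // At every position the alternatives are tried in the order given.
        string combined = string.Join("|",
            patterns.Select((p, i) => "(?<r" + i + ">" + p + ")"));

        return Regex.Replace(text, combined, m =>
        {
            // The named group that participated tells us which rule matched.
            for (int i = 0; i < patterns.Length; i++)
                if (m.Groups["r" + i].Success)
                    return replacements[i];
            return m.Value; // defensive fallback; some group always succeeds
        });
    }
}
```

Since `Regex.Replace` scans the input only once and never rescans the text it has inserted, the b produced by the a rule is left alone.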
answered 2013-02-05T21:16:36.750