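I am trying to perform multiple string replacements using Java's Pattern and Matcher, where the regex patterns may contain metacharacters (e.g. \b, (), etc.). For example, for the input string fit i am, I want to apply the replacements: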
\bi\b --> EYE
i --> I
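I then followed the coding pattern from two questions (Java Replacing multiple different substrings in a string, Replacing multiple substrings in Java when replacement text overlays search text). In both, they create an or'ed search pattern (e.g. foo|bar) and a map of (pattern, replacement), and inside the matcher.find() loop they look up and apply the replacement.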
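The problem I am having is that matcher.group() returns only the matched text and carries no information about the metacharacters that matched, so I cannot distinguish between i and \bi\b. Please see the code below. What can I do to fix this?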
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.*;

public class ReplacementExample
{
    public static void main(String argv[])
    {
        Map<String, String> replacements = new HashMap<String, String>();
        replacements.put("\\bi\\b", "EYE");
        replacements.put("i", "I");
        String input = "fit i am";
        String result = doit(input, replacements);
        System.out.printf("%s\n", result);
    }

    public static String doit(String input, Map<String, String> replacements)
    {
        String patternString = join(replacements.keySet(), "|");
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(input);
        StringBuffer resultStringBuffer = new StringBuffer();
        while (matcher.find())
        {
            System.out.printf("match found: %s at start: %d, end: %d\n",
                matcher.group(), matcher.start(), matcher.end());
            String matchedPattern = matcher.group();
            String replaceWith = replacements.get(matchedPattern);
            // Do the replacement here.
            matcher.appendReplacement(resultStringBuffer, replaceWith);
        }
        matcher.appendTail(resultStringBuffer);
        return resultStringBuffer.toString();
    }

    private static String join(Set<String> set, String delimiter)
    {
        StringBuilder sb = new StringBuilder();
        int numElements = set.size();
        int i = 0;
        for (String s : set)
        {
            sb.append(Pattern.quote(s));
            if (i++ < numElements-1) { sb.append(delimiter); }
        }
        return sb.toString();
    }
}
This prints:
match found: i at start: 1, end: 2
match found: i at start: 4, end: 5
fIt I am
Ideally, it should be fIt EYE am.
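One direction that might work, offered only as a minimal sketch and not a definitive fix: instead of looking up matcher.group() in the map, wrap each pattern in its own capturing group and check which group took part in the match. The class name GroupedReplacementSketch and the method replaceAll are hypothetical names made up for this illustration. The sketch assumes the patterns are meant as real regexes (so Pattern.quote is dropped), that the map is a LinkedHashMap so earlier entries take priority, and that the patterns contain no numbered backreferences, since wrapping shifts the group numbers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupedReplacementSketch
{
    // Hypothetical helper: applies the first pattern (in map order) that matches at each position.
    public static String replaceAll(String input, Map<String, String> replacements)
    {
        StringBuilder combined = new StringBuilder();
        List<Integer> groupIndexes = new ArrayList<>();
        List<String> replacementTexts = new ArrayList<>();
        int nextGroup = 1;
        for (Map.Entry<String, String> e : replacements.entrySet())
        {
            if (combined.length() > 0) { combined.append('|'); }
            // Wrap each pattern in a capturing group so we can tell which alternative matched.
            combined.append('(').append(e.getKey()).append(')');
            groupIndexes.add(nextGroup);
            replacementTexts.add(e.getValue());
            // Skip the wrapper group plus any capturing groups inside the pattern itself.
            nextGroup += 1 + Pattern.compile(e.getKey()).matcher("").groupCount();
        }

        Matcher matcher = Pattern.compile(combined.toString()).matcher(input);
        StringBuffer result = new StringBuffer();
        while (matcher.find())
        {
            for (int k = 0; k < groupIndexes.size(); k++)
            {
                // Only the wrapper group of the alternative that matched is non-null.
                if (matcher.group(groupIndexes.get(k)) != null)
                {
                    // quoteReplacement keeps '$' and '\' in the replacement text literal.
                    matcher.appendReplacement(result,
                        Matcher.quoteReplacement(replacementTexts.get(k)));
                    break;
                }
            }
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args)
    {
        // LinkedHashMap keeps insertion order, so \bi\b is tried before the plain i.
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("\\bi\\b", "EYE");
        replacements.put("i", "I");
        System.out.println(replaceAll("fit i am", replacements)); // prints: fIt EYE am
    }
}
```

Ordering matters here: with a plain HashMap the iteration order is unspecified, so the bare i could end up first in the alternation and win at every position.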