java - Capture all characters between matching characters (single or repeated) in a string

Question

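I am trying to extract the part of a string that comes before a specific character, even when that character is repeated, like this (the character here being the underscore '_'):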

this_is_my_example_line_0
this_is_my_example_line_1_
this_is_my_example_line_2___
_this_is_my_ _example_line_3_
__this_is_my___example_line_4__

After running my regex, I should get this (the regex should ignore any instances of the matching character in the middle of the string):

this_is_my_example_line_0
this_is_my_example_line_1
this_is_my_example_line_2
this_is_my_ _example_line_3
this_is_my___example_line_4

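In other words, I am trying to "trim" the matching characters at the beginning and end of the string.

I am trying to do this with a regex in Java. My idea is to capture the group of characters between the special characters at the start or end of the line.

So far, I have only managed to do this successfully for example 3, with this regex: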

/[^_]+|_+(.*)[_$]+|_$+/

[^_]+ not 'underscore' once or more 
| OR 
_+ underscore once or more
(.*) capture all characters
[_$]+ 'underscore' once or more followed by end of line
 |_$+ OR 'underscore' once or more followed by end of line

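I just realized that this does not include the first word of the message in examples 0, 1 and 2, because the string does not start with an underscore and matching only begins once an underscore is found.

Is there a simpler way that does not involve regex? I don't really care about the leading characters (although handling them would be nice); I only need to ignore the repeated characters at the end. According to this regex tester, it looks like just doing this would work: /()_+$/. The empty parentheses match anything before the single or repeated match at the end of the line. Is that correct?

Thanks!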

score 3 · Accepted Answer

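There are a couple of options here: you can replace matches of ^_+|_+$ with an empty string, or extract the contents of the first capture group from a match of ^_*(.*?)_*$. Note that if your string may span multiple lines and you want to perform the replacement on each line, you will need to use the Pattern.MULTILINE flag with either approach. If your string may span multiple lines and you only want to trim at the very beginning and end, do not use Pattern.MULTILINE; use Pattern.DOTALL with the second approach instead.

Example: http://regexr.com?355ff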
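Both options can be sketched in Java roughly as follows. This is a minimal illustration, not the answerer's own code: the class name TrimUnderscores and the helper names trimByReplace and trimByGroup are hypothetical, and the second helper assumes single-line input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrimUnderscores {
    // Option 1: delete leading and trailing underscores on every line.
    // MULTILINE makes ^ and $ match at each line boundary.
    static String trimByReplace(String text) {
        return Pattern.compile("^_+|_+$", Pattern.MULTILINE)
                      .matcher(text)
                      .replaceAll("");
    }

    // Option 2: capture what lies between the leading and trailing underscores.
    // Assumes a single-line input; returns the input unchanged if nothing matches.
    static String trimByGroup(String line) {
        Matcher m = Pattern.compile("^_*(.*?)_*$").matcher(line);
        return m.find() ? m.group(1) : line;
    }

    public static void main(String[] args) {
        String twoLines = "_this_is_my_ _example_line_3_\n__this_is_my___example_line_4__";
        System.out.println(trimByReplace(twoLines));
        System.out.println(trimByGroup("this_is_my_example_line_2___"));
    }
}
```

Underscores in the middle of a line survive in both helpers, because the first pattern is anchored to line boundaries and the lazy group in the second only stops where the rest of the line is nothing but underscores.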

score 2 · Accepted Answer

How about this pattern: [^_\n\r](.*[^_\n\r])?

Demo

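import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo {
    public static void main(String[] args) {
        String data =
                "this_is_my_example_line_0\n" +
                "this_is_my_example_line_1_\n" +
                "this_is_my_example_line_2___\n" +
                "_this_is_my_ _example_line_3_\n" +
                "__this_is_my___example_line_4__";

        // Match from the first non-underscore character on a line
        // to the last non-underscore character on that same line.
        Pattern p = Pattern.compile("[^_\n\r](.*[^_\n\r])?");
        Matcher m = p.matcher(data);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}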

Output:

this_is_my_example_line_0
this_is_my_example_line_1
this_is_my_example_line_2
this_is_my_ _example_line_3
this_is_my___example_line_4
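
The question also asks whether this can be done without a regex. One possible sketch, assuming the input is a single line, is to walk inward from both ends with two indices. The class name TrimChar and the method name trim are hypothetical and appear in neither answer.

```java
public class TrimChar {
    // Returns s without leading and trailing occurrences of ch.
    // Occurrences in the middle of the string are left untouched.
    static String trim(String s, char ch) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) == ch) {
            start++;
        }
        while (end > start && s.charAt(end - 1) == ch) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trim("__this_is_my___example_line_4__", '_'));
    }
}
```

For multi-line input, the same helper could be applied to each line after splitting on the line terminator.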
