I want to write a regular expression that removes the "." at the end (it may be inside centa or centb tags) and puts it in front of &emsp;.

String input1 = "this is a &emsp; <centa> test.</centa>";
String output1 = "this is a .&emsp;<centa> test</centa>";

or

String input1b = "this is a &emsp; <centb> test.</centb>";
String output1b = "this is a .&emsp;<centb> test</centb>";

or

String input3 = "this is a &emsp; test.";
String output3 = "this is a .&emsp; test";

I can only use replaceAll on the string. How do I create the pattern in the code below, and what should the replacement string be?

Pattern rulerPattern1 = Pattern.compile("", Pattern.MULTILINE);
System.out.println(rulerPattern1.matcher(input1).replaceAll(""));

This edge case was raised by the asker in a comment:

String input4 = "&ldquo;[<deleted.material>[</deleted.material>]&sect;&ensp;431:10A&ndash;126&emsp;[<deleted.material>]Chemotherapy services.</deleted.material>] <added.material>Cancer treatment.</added.material>test snl.";
String output4 = "&ldquo;[<deleted.material>[</deleted.material>]&sect;&ensp;431:10A&ndash;126.&emsp;[<deleted.material>]Chemotherapy services.</deleted.material>] <added.material>Cancer treatment.</added.material>test snl";

3 Answers

Description

This regex finds &emsp; and moves the first dot that follows it to just before the &emsp;.

Regex: ([&]emsp;[^.]*)\.

Replace with: .$1

Given your sample input text:

this is a &emsp; <centa> test.</centa>
this is a &emsp; <centb> test.</centb> 
this is a &emsp; test.

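This returns the following lines, respectively (the space after &emsp; is left in place, since the pattern does not touch it):

this is a .&emsp; <centa> test</centa>
this is a .&emsp; <centb> test</centb>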
this is a .&emsp; test
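
As a minimal sketch of how this could be wired up in Java: String.replaceAll takes the regex and the replacement string directly, so no explicit Pattern object is needed. The class and method names (MoveFirstDot, moveFirstDot) are made up for illustration and are not part of the original answer.

```java
// Hypothetical helper; the names are illustrative only.
final class MoveFirstDot {
    // Group 1 captures "&emsp;" plus everything up to (not including) the
    // next dot. The replacement writes a dot first, then the captured text,
    // which drops the original dot from its old position.
    static String moveFirstDot(String input) {
        return input.replaceAll("([&]emsp;[^.]*)\\.", ".$1");
    }

    public static void main(String[] args) {
        System.out.println(moveFirstDot("this is a &emsp; <centa> test.</centa>"));
        System.out.println(moveFirstDot("this is a &emsp; <centb> test.</centb>"));
        System.out.println(moveFirstDot("this is a &emsp; test."));
    }
}
```

Because [^.]* stops at the first dot it meets, this version is only suitable when no other dot appears between &emsp; and the one to be moved.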

If you want to move the last dot in the string instead, you can use this:

Regex: ([&]emsp;.*)\.

Replace with: .$1

Given your input text:

&ldquo;[<deleted.material>[</deleted.material>]&sect;&ensp;431:10A&ndash;126&emsp;[<deleted.material>]Chemotherapy services.</deleted.material>] <added.material>Cancer treatment.</added.material>test snl.

Returns:

&ldquo;[<deleted.material>[</deleted.material>]&sect;&ensp;431:10A&ndash;126.&emsp;[<deleted.material>]Chemotherapy services.</deleted.material>] <added.material>Cancer treatment.</added.material>test snl
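
The difference between the two patterns can be sketched in Java as follows. This is an illustration under assumed names (FirstVsLastDot and its two methods are hypothetical), using the edge-case input from the question: the [^.]* form stops at the dot inside the first <deleted.material> tag, while the greedy .* form runs to the end of the line and backtracks to the last dot.

```java
// Illustrative comparison; class and method names are hypothetical.
final class FirstVsLastDot {
    static final String INPUT4 =
        "&ldquo;[<deleted.material>[</deleted.material>]&sect;&ensp;431:10A&ndash;126&emsp;[<deleted.material>]Chemotherapy services.</deleted.material>] <added.material>Cancer treatment.</added.material>test snl.";

    // [^.]* cannot cross a dot, so the match ends at the first dot after &emsp;.
    static String moveFirstDot(String input) {
        return input.replaceAll("([&]emsp;[^.]*)\\.", ".$1");
    }

    // .* is greedy and gives back only as much as needed, so the match ends
    // at the last dot on the same line (by default '.' does not match line
    // terminators).
    static String moveLastDot(String input) {
        return input.replaceAll("([&]emsp;.*)\\.", ".$1");
    }

    public static void main(String[] args) {
        // The first variant removes the dot from inside the tag name.
        System.out.println(moveFirstDot(INPUT4));
        // The second variant moves the trailing dot, as requested.
        System.out.println(moveLastDot(INPUT4));
    }
}
```

If the input can span several lines, the greedy variant would need the DOTALL flag (or the inline (?s) form) for .* to cross line breaks; whether that is wanted depends on the data.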
answered 2013-06-12T02:16:16.713

I can only use replaceAll on the string

OK, that is an odd requirement, but here is my solution. I needed to call replaceAll twice to cover the cases with and without tags.

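private String parse(final String input) {
    // First pass: tagged case; \\1 forces the closing tag to match the opening one.
    return input.replaceAll("this is a &emsp; <(cent(a|b))> test\\.</\\1>",
        "this is a .&emsp;<$1> test</$1>")
        // Second pass: untagged case; the dot is escaped so it matches only a literal '.'.
        .replaceAll("&emsp; test\\.", ".&emsp; test");
}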

@Test
public void centa() {
    // Arrange
    final String input = "this is a &emsp; <centa> test.</centa>";

    // Act
    final String output = parse(input);

    // Assert
    assertEquals("this is a .&emsp;<centa> test</centa>", output);
}

@Test
public void centb() {
    // Arrange
    final String input = "this is a &emsp; <centb> test.</centb>";

    // Act
    final String output = parse(input);

    // Assert
    assertEquals("this is a .&emsp;<centb> test</centb>", output);
}

@Test
public void noTags() {
    // Arrange
    final String input = "this is a &emsp; test.";

    // Act
    final String output = parse(input);

    // Assert
    assertEquals("this is a .&emsp; test", output);
}
answered 2013-06-12T00:11:29.533

Try matching with a single replaceAll. This should satisfy your 3 test cases.

Groups 1 and 2 are kept separate so that we can place a dot between them.
Groups 2 and 4 are kept separate so that we can drop the dot between them.

Pattern rulerPattern1 = Pattern.compile("([\\W\\w]+)(&emsp;(<cent[ab]>)?[\\W\\w]+)\\.(</cent[ab]>)?", Pattern.MULTILINE);
System.out.println(rulerPattern1.matcher(input1).replaceAll("$1.$2$4"));
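
To make the role of each group visible, here is a small self-contained sketch that prints the captured groups before doing the replacement. The class and method names (GroupDemo, moveDot) are assumptions for illustration. The MULTILINE flag is omitted here because it only changes how ^ and $ behave, and this pattern uses neither.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class GroupDemo {
    // Group 1: text before &emsp;. Group 2: &emsp; up to the last dot.
    // Group 3: optional opening tag directly after &emsp;.
    // Group 4: optional closing tag directly after the dot.
    static final Pattern RULER = Pattern.compile(
        "([\\W\\w]+)(&emsp;(<cent[ab]>)?[\\W\\w]+)\\.(</cent[ab]>)?");

    static String moveDot(String input) {
        // A group that did not take part in the match (such as group 4 for
        // untagged input) contributes nothing to the replacement.
        return RULER.matcher(input).replaceAll("$1.$2$4");
    }

    public static void main(String[] args) {
        String input1 = "this is a &emsp; <centa> test.</centa>";
        Matcher m = RULER.matcher(input1);
        if (m.find()) {
            System.out.println("group 1: [" + m.group(1) + "]");
            System.out.println("group 2: [" + m.group(2) + "]");
            System.out.println("group 4: [" + m.group(4) + "]");
        }
        System.out.println(moveDot(input1));
        System.out.println(moveDot("this is a &emsp; test."));
    }
}
```

Note that the space after &emsp; is preserved, and that the pattern does not check that the closing tag matches the opening one.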
answered 2013-06-12T00:24:53.910