
I have a large amount of text data from which I can extract a particular section. The section looks like this:

Caption = "Universal Plug and Play Device Host"
   CheckPoint = 0
   CreationClassName = "Win32_Service"
   Description = "Provides support to host Universal Plug and Play devices."
   DesktopInteract = FALSE
   DisplayName = "Universal Plug and Play Device Host"
   ErrorControl = "Normal"
   ExitCode = 1077
   Name = "upnphost"
   PathName = "C:\\WINDOWS\\system32\\svchost.exe -k LocalService"
   ProcessId = 0
   ServiceSpecificExitCode = 0
   ServiceType = "Share Process"
   Started = FALSE
   StartMode = "Disabled"
   StartName = "NT AUTHORITY\\LocalService"
   State = "Stopped"
   Status = "OK"
   SystemCreationClassName = "Win32_ComputerSystem"
   SystemName = "KYAKKALA-WXP"
   TagId = 0
   WaitHint = 0

I need to split this text into name/value pairs and store them as groups.

I tried the following regular expression:

String REGEX ="(Caption)\\s=.*?(VMware USB.*)\"\\;\\n((?:(\\w+)\\s+=\\s+(.*)\\n)   {1,21}?)";

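Applying this regex, I get group 1 = "Caption", group 2 = "VMware USB Arbitration Service", group 3 = "WaitHint" and group 4 = "0". I need the data from all 21 lines,
but it only captures the first and the last line.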


2 Answers


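You cannot capture an arbitrary number of groups with a single regular expression.
Give up on the multi-line match and instead apply a regex for one line repeatedly across the input (the "global" modifier in other regex flavors; in Java, calling Matcher.find() in a loop). You can then iterate over the results.

...or use Properties, as Deepak Bala suggested.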
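
The Properties suggestion can be sketched roughly as follows. This assumes that "Properties" refers to java.util.Properties, which is a guess since the referenced comment is not quoted here; the class name PropertiesSketch and the method parse are hypothetical names for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: treats one block of "Name = value" lines as a properties file.
class PropertiesSketch {
    static Properties parse(String block) {
        Properties props = new Properties();
        try {
            // load() skips leading whitespace and accepts '=' as the key/value separator.
            props.load(new StringReader(block));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String block = "Caption = \"Universal Plug and Play Device Host\"\n"
                + "   CheckPoint = 0\n"
                + "   PathName = \"C:\\\\WINDOWS\\\\system32\\\\svchost.exe -k LocalService\"\n";
        Properties props = parse(block);
        System.out.println(props.getProperty("CheckPoint"));
        // The surrounding quotes are kept; the doubled backslashes are unescaped by load().
        System.out.println(props.getProperty("PathName"));
    }
}
```

Some caveats with this approach: values keep their surrounding double quotes, Properties does not preserve line order, and if the text contains several blocks with the same keys, later values overwrite earlier ones, so the text would have to be split into blocks first.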

answered 2013-04-10T08:19:15.743

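It seems that a capturing group's content is overwritten when the group sits inside a *, + or {...} quantifier. So with something like (?:(...))*, every new repetition overwrites group 1, and when you print group 1 you only see the last repetition.

However, you can do something like the following (adapt it to your needs):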
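
A small sketch of this overwriting behavior, using a simplified pattern rather than the one from the question; the class name RepeatedGroupDemo and the method lastCapture are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Matches all lines with one repeated group and returns what groups 1 and 2 hold afterwards.
    static String[] lastCapture(String text) {
        Pattern p = Pattern.compile("(?:(\\w+)\\s*=\\s*(\\S+)\\n)+");
        Matcher m = p.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("input does not match");
        }
        // Only the final repetition of the quantified group is still available here.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = lastCapture("CheckPoint = 0\nExitCode = 1077\nTagId = 0\n");
        System.out.println(g[0] + " -> " + g[1]); // prints "TagId -> 0"
    }
}
```

All three lines take part in the match, but only the last name/value pair can be read back from the groups.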

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "Caption = \"Universal Plug and Play Device Host\"\n"+
  "  CheckPoint = 0\n"+
  "  CreationClassName = \"Win32_Service\"\n"+
  "  Description = \"Provides support to host Universal Plug and Play devices\"";

String regex = "(?:^|\n)\\s*(\\w*)\\s*=\\s*(.*?)(?=\r?\n|$)";

Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
   System.out.println("Name = " + m.group(1));
   System.out.println("Value = " + m.group(2));
}

This prints:

Name = Caption
Value = "Universal Plug and Play Device Host"
Name = CheckPoint
Value = 0
Name = CreationClassName
Value = "Win32_Service"
Name = Description
Value = "Provides support to host Universal Plug and Play devices"
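
Since the question asks for the pairs to be stored rather than printed, the same loop could fill a map. This is only a sketch under the assumption that each block has unique names; ServiceBlockParser and parse are hypothetical names, and the pattern is the one used in this answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ServiceBlockParser {
    // Same pattern as in this answer: one "Name = value" pair per line.
    private static final Pattern PAIR =
            Pattern.compile("(?:^|\n)\\s*(\\w*)\\s*=\\s*(.*?)(?=\r?\n|$)");

    // Collects every pair of one block; LinkedHashMap keeps the original line order.
    static Map<String, String> parse(String block) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(block);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String str = "Caption = \"Universal Plug and Play Device Host\"\n"
                + "  CheckPoint = 0\n"
                + "  CreationClassName = \"Win32_Service\"\n";
        Map<String, String> fields = parse(str);
        System.out.println(fields.get("CheckPoint")); // prints "0"
        System.out.println(fields.keySet());
    }
}
```

String values keep their surrounding double quotes here; strip them afterwards if they are not wanted.
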
answered 2013-04-10T08:19:41.400