I have a regex that needs to find matches in a multiline string. This is the string I am using:

Device Identifier:        disk0
Device Node:              /dev/disk0
Part of Whole:            disk0
Device / Media Name:      OCZ-VERTEX2 Media 

Volume Name:              Not applicable (no file system)

Mounted:                  Not applicable (no file system)

File System:              None

Content (IOContent):      GUID_partition_scheme
OS Can Be Installed:      No
Media Type:               Generic
Protocol:                 SATA
SMART Status:             Verified

Total Size:               240.1 GB (240057409536 Bytes) (exactly 468862128 512-Byte-Blocks)
Volume Free Space:        Not applicable (no file system)
Device Block Size:        512 Bytes

Read-Only Media:          No
Read-Only Volume:         Not applicable (no file system)
Ejectable:                No

Whole:                    Yes
Internal:                 Yes
Solid State:              Yes
OS 9 Drivers:             No
Low Level Format:         Not supported

Basically I need to separate each line into two groups with the colon as the separator. The regex I am using is:

@"([A-Za-z0-9\(\) \-\/]+):([A-Za-z0-9\(\) \-\/]+).*"

It works, but only for the first line: it splits that line into the two groups I want and then stops. I have tried the Multiline option, but it makes no difference.

I must admit I am new to the regex world.

Any help is appreciated.

4 Answers

The following example seems to work, and also uses named groups to make comprehension of the regular expression a bit easier.

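    using System;
    using System.Linq;
    using System.Text.RegularExpressions;
    // Key: text before the first colon. Value: the rest of the line.
    // [ \t]* (rather than \s*) keeps an empty value from swallowing the next line.
    var rgx = new Regex(@"(?<Key>[^:\r\n]+):[ \t]*(?<Value>[^\r\n]*)");
    foreach (var match in rgx.Matches(str).Cast<Match>())
    {
        Console.WriteLine("{0}: {1}", match.Groups["Key"].Value, match.Groups["Value"].Value);
    }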

For fun, this converts the whole thing to an easy to use dictionary:

    var dictionary = rgx.Matches(str).Cast<Match>().ToDictionary(match => match.Groups["Key"].Value, match => match.Groups["Value"].Value);

Answered 2012-04-08T11:31:06.123

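The problem with your regex is the trailing .*. By default . matches every character except \n, but with RegexOptions.Singleline it matches \r\n as well, so the first match consumes the entire rest of the string.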
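To see the effect, here is a minimal sketch that counts matches for the question's pattern with and without Singleline. The three-line sample is a shortened excerpt of the input, and both the \r\n line endings and the names SinglelineDemo and CountMatches are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Shortened excerpt of the question's input; "\r\n" line endings are an assumption.
    public const string Sample =
        "Device Identifier:        disk0\r\n" +
        "Device Node:              /dev/disk0\r\n" +
        "Part of Whole:            disk0\r\n";

    // The pattern from the question, unchanged.
    public const string Pattern = @"([A-Za-z0-9\(\) \-\/]+):([A-Za-z0-9\(\) \-\/]+).*";

    // Counts how many separate matches the pattern yields under the given options.
    public static int CountMatches(RegexOptions options) =>
        Regex.Matches(Sample, Pattern, options).Count;

    public static void Main()
    {
        // Default: '.' stops at '\n', so every line produces its own match.
        Console.WriteLine("None:       " + CountMatches(RegexOptions.None));
        // Singleline: '.' also matches '\n', so the first match swallows the rest.
        Console.WriteLine("Singleline: " + CountMatches(RegexOptions.Singleline));
    }
}
```

On this sample the default options give 3 matches and Singleline gives 1. If you see only one match without Singleline, it is worth checking whether the code calls Regex.Match (a single result) rather than Regex.Matches.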

Answered 2012-04-08T11:26:56.730

I would suggest using String.Split instead. Assuming all your keys are unique:

    string[] lines = str.Split(new char[] { '\r', '\n' },
        StringSplitOptions.RemoveEmptyEntries);

    Dictionary<string, string> dict = lines.ToDictionary(
        line => line.Split(':').First(),
        line => line.Split(new char[] { ':' }, 2).Last().Trim());

Answered 2012-04-08T11:29:10.893

If you are using the regex option Singleline, then the .* will match the entire remaining string, so there is only one match.

Singleline tells the regex engine to additionally accept line feeds (i.e. \n) when matching . (the dot).

Do you even need the .* at all?

Alternatively, you could use

^([A-Za-z0-9\(\) \-\/]+):([A-Za-z0-9\(\) \-\/]+)$

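This works as long as you use it with the regex option Multiline, which makes ^ and $ match at the start and end of each line rather than of the whole string. Note that in .NET $ matches only before \n, so if the lines end in \r\n you need \r?$ instead of $.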
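The anchored variant can be tried out with a small sketch like the one below. It compares a plain $ with \r?$ under RegexOptions.Multiline. The two-line sample, its \r\n line endings, and the names MultilineDemo, Part, Strict, Tolerant and Count are assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Character class copied from the question's pattern.
    const string Part = @"([A-Za-z0-9\(\) \-\/]+)";

    // Anchored pattern as suggested above, and a variant that tolerates a trailing '\r'.
    public const string Strict = "^" + Part + ":" + Part + "$";
    public const string Tolerant = "^" + Part + ":" + Part + @"\r?$";

    // Shortened excerpt of the input; "\r\n" line endings are an assumption.
    public const string Crlf =
        "Device Identifier:        disk0\r\n" +
        "Device Node:              /dev/disk0\r\n";

    // Counts matches with ^ and $ anchored per line.
    public static int Count(string input, string pattern) =>
        Regex.Matches(input, pattern, RegexOptions.Multiline).Count;

    public static void Main()
    {
        // '$' only matches before '\n', so the '\r' blocks every line here.
        Console.WriteLine("strict:   " + Count(Crlf, Strict));
        // Allowing an optional '\r' before '$' lets each line match.
        Console.WriteLine("tolerant: " + Count(Crlf, Tolerant));
    }
}
```

On this sample the strict pattern finds 0 matches and the tolerant one finds 2. With anchors in place the whole line must fit the character classes, so lines from the full input whose value contains a character outside them, such as the . in "240.1 GB" or the _ in "GUID_partition_scheme", would be skipped unless the class is widened.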

Answered 2012-04-08T11:31:46.043