3

I want to split a string like this on commas:

 field1:"value1", field2:"value2", field3:"value3,value4"

into a string[] that looks like this:

0     field1:"value1"
1     field2:"value2"
2     field3:"value3,value4"

I'm trying to do this with Regex.Split, but I can't seem to work out the regular expression.

4

5 Answers

7

This is much easier to do with Matches than with Split, for example:

string[] asYouWanted = Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

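That said, if there is any chance that your values (or fields!) contain escaped quotes (or anything similarly tricky), you may be better off with a proper CSV parser.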
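For reference, the Matches approach above can be put together as a small self-contained program. This is only a sketch: the class name FieldMatcher and the method Extract are made up for illustration, and it assumes that field names consist of ASCII letters and digits and that values contain no escaped quotes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FieldMatcher
{
    // Hypothetical helper (not part of the original answer): returns every
    // name:"value" pair found in the input, in order of appearance.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // The sample string from the question, including its leading space.
        string input = " field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";

        foreach (string field in Extract(input))
        {
            Console.WriteLine(field);
        }
    }
}
```

Because the pattern matches whole fields rather than separators, the comma inside "value3,value4" is never treated as a delimiter, and the whitespace between fields is dropped without any trimming.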


If you do have escaped quotes in your values, I think the following regex works; give it a test:

@"field3:""value3\\"",value4""", @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)"""

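The added (?<=(?<!\\)(\\\\)*) should ensure that the " on which the match stops is preceded only by an even number of backslashes, since an odd number of backslashes means the quote is escaped.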
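A minimal way to try the lookbehind out is sketched below. The class name EscapedFieldMatcher, the method Extract, and the test input are assumptions made for illustration. Note that the input here has a single backslash before the inner quote, so that quote counts as escaped.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedFieldMatcher
{
    // Same pattern as in the answer: the lookbehind only accepts a closing
    // quote preceded by an even number (possibly zero) of backslashes.
    private const string Pattern = @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)""";

    // Hypothetical helper: returns each matched name:"value" pair.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Verbatim string; the actual text is:
        //   field2:"value2", field3:"value3\",value4"
        string input = @"field2:""value2"", field3:""value3\"",value4""";

        foreach (string field in Extract(input))
        {
            Console.WriteLine(field);
        }
    }
}
```

With this input the second match should run through the escaped quote and end at the final quote, because the quote after value3\ is preceded by an odd number of backslashes and is rejected by the lookbehind.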

answered 2012-12-17T14:22:04.887
1

Untested, but this should be fine:

string[] parts = input.Split(new string[] { "\"," }, StringSplitOptions.None);

Remember to add the " back onto the end of each part if you need it.

answered 2012-12-17T14:25:24.870
1
string[] arr = str.Split(new string[] { "\"," }, StringSplitOptions.None).Select(s => s + "\"").ToArray();

Split on ", as webnoob mentioned, then use Select to append the trailing ", then convert to an array.
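One detail worth watching with this approach is that the last piece never loses its closing quote, and the pieces keep the space that followed each comma. The sketch below handles both; the names QuoteCommaSplitter and SplitFields are hypothetical, and it assumes that no value itself contains the sequence ", (a quote followed by a comma).

```csharp
using System;
using System.Linq;

public static class QuoteCommaSplitter
{
    // Hypothetical helper: splits on the two-character separator ",
    // and restores the closing quote that Split removed.
    public static string[] SplitFields(string input)
    {
        string[] parts = input.Split(new[] { "\"," }, StringSplitOptions.None);

        // Every piece except the last one lost its closing quote to the separator.
        for (int i = 0; i < parts.Length - 1; i++)
        {
            parts[i] += "\"";
        }

        // Remove the whitespace left over from around each separator.
        return parts.Select(p => p.Trim()).ToArray();
    }

    public static void Main()
    {
        string input = " field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";

        foreach (string field in SplitFields(input))
        {
            Console.WriteLine(field);
        }
    }
}
```

Appending the quote to all pieces but the last, rather than to every piece, avoids a doubled quote at the end of the final field.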

answered 2012-12-17T14:27:23.110
0

Try this:

// (\w.+?):"(\w.+?)"        
//         
// Match the regular expression below and capture its match into backreference number 1 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the characters “:"” literally «:"»        
// Match the regular expression below and capture its match into backreference number 2 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the character “"” literally «"»        


try {
    Regex regObj = new Regex(@"(\w.+?):""(\w.+?)""");
    Match matchResults = regObj.Match(sourceString);
    // The number of matches is not known up front, so collect them in a list
    // (requires using System.Collections.Generic;)
    List<string> results = new List<string>();
    while (matchResults.Success) {
        results.Add(matchResults.Value);
        matchResults = matchResults.NextMatch();
    }
    string[] arr = results.ToArray();
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
answered 2012-12-17T14:26:17.167
0

The simplest built-in way is here. I have checked it and it works fine: it splits "Hai,\"Hello,World\"" into {"Hai","Hello,World"}.

answered 2014-03-19T08:14:52.500