c# - Split on comma if the comma is not between two double quotes

Question

I want to split a string like this on commas:

 field1:"value1", field2:"value2", field3:"value3,value4"

into a string[] that looks like:

0     field1:"value1"
1     field2:"value2"
2     field3:"value3,value4"

I'm trying to do this with Regex.Split, but I can't seem to work out the regular expression.

score 7 · Accepted Answer

This is much easier to do with Matches than with Split, for example:

string[] asYouWanted = Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

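That said, if there is a chance that your values (or fields!) contain escaped quotes (or anything similarly tricky), you might be better off with a proper CSV parser.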

If you do have escaped quotes in your values, I think the following regex works. Give it a test:

@"field3:""value3\\"",value4""", @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)"""

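Adding (?<=(?<!\\)(\\\\)*) should ensure that the " the match stops on is preceded by an even number of backslashes only, since an odd number of backslashes means the quote is escaped.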
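To try the lookbehind on a concrete string, here is a minimal sketch. The class name EscapedQuoteDemo, the helper SplitFields, and the sample input are made up for illustration; only the pattern comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedQuoteDemo
{
    // Pattern from the answer above: the closing quote must be preceded
    // by an even number of backslashes (zero counts as even).
    private const string Pattern = @"[A-Za-z0-9]+:"".*?(?<=(?<!\\)(\\\\)*)""";

    // Hypothetical helper: returns every name:"value" match as a string.
    public static string[] SplitFields(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Illustrative input: field2 contains an escaped quote followed by a comma.
        string input = @"field1:""value1"", field2:""a\"",b"", field3:""value3,value4""";
        foreach (string field in SplitFields(input))
        {
            Console.WriteLine(field);
        }
    }
}
```

With this input the escaped quote in field2 should not end the match, so field2 comes back as one piece including its inner comma.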

score 1 · Accepted Answer

Untested, but this should be fine:

string[] parts = input.Split(new string[] { "\"," }, StringSplitOptions.None);

Remember to add the " back onto the end of each part if you need it.

score 1 · Accepted Answer

string[] arr = str.Split(new string[] { "\"," }, StringSplitOptions.None)
    .Select(s => s.EndsWith("\"") ? s : s + "\"")  // the last piece still has its closing quote
    .ToArray();

Split on ", as webnoob mentioned, then use Select to append the trailing ", then convert to an array.
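Splitting on ", only works while every value ends in a quote directly followed by a comma. As an alternative that does not come from any answer here, a character scan can track whether it is currently inside quotes. This is a minimal sketch under the assumption that quotes are balanced and never escaped; QuoteAwareSplitter and its Split method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareSplitter
{
    // Splits on commas that are outside double quotes.
    // Assumption: quotes are balanced and never escaped.
    public static string[] Split(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in input)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;   // entering or leaving a quoted value
                current.Append(c);
            }
            else if (c == ',' && !inQuotes)
            {
                // A comma outside quotes ends the current field.
                parts.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing comma, so add it explicitly.
        parts.Add(current.ToString().Trim());
        return parts.ToArray();
    }

    public static void Main()
    {
        string input = " field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";
        foreach (string part in Split(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

The Trim call is a design choice for the question's input, where a space follows each separating comma; drop it if surrounding whitespace matters.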

score 0 · Accepted Answer

Try this:

// (\w.+?):"(\w.+?)"        
//         
// Match the regular expression below and capture its match into backreference number 1 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the characters “:"” literally «:"»        
// Match the regular expression below and capture its match into backreference number 2 «(\w.+?)»        
//    Match a single character that is a “word character” (letters, digits, and underscores) «\w»        
//    Match any single character that is not a line break character «.+?»        
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»        
// Match the character “"” literally «"»        


try {
    Regex regObj = new Regex(@"(\w.+?):""(\w.+?)""");
    // Collect every match up front so the array can be sized correctly
    MatchCollection matches = regObj.Matches(sourceString);
    string[] arr = new string[matches.Count];
    for (int i = 0; i < matches.Count; i++) {
        arr[i] = matches[i].Value;
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}

score 0 · Accepted Answer

The simplest built-in way is here. I have checked it and it works fine. It splits "Hai,\"Hello,World\"" into {"Hai","Hello,World"}.
