
I have the following string:

a,b,c,d.e(f,g,h,i(j,k)),l,m,n

Could someone tell me how to build a regular expression that returns only the "first level" of parentheses, like this:

[0] = a,b,c,
[1] = d.e(f,g,h,i(j,k))
[2] = l,m,n

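The goal is to keep each parenthesized part, including everything nested inside it, under a single index so that it can be processed later.

Thank you.

Edit

Trying to improve the example...

Imagine I have this string: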

username,TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)),password

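My goal is to turn the string into a dynamic query. Fields that do not start with "TB_" are fields of the main table; otherwise, the fields listed inside the parentheses belong to another, related table. What I am struggling with is retrieving all the "first level" fields: once I have them, I can separate out the related tables and recover the remaining fields recursively.

In the end, I would have something like: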

[0] = username,password
[1] = TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2))

I hope that explains it a bit better, sorry.


3 Answers

12

You can use this:

(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+

With your example, you get:

0 => username
1 => TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2))
2 => password

Explanation:

(?>\w+\.)? \w+ \(    # the opening parenthesis (with the function name)
(?>                  # open an atomic group
    \(  (?<DEPTH>)   # when an opening parenthesis is encountered,
                     #  then push onto the stack named DEPTH
  |                  # OR
    \) (?<-DEPTH>)   # when a closing parenthesis is encountered,
                     #  then pop from the stack named DEPTH
                     #  (this branch fails if DEPTH is already empty)
  |                  # OR
    [^()]+           # content that is not a parenthesis
)*                   # close the atomic group, repeat zero or more times
\)                   # the closing parenthesis
(?(DEPTH)(?!))       # conditional: if the stack named DEPTH is not empty
                     #  then fail (i.e. the parentheses are not balanced)
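The commented layout above can also be handed to the regex engine as it stands. The following is a minimal sketch, assuming .NET's `RegexOptions.IgnorePatternWhitespace`, which ignores unescaped whitespace in the pattern and treats `#` as the start of a comment. The class name `VerbosePattern` and the method `FirstLevel` are hypothetical names chosen for illustration, and the trailing `| \w+` alternative is taken from the one-line pattern rather than from the explanation.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VerbosePattern
{
    // The explained pattern plus the "| \w+" alternative for plain fields.
    private const string Pattern = @"
        (?>\w+\.)? \w+ \(      # optional prefix and name, then the opening parenthesis
        (?>
            \( (?<DEPTH>)      # nested opening parenthesis: push onto DEPTH
          |
            \) (?<-DEPTH>)     # nested closing parenthesis: pop from DEPTH
          |
            [^()]+             # anything that is not a parenthesis
        )*
        \)                     # the closing parenthesis of the first level
        (?(DEPTH)(?!))         # fail if DEPTH is not empty
      |
        \w+                    # a plain field without parentheses
    ";

    // Returns every first-level item found in the input, in order.
    public static List<string> FirstLevel(string input)
    {
        var items = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            items.Add(m.Value);
        }
        return items;
    }
}
```

Keeping the comments inside the pattern means the explanation and the expression cannot drift apart, at the cost of a longer string literal.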

You can try it with the following code:

using System;
using System.Text.RegularExpressions;

string input = "username,TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)),password";
string pattern = @"(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+";
MatchCollection matches = Regex.Matches(input, pattern);
// Each match is one first-level item: a plain field or a complete name(...) group.
foreach (Match match in matches)
{
    Console.WriteLine(match.Value);
}
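To get from these matches to the grouping asked for in the question (plain fields on one side, related tables on the other, then recursion into each table), something along the following lines might work. It is only a sketch: `FieldGrouping`, `Group` and `Inner` are hypothetical names, and it assumes that any match containing a parenthesis is a related table.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldGrouping
{
    private const string Pattern =
        @"(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+";

    // Sorts the first-level matches into plain fields and parenthesized groups.
    public static void Group(string input, List<string> plainFields, List<string> relatedTables)
    {
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            if (m.Value.Contains("("))
                relatedTables.Add(m.Value);
            else
                plainFields.Add(m.Value);
        }
    }

    // Returns the text between the outermost parentheses of a matched group.
    // Assumes the group ends with ')', which the pattern guarantees.
    public static string Inner(string group)
    {
        int open = group.IndexOf('(');
        return group.Substring(open + 1, group.Length - open - 2);
    }
}
```

Calling `Group` again on `Inner(relatedTables[0])` yields the fields of the related table and any tables nested inside it, so the same two calls can be repeated until no parenthesized groups remain.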
answered 2013-10-25T18:54:54.373
0

If I understand your example correctly, you are looking for something like this:

(?<head>[a-zA-Z._]+\,)*(?<body>[a-zA-Z._]+[(].*[)])(?<tail>.*)

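For the given string:

username,TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)),password

this expression will match

  • username, for the group head
  • TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)) for the group body
  • ,password for the group tail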
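A minimal sketch of reading the three named groups in C# follows; `HeadBodyTail` and `Split` are hypothetical names used only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class HeadBodyTail
{
    private const string Pattern =
        @"(?<head>[a-zA-Z._]+\,)*(?<body>[a-zA-Z._]+[(].*[)])(?<tail>.*)";

    // Returns { head, body, tail } for the first match,
    // or an empty array when the input does not match.
    public static string[] Split(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[]
        {
            m.Groups["head"].Value,
            m.Groups["body"].Value,
            m.Groups["tail"].Value
        };
    }
}
```

Two properties of this expression are worth keeping in mind. The `.*` inside body is greedy, so with two sibling groups such as `a,TA.f(x),b,TB.f(y),c` the body runs to the last closing parenthesis and becomes `TA.f(x),b,TB.f(y)`. The head group is also repeated, so `Groups["head"].Value` holds only its last capture; earlier ones are available through `Groups["head"].Captures`.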
answered 2013-10-25T18:08:35.210
0

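I would suggest a new strategy, R2: do it algorithmically. While you could build a regular expression that eventually gets close to what you are asking for, it will be very hard to maintain and hard to extend as you discover new edge cases. I don't speak C#, but this pseudocode should get you on the right track: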

function parenthetical_depth(some_string):
    open = count '(' in some_string
    close = count ')' in some_string
    return open - close

function smart_split(some_string):
    bits = split some_string on ','
    new_bits = empty list
    bit = empty string
    while bits has next:
        bit = fetch next from bits
        while parenthetical_depth(bit) != 0:
            bit = bit + ',' + fetch next from bits
        place bit into new_bits
    return new_bits

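This is the easiest way to understand it, but the algorithm as written is O(n^2). Keeping a running depth in the inner loop, instead of recounting the whole accumulated string each time, makes it O(n) (string copying aside, which is the worst part of it):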

depth = parenthetical_depth(bit)
while depth != 0:
    nbit = fetch next from bits
    depth = depth + parenthetical_depth(nbit)
    bit = bit + ',' + nbit
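The pseudocode might translate to C# roughly as follows. This is a sketch, not a tested library routine: `SmartSplitter`, `ParentheticalDepth` and `SmartSplit` are hypothetical names mirroring the pseudocode, and the `i < bits.Length` guard in the inner loop is an added assumption so that unbalanced input ends the loop instead of reading past the last piece.

```csharp
using System.Collections.Generic;

public static class SmartSplitter
{
    // Number of '(' minus number of ')' in the given piece.
    public static int ParentheticalDepth(string piece)
    {
        int depth = 0;
        foreach (char c in piece)
        {
            if (c == '(') depth++;
            else if (c == ')') depth--;
        }
        return depth;
    }

    public static List<string> SmartSplit(string input)
    {
        string[] bits = input.Split(',');
        var result = new List<string>();
        int i = 0;
        while (i < bits.Length)
        {
            string bit = bits[i++];
            int depth = ParentheticalDepth(bit);
            // Keep appending pieces until every parenthesis opened so far is closed.
            // The bounds check is an assumption added to cope with unbalanced input.
            while (depth != 0 && i < bits.Length)
            {
                string next = bits[i++];
                depth += ParentheticalDepth(next);
                bit = bit + "," + next;
            }
            result.Add(bit);
        }
        return result;
    }
}
```

Because the pieces are rejoined with the same comma they were split on, whitespace inside a group (such as the space before `num_phone2`) is preserved unchanged.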

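The string copying could be made more efficient with clever use of buffers and buffer sizes, at the cost of space efficiency, but I don't think C# natively gives you that level of control.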
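On the copying cost: .NET does provide `System.Text.StringBuilder`, a growable character buffer, which avoids building a new string on every concatenation. The sketch below goes one step further than the pseudocode and swaps in a different technique: rather than splitting on commas and rejoining, it scans the input once and tracks the depth per character. `SinglePassSplitter` and `Split` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SinglePassSplitter
{
    // Splits on commas that are outside all parentheses, in one pass.
    public static List<string> Split(string input)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        int depth = 0;
        foreach (char c in input)
        {
            if (c == '(') depth++;
            else if (c == ')') depth--;

            if (c == ',' && depth == 0)
            {
                // A comma outside all parentheses ends the current item.
                result.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        // Whatever remains after the last top-level comma is the final item.
        result.Add(current.ToString());
        return result;
    }
}
```

Each character is appended to the buffer once, and each item is copied out once when it is complete.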

answered 2013-10-25T18:24:36.893