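I have a very long text that contains many items like ( hello , hi ) or (hello,hi), and I have to take the spacing into account. How can I detect them in the long text, retrieve the words hello and hi, and add them to a list? Currently I use this regex: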

    string helpingWordPattern = "(?<=\\()(.*?)(?<=\\))";
    Regex regexHelpingWord = new Regex(helpingWordPattern);

    foreach (Match m in regexHelpingWord.Matches(lstQuestion.QuestionContent))
    {
        // remove "," and store the helping words in a list
        string str = m.ToString();
        if (str.Contains(","))
        {
            string[] strWords = str.Split(','); // a word will still contain a ")", e.g. " whole)"

            if (strWords.Contains(")"))
            {
                strWords.Replace(")", ""); // Try to remove them. ERROR here because I can't call Replace on an array.
            }

            foreach (string words in strWords)
            {
                options.Add(words);
            }
        }
    }

I googled and searched for the right regex. The regex I'm using was also supposed to remove the ), but it doesn't.

3 Answers

Put the `\\(` and `\\)` bracket matchers outside the group you want to capture?

Regex regex = new Regex("\\((.*?)\\)");
foreach (Match m in regex.Matches(longText)) {
    string inside = m.Groups[1].Value;  // the text inside the brackets
    ...
}

Then use `m.Groups[1].Value` rather than the whole matched text.
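
To show how the captured group could feed the list from the question, here is a minimal sketch. The class name `HelpingWords` and the method `ExtractOptions` are made up for illustration, and splitting on `,` followed by `Trim()` is my assumption about how the spacing should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HelpingWords
{
    // Hypothetical helper: collects every comma-separated word found inside (...) groups.
    public static List<string> ExtractOptions(string text)
    {
        var options = new List<string>();

        // Group 1 holds the text between the parentheses, without the parentheses themselves.
        foreach (Match m in Regex.Matches(text, @"\((.*?)\)"))
        {
            foreach (string part in m.Groups[1].Value.Split(','))
            {
                string word = part.Trim();   // drop the spaces around each word
                if (word.Length > 0)
                {
                    options.Add(word);
                }
            }
        }
        return options;
    }
}
```

Calling `HelpingWords.ExtractOptions("txt ( hello , hi ) txt (hello,hi)")` should give a list with `hello`, `hi`, `hello`, `hi`. Because the parentheses sit outside the group, there is no `)` left to strip afterwards.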

answered 2013-08-01T02:47:38.047

You can also use this regex pattern:

(?<=[\(,])(.*?)(?=[\),])
(?<=[\(,])(\D*?)(?=[\),])  // for anything except digits

Breakdown:

(?<=[\(,])  = Positive lookbehind, requires a `(` or `,` immediately before the match
(.*?)       = Matches any characters except newline, lazily (as few as possible)
(?=[\),])   = Positive lookahead, requires a `)` or `,` immediately after `hello`, `hi`, etc.
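
As a rough illustration of how this pattern behaves, here is a small sketch. `LookaroundDemo` and `MatchItems` are hypothetical names, and the `Trim()` call is my addition, since the lookarounds leave any spaces around the words inside the match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns each lookaround match with surrounding spaces removed.
    public static List<string> MatchItems(string text)
    {
        var items = new List<string>();

        // Each match is the text between a "(" or "," and the next ")" or ",".
        foreach (Match m in Regex.Matches(text, @"(?<=[\(,])(.*?)(?=[\),])"))
        {
            items.Add(m.Value.Trim());
        }
        return items;
    }
}
```

For `( hello , hi )` this returns `hello` and `hi`. Note that the pattern also anchors on commas outside parentheses: for `a, b (x, y)` it returns `b (x` and `y`, so it seems best suited to text where commas only appear inside the brackets.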

Demo

Edit

You can try this sample code to achieve it (untested):

List<string> lst = new List<string>();
MatchCollection mcoll = Regex.Matches(sampleStr, @"(?<=[\(,])(.*?)(?=[\),])");

foreach(Match m in mcoll)
{
    lst.Add(m.ToString());
    Debug.Print(m.ToString());   // Optional, check in Output window.
}
answered 2013-08-01T03:03:09.080

There are many different ways to do this. Below is some code that uses regex matching and splitting.


string input = "txt ( apple , orange) txt txt txt ( hello, hi,5 ) txt txt txt txt";

List<string> Options = new List<string>();

Regex regexHelpingWord = new Regex(@"\((.+?)\)");

foreach (Match m in regexHelpingWord.Matches(input))
{

    string words = Regex.Replace(m.ToString(), @"[()]", "");

    Regex regexSplitComma = new Regex(@"\s*,\s*");


    foreach (string word in regexSplitComma.Split(words))
    {
        string Str = word.Trim();
        double Num;
        bool isNum = double.TryParse(Str, out Num);
        if (!isNum) Options.Add(Str);
    }

}
answered 2013-08-01T03:00:52.470