For example, I have this input: "Test your Internet connection bandwidth. Test your Internet connection bandwidth." (the same sentence repeated twice), and I want to search for the string "internet bandwidth".

string keyword = tbSearch.Text; // holds the value "internet bandwidth"
string input = "Test your Internet connection bandwidth. Test your Internet connection bandwidth.";

Regex r = new Regex(keyword.Replace(' ', '|'), RegexOptions.IgnoreCase);
if (r.Matches(input).Count == keyword.Split(' ').Length)
{
    //Do something
}

This doesn't work: it finds 2 matches for "internet" and 2 for "bandwidth", so the match count is 4 while the number of keywords is 2. What can I do?

4 Answers

var pattern = keyword.Split()
        .Aggregate(new StringBuilder(),
                   (sb, s) => sb.AppendFormat(@"(?=.*\b{0}\b)", Regex.Escape(s)),
                   sb => sb.ToString());

if (Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase))
{
    // contains all keywords
}

The first part builds the pattern from your keywords. For the two keywords "internet bandwidth", the generated regular expression looks like this:

"(?=.*\binternet\b)(?=.*\bbandwidth\b)"

It matches the following inputs:

"Test your Internet connection bandwidth."
"Test your Internet connection bandwidth. Test your Internet bandwidth."

The following inputs do not match (they do not contain every word):

"Test your Internet2 connection bandwidth bandwidth."
"Test your connection bandwidth."

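As a rough check of the behaviour described here, the pattern can be run against the four sample inputs. This is only a sketch: the class name `KeywordPatternDemo` and the helpers `BuildPattern` and `ContainsAllKeywords` are hypothetical names introduced for illustration, and the keywords are assumed to be separated by whitespace.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordPatternDemo
{
    // Hypothetical helper: emits one lookahead per whitespace-separated keyword,
    // e.g. "internet bandwidth" -> (?=.*\binternet\b)(?=.*\bbandwidth\b)
    public static string BuildPattern(string keyword)
    {
        return string.Concat(
            keyword.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                   .Select(w => @"(?=.*\b" + Regex.Escape(w) + @"\b)"));
    }

    // Hypothetical helper: true when every keyword occurs as a whole word.
    // Note: an empty keyword string yields an empty pattern, which matches any input.
    public static bool ContainsAllKeywords(string input, string keyword)
    {
        return Regex.IsMatch(input, BuildPattern(keyword), RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string[] inputs =
        {
            "Test your Internet connection bandwidth.",
            "Test your Internet connection bandwidth. Test your Internet bandwidth.",
            "Test your Internet2 connection bandwidth bandwidth.",
            "Test your connection bandwidth."
        };

        // Print the result of the check next to each sample input.
        foreach (var input in inputs)
        {
            Console.WriteLine("{0}: {1}", ContainsAllKeywords(input, "internet bandwidth"), input);
        }
    }
}
```

Because each lookahead is zero-width, the order and number of occurrences of the keywords do not matter. "Internet2" is rejected because `\b` requires a word boundary right after "internet", and the digit is a word character.
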
Another option (checking each keyword separately):

var allWordsContained = keyword.Split().All(word => 
    Regex.IsMatch(input, String.Format(@"\b{0}\b", Regex.Escape(word)), RegexOptions.IgnoreCase));
answered 2013-01-15T14:43:48.483

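// Build an alternation such as "internet|bandwidth" from the space-separated keywords.
Regex r = new Regex(keyword.Replace(' ', '|'), RegexOptions.IgnoreCase);
int distinctKeywordsFound = r.Matches(input)
                             .Cast<Match>()
                             .Select(m => m.Value)
                             // Compare case-insensitively so "Internet" and "internet" count once.
                             .Distinct(StringComparer.OrdinalIgnoreCase)
                             .Count();
if (distinctKeywordsFound == keyword.Split(' ').Length)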
{
    //Do something
}
answered 2013-01-15T14:51:39.827

I'm not sure exactly what you are trying to do, but you could try something like this:

public bool allWordsContained(string input, string keyword)
{
    bool result = true;
    string[] words = keyword.Split(' ');

    foreach (var word in words)
    {
        if (input.IndexOf(word, StringComparison.OrdinalIgnoreCase) < 0) // case-insensitive, so "internet" finds "Internet"
            result = false;
    }

    return result;
}

public bool atLeastOneWordContained(string input, string keyword)
{
    bool result = false;
    string[] words = keyword.Split(' ');

    foreach (var word in words)
    {
        if (input.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0) // case-insensitive, so "internet" finds "Internet"
            result = true;
    }

    return result;
}
answered 2013-01-15T14:34:24.353

Here is a solution. The trick is to collect the matches into a list and call Distinct() on it...

    string keyword = "internet bandwidth";
    string input = "Test your Internet connection bandwidth. Test your Internet connection bandwidth.";

    Regex r = new Regex(keyword.Replace(' ', '|'), RegexOptions.IgnoreCase);
    MatchCollection mc = r.Matches(input);
    List<string> res = new List<string>();

    for (int i = 0; i < mc.Count;i++ )
    {
        res.Add(mc[i].Value);
    }

    if (res.Distinct(StringComparer.OrdinalIgnoreCase).Count() == keyword.Split(' ').Length)
    {
        //Do something
    }
answered 2013-01-15T14:35:00.860