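I'm building a tag-like string matching function. It checks whether a string contains any of the possible words while keeping them in order, at least within each tag. I figured it would be best to generate the list of possibilities up front and, when checking, simply see whether the string contains each of the required combinations.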

Maybe some code will make it clearer.

List<List<string[]>> tags;

List<string[]> innerList;

List<List<string>> combinationsList;

public void Generate(string pattern)
{
    // Whitespace trimming will be added later so the separator can be ", " instead of only ",".
    tags = new List<List<string[]>>();

    foreach (string tag in pattern.Split(','))
    {
        innerList = new List<string[]>();

        foreach (var varyword in tag.Split(' '))
        {
            innerList.Add(varyword.Split('|'));
        }

        // Keep the parsed words of this tag; without this line they are lost on the next iteration.
        tags.Add(innerList);
    }

    // Missing: code that generates the combinations as a List<List<string>>
    // and stores them in 'combinationsList'.
}

// The check function will look something like this:
public bool IsMatch(string textToTest)
{
    return combinationsList.All(tc => tc.Any(c => textToTest.Contains(c)));
}

So, for example, the pattern:

"old|young john|bob, have|posses dog|cat"

  • tags:
    • List_1:
      • {old, young}
      • {john, bob}
    • List_2:
      • {have, posses}
      • {dog, cat}
So combinationsList will contain:

  • combinationsList:
    • List_1
      • "old john"
      • "old bob"
      • "young john"
      • "young bob"
    • List_2
      • "have dog"
      • "have cat"
      • "posses dog"
      • "posses cat"

So the results will be:

  • old bob have cat = true, contains List_1:"old bob" and List_2:"have cat"
  • young john have car = false, contains List_1:"young john" but none of the List_2 combinations

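I can't figure out how to iterate through the collection to get those combinations, or how to build one combination per iteration. I also can't mess up the order, so "old john" must not also be generated as "john old".

Note that any "variant word" in the pattern may have more than two variants, for example "dog|cat|mouse".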

2 Answers

This code may help:

string pattern = "old|young john|bob have|posses dog|cat";
var lists = pattern.Split(' ').Select(p => p.Split('|'));

foreach (var line in CartesianProduct(lists))
{
    Console.WriteLine(String.Join(" ",line));
}


//http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
    // base case:
    IEnumerable<IEnumerable<T>> result = new[] { Enumerable.Empty<T>() };
    foreach (var sequence in sequences)
    {
        var s = sequence; // don't close over the loop variable
        // recursive case: use SelectMany to build the new product out of the old one
        result =
            from seq in result
            from item in s
            select seq.Concat(new[] { item });
    }
    return result;
}
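
The snippet above treats the whole pattern as a single tag. To cover the comma-separated tags from the question, one possible sketch is to build a separate product for each tag. The names `TagCombinations`, `Build`, and `IsMatch` below are hypothetical, and the product is computed with an `Aggregate` fold instead of the `CartesianProduct` helper purely to keep the example short; treat it as an illustration under those assumptions, not as the definitive approach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCombinations
{
    // Hypothetical helper: returns one list of space-joined combinations
    // per comma-separated tag in the pattern.
    public static List<List<string>> Build(string pattern)
    {
        return pattern
            .Split(',')
            .Select(tag => tag.Trim())
            .Where(tag => tag.Length > 0)
            .Select(tag => tag
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => word.Split('|'))
                // Fold the alternatives left to right so word order is preserved.
                .Aggregate(
                    new List<string> { "" },
                    (prefixes, alternatives) => prefixes
                        .SelectMany(p => alternatives.Select(a => p.Length == 0 ? a : p + " " + a))
                        .ToList()))
            .ToList();
    }

    // Same check as in the question: every tag needs at least one
    // of its combinations to occur in the text.
    public static bool IsMatch(List<List<string>> combinations, string text)
    {
        return combinations.All(tag => tag.Any(c => text.Contains(c)));
    }
}
```

With the pattern from the question, `Build("old|young john|bob, have|posses dog|cat")` should give two lists of four combinations each, and `IsMatch` should accept "old bob have cat" and reject "young john have car".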
answered 2013-04-26T14:23:59.260


I found the answer in another thread.

https://stackoverflow.com/a/11110641/1156272

The code Adam posted there works flawlessly and does exactly what I need:

        foreach (var tag in pattern.Split(','))
        {
            // Strip leading spaces so ", " works as a separator as well as ",".
            string tg = tag.TrimStart(' ');
            innerList = new List<List<string>>();

            foreach (var varyword in tg.Split(' '))
            {
                innerList.Add(varyword.Split('|').ToList<string>());
            }

            //Adam's code

            List<String> combinations = new List<String>();
            int n = innerList.Count;
            int[] counter = new int[n];
            int[] max = new int[n];
            int combinationsCount = 1;
            for (int i = 0; i < n; i++)
            {
                max[i] = innerList[i].Count;
                combinationsCount *= max[i];
            }
            int nMinus1 = n - 1;
            for (int j = combinationsCount; j > 0; j--)
            {
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < n; i++)
                {
                    builder.Append(innerList[i][counter[i]]);
                    if (i < n - 1) builder.Append(" "); //my addition to insert whitespace between words
                }
                combinations.Add(builder.ToString());

                counter[nMinus1]++;
                for (int i = nMinus1; i >= 0; i--)
                {
                    // overflow check
                    if (counter[i] == max[i])
                    {
                        if (i > 0)
                        {
                            // carry to the left
                            counter[i] = 0;
                            counter[i - 1]++;
                        }
                    }
                }
            }

            //end

            if(combinations.Count > 0)
                combinationsList.Add(combinations);
        }
    }

    public bool IsMatch(string textToCheck)
    {
        if (combinationsList.Count == 0) return true;

        string t = _caseSensitive ? textToCheck : textToCheck.ToLower();

        return combinationsList.All(tg => tg.Any(c => t.Contains(c)));
    }

It looks like magic, but it works. Thanks, everyone.
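
For anyone puzzled by the `counter` array in the code above: as far as I can tell it behaves like an odometer. The rightmost index advances on every step, and whenever an index reaches its maximum it resets to zero and carries into the position to its left. Here is a minimal sketch that isolates just that idea; `Odometer` and `Indices` are hypothetical names, not part of Adam's code.

```csharp
using System;
using System.Collections.Generic;

public static class Odometer
{
    // Yields every index tuple for the given per-position sizes,
    // with the rightmost position changing fastest.
    public static IEnumerable<int[]> Indices(int[] sizes)
    {
        // A position with no alternatives means there is no valid combination.
        foreach (int size in sizes)
            if (size <= 0) yield break;

        var counter = new int[sizes.Length];
        while (true)
        {
            // Hand out a copy so callers cannot disturb the running counter.
            yield return (int[])counter.Clone();

            // Advance like an odometer: bump the last digit, carry left on overflow.
            int i = sizes.Length - 1;
            while (i >= 0 && ++counter[i] == sizes[i])
            {
                counter[i] = 0;
                i--;
            }
            if (i < 0) yield break; // every position wrapped around: done
        }
    }
}
```

For sizes `{ 2, 2 }` this visits the index pairs (0,0), (0,1), (1,0), (1,1), which is the same order in which "old john", "old bob", "young john", "young bob" are produced.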

answered 2013-04-26T21:53:18.880