I am new to LINQ and am trying to convert one of my SQL queries to C#. Let us say I have the following set of strings:

ABC-pqr-cv3-xa
LKJ-eqq-cb2-ya
POI-qqq-aaa-1
ABC-pqr-cv3-xb
UIO-qqq-xa
LKJ-eqq-cb2-za
POI-qqq-aaa-2
UIO-qqq-xb
LKJ-eqq-cb2-yb
POI-qqq-aaa-3

I want to group these strings so that two strings fall in the same group when they match on everything except the last character. This is the output I expect:

ABC-pqr-cv3-xa -- 1
ABC-pqr-cv3-xb -- 1

LKJ-eqq-cb2-ya -- 2
LKJ-eqq-cb2-yb -- 2

UIO-qqq-xa -- 3
UIO-qqq-xb -- 3

POI-qqq-aaa-1 -- 4
POI-qqq-aaa-2 -- 4
POI-qqq-aaa-3 -- 4

LKJ-eqq-cb2-za -- 5

Doing this naively would require O(n^2) comparisons. Is there a better way to achieve this? The group numbering itself is not of concern. I am currently trying this and will post an answer if I figure out an efficient way.

2 Answers

myLotsOfStrings.GroupBy(item => item.Substring(0, item.Length - 1))

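This yields an IEnumerable<IGrouping<string,string>>, where each IGrouping<string,string> is an IEnumerable<string> of the items in that group.

It is built using an ILookup, which iterates the source only once when it is created and builds a dictionary-like structure that allows multiple values per key. It is probably about as efficient as it gets... more like O(N).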
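The lookup idea can also be used directly through ToLookup. The following is only a minimal sketch: the class name LookupSketch and the helper GroupByPrefix are hypothetical, and it assumes every input string has at least one character (an empty string would make Substring throw).

```csharp
using System;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: the key is the whole string minus its last character.
    // ToLookup walks the source once and stores every item under its key.
    public static ILookup<string, string> GroupByPrefix(string[] items)
    {
        return items.ToLookup(item => item.Substring(0, item.Length - 1));
    }

    // Prints each key followed by the items that share it.
    public static void Demo()
    {
        var items = new[] { "ABC-pqr-cv3-xa", "LKJ-eqq-cb2-ya", "ABC-pqr-cv3-xb", "LKJ-eqq-cb2-za" };
        foreach (var group in GroupByPrefix(items))
        {
            Console.WriteLine("{0}: {1}", group.Key, string.Join(", ", group));
        }
    }
}
```

Unlike GroupBy, which is deferred, ToLookup builds the structure immediately, and the result can be indexed by key afterwards.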

Given the constraints you list in the comments below, you will probably need a regular expression to trim your group keys.

The regex:

(^.*-(?=\d+$))|(^.*-[^-]*(?=[^-]$))

will match POI-qqq-aaa- for POI-qqq-aaa-123, and POI-qqq-aaa-xv for POI-qqq-aaa-xva.

So putting it all together...
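To check the two cases in isolation, the regex can be wrapped in a small helper. This is a sketch only: KeySketch and GetKey are hypothetical names, and returning null for a non-match is an assumption made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeySketch
{
    // Same pattern as above: either everything up to the last hyphen when only
    // digits follow it, or everything except the final character.
    private static readonly Regex KeyRegex =
        new Regex(@"(^.*-(?=\d+$))|(^.*-[^-]*(?=[^-]$))");

    // Hypothetical helper: returns the group key, or null when nothing matches
    // (for example, a string with no hyphen at all).
    public static string GetKey(string item)
    {
        var match = KeyRegex.Match(item);
        return match.Success ? match.Value : null;
    }
}
```

With this helper, GetKey("POI-qqq-aaa-123") gives POI-qqq-aaa- and GetKey("POI-qqq-aaa-xva") gives POI-qqq-aaa-xv, so a numeric suffix of any length collapses to one key while a letter suffix only loses its last character.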

// requires: using System.Linq; using System.Text.RegularExpressions;
var regex = new Regex(@"(^.*-(?=\d+$))|(^.*-[^-]*(?=[^-]$))");
var groups = myLotsOfStrings
    // for each item, create an anonymous object with 2 props: the original item
    // and the Match returned by running the regex over the item
    .Select(item => new { item, match = regex.Match(item) })
    // filter out any items where the regex failed to match
    // (i.e. match.Success is false)
    .Where(x => x.match.Success)
    // now create an IEnumerable<IGrouping<string,string>>
    // where the value of the (successful) match is the key (match.Value)
    // and each element of the group is the item property of the anonymous
    // object created above
    .GroupBy(x => x.match.Value, x => x.item);

That seems to do the trick.

Answered 2013-11-01T03:20:45.350

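I really like the brevity of spender's answer, but I thought I'd add something a bit longer using the more SQL-like LINQ query syntax (since that is what you are familiar with). Most of it is setup and output :-)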

        var d =
@"ABC-pqr-cv3-xa
LKJ-eqq-cb2-ya
POI-qqq-aaa-1
ABC-pqr-cv3-xb
UIO-qqq-xa
LKJ-eqq-cb2-za
POI-qqq-aaa-2
UIO-qqq-xb
LKJ-eqq-cb2-yb
POI-qqq-aaa-3";
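        // Split on both "\r\n" and "\n" so the result does not depend on how the
        // source file's line endings compare with Environment.NewLine.
        var lines = d.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);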
        var grp = from line in lines
                  group line by line.Substring(0, line.Length - 1) into g
                  select g;
        int i = 1;
        foreach (var g in grp) {
            Console.WriteLine(i++);
            foreach (var s in g) {
                Console.WriteLine("\t{0}", s);
                }
            }
Answered 2013-11-01T03:41:49.617