25

我有一个数字列表,例如 21,4,7,9,12,22,17,8,2,20,23

我希望能够挑选出序列号的序列(长度至少为 3 个项目),因此从上面的示例中它将是 7、8、9 和 20、21、22、23。

我玩过一些丑陋的杂乱无章的功能,但我想知道是否有一种简洁的 LINQ 方式来做到这一点。

有什么建议么?

更新:

非常感谢所有的回复,非常感谢。我目前正在和他们一起玩,看看哪个最适合我们的项目。

4

12 回答 12

28

令我震惊的是,您应该做的第一件事就是订购清单。然后只需遍历它,记住当前序列的长度并检测它何时结束。老实说,我怀疑一个简单的 foreach 循环将是最简单的方法——我无法立即想到任何非常简洁的类似 LINQ 的方法。如果您真的愿意,您当然可以在迭代器块中执行此操作,但请记住,从列表开始排序意味着无论如何您都有合理的“前期”成本。所以我的解决方案看起来像这样:

var ordered = list.OrderBy(x => x);
int count = 0;
int firstItem = 0; // Irrelevant to start with
foreach (int x in ordered)
{
    // First value in the ordered list: start of a sequence
    if (count == 0)
    {
        firstItem = x;
        count = 1;
    }
    // Skip duplicate values
    else if (x == firstItem + count - 1)
    {
        // No need to do anything
    }
    // New value contributes to sequence
    else if (x == firstItem + count)
    {
        count++;
    }
    // End of one sequence, start of another
    else
    {
        if (count >= 3)
        {
            Console.WriteLine("Found sequence of length {0} starting at {1}",
                              count, firstItem);
        }
        count = 1;
        firstItem = x;
    }
}
if (count >= 3)
{
    Console.WriteLine("Found sequence of length {0} starting at {1}",
                      count, firstItem);
}

编辑:好的,我刚刚想到了一种更类似于 LINQ 的做事方式。我现在没有时间完全实现它,但是:

  • 排序顺序
  • 使用类似SelectWithPrevious(可能更好命名SelectConsecutive)来获取连续的元素对
  • 使用包含索引的 Select 的重载来获取 (index, current, previous) 的元组
  • 过滤掉 (current = previous + 1) 的任何项目,以获得任何算作序列开头的项目(特殊情况下 index=0)
  • 在结果上使用SelectWithPrevious以获取两个起点之间的序列长度(从前一个索引中减去一个索引)
  • 过滤掉任何长度小于3的序列

怀疑您需要连接int.MinValue有序序列,以确保正确使用最终项目。

编辑:好的,我已经实现了这个。这是我能想到的 LINQiest 方式......我使用空值作为“哨兵”值来强制开始和结束序列 - 有关更多详细信息,请参阅评论。

总的来说,我不会推荐这个解决方案。很难理解,虽然我有理由相信它是正确的,但我花了一段时间才想到可能的错误等等。这是一次有趣的旅程,你可以用 LINQ 做什么......而且你可能不应该这样做。

哦,请注意,我已经将“最小长度为 3”的部分推给了调用者——当你有这样的元组序列时,单独过滤它会更干净,IMO。

using System;
using System.Collections.Generic;
using System.Linq;

static class Extensions
{
    public static IEnumerable<TResult> SelectConsecutive<TSource, TResult>
        (this IEnumerable<TSource> source,
         Func<TSource, TSource, TResult> selector)
    {
        using (IEnumerator<TSource> iterator = source.GetEnumerator())
        {
           if (!iterator.MoveNext())
           {
               yield break;
           }
           TSource prev = iterator.Current;
           while (iterator.MoveNext())
           {
               TSource current = iterator.Current;
               yield return selector(prev, current);
               prev = current;
           }
        }
    }
}

class Test
{
    static void Main()
    {
        var list = new List<int> {  21,4,7,9,12,22,17,8,2,20,23 };

        foreach (var sequence in FindSequences(list).Where(x => x.Item1 >= 3))
        {
            Console.WriteLine("Found sequence of length {0} starting at {1}",
                              sequence.Item1, sequence.Item2);
        }
    }

    private static readonly int?[] End = { null };

    // Each tuple in the returned sequence is (length, first element)
    public static IEnumerable<Tuple<int, int>> FindSequences
         (IEnumerable<int> input)
    {
        // Use null values at the start and end of the ordered sequence
        // so that the first pair always starts a new sequence starting
        // with the lowest actual element, and the final pair always
        // starts a new one starting with null. That "sequence at the end"
        // is used to compute the length of the *real* final element.
        return End.Concat(input.OrderBy(x => x)
                               .Select(x => (int?) x))
                  .Concat(End)
                  // Work out consecutive pairs of items
                  .SelectConsecutive((x, y) => Tuple.Create(x, y))
                  // Remove duplicates
                  .Where(z => z.Item1 != z.Item2)
                  // Keep the index so we can tell sequence length
                  .Select((z, index) => new { z, index })
                  // Find sequence starting points
                  .Where(both => both.z.Item2 != both.z.Item1 + 1)
                  .SelectConsecutive((start1, start2) => 
                       Tuple.Create(start2.index - start1.index, 
                                    start1.z.Item2.Value));
    }
}
于 2010-10-02T06:20:26.737 回答
14

Jon Skeet / Timwi 的解决方案是可行的方法。

为了好玩,这是一个完成这项工作的 LINQ 查询(非常低效):

var sequences = input.Distinct()
                     .GroupBy(num => Enumerable.Range(num, int.MaxValue - num + 1)
                                               .TakeWhile(input.Contains)
                                               .Last())  //use the last member of the consecutive sequence as the key
                     .Where(seq => seq.Count() >= 3)
                     .Select(seq => seq.OrderBy(num => num)); // not necessary unless ordering is desirable inside each sequence.

通过将输入加载到 aHashSet中(以提高Contains)可以稍微提高查询的性能,但这仍然不会产生接近有效的解决方案。

我知道的唯一错误是如果序列包含大量负数(我们无法表示 的count参数Range),则可能发生算术溢出。static IEnumerable<int> To(this int start, int end)使用自定义扩展方法很容易解决这个问题。如果有人能想到任何其他简单的躲避溢出的技术,请告诉我。

编辑:这是一个稍微冗长(但同样低效)的变体,没有溢出问题。

var sequences = input.GroupBy(num => input.Where(candidate => candidate >= num)
                                          .OrderBy(candidate => candidate)
                                          .TakeWhile((candidate, index) => candidate == num + index)
                                          .Last())
                     .Where(seq => seq.Count() >= 3)
                     .Select(seq => seq.OrderBy(num => num));
于 2010-10-02T10:17:56.057 回答
4

我认为我的解决方案更加优雅和简单,因此更容易验证是否正确:

/// <summary>Returns a collection containing all consecutive sequences of
/// integers in the input collection.</summary>
/// <param name="input">The collection of integers in which to find
/// consecutive sequences.</param>
/// <param name="minLength">Minimum length that a sequence should have
/// to be returned.</param>
static IEnumerable<IEnumerable<int>> ConsecutiveSequences(
    IEnumerable<int> input, int minLength = 1)
{
    var results = new List<List<int>>();
    foreach (var i in input.OrderBy(x => x))
    {
        var existing = results.FirstOrDefault(lst => lst.Last() + 1 == i);
        if (existing == null)
            results.Add(new List<int> { i });
        else
            existing.Add(i);
    }
    return minLength <= 1 ? results :
        results.Where(lst => lst.Count >= minLength);
}

与其他解决方案相比的优势:

  • 它可以找到重叠的序列。
  • 它可以正确地重复使用并记录在案。
  • 我没有发现任何错误 ;-)
于 2010-10-02T09:49:44.593 回答
2

以下是如何以“LINQish”方式解决问题:

int[] arr = new int[]{ 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
IOrderedEnumerable<int> sorted = arr.OrderBy(x => x);
int cnt = sorted.Count();
int[] sortedArr = sorted.ToArray();
IEnumerable<int> selected = sortedArr.Where((x, idx) =>
    idx <= cnt - 3 && sortedArr[idx + 1] == x + 1 && sortedArr[idx + 2] == x + 2);
IEnumerable<int> result = selected.SelectMany(x => new int[] { x, x + 1, x + 2 }).Distinct();

Console.WriteLine(string.Join(",", result.Select(x=>x.ToString()).ToArray()));

由于数组复制和重建,这个解决方案 - 当然 - 不如带有循环的传统解决方案有效。

于 2010-10-02T07:18:02.610 回答
1

不是 100% Linq,但这是一个通用变体:

static IEnumerable<IEnumerable<TItem>> GetSequences<TItem>(
        int minSequenceLength, 
        Func<TItem, TItem, bool> areSequential, 
        IEnumerable<TItem> items)
    where TItem : IComparable<TItem>
{
    items = items
        .OrderBy(n => n)
        .Distinct().ToArray();

    var lastSelected = default(TItem);

    var sequences =
        from startItem in items
        where startItem.Equals(items.First())
            || startItem.CompareTo(lastSelected) > 0
        let sequence =
            from item in items
            where item.Equals(startItem) || areSequential(lastSelected, item)
            select (lastSelected = item)
        where sequence.Count() >= minSequenceLength
        select sequence;

    return sequences;
}

static void UsageInt()
{
    var sequences = GetSequences(
            3,
            (a, b) => a + 1 == b,
            new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 });

    foreach (var sequence in sequences)
        Console.WriteLine(string.Join(", ", sequence.ToArray()));
}

static void UsageChar()
{
    var list = new List<char>(
        "abcdefghijklmnopqrstuvwxyz".ToCharArray());

    var sequences = GetSequences(
            3,
            (a, b) => (list.IndexOf(a) + 1 == list.IndexOf(b)),
            "PleaseBeGentleWithMe".ToLower().ToCharArray());

    foreach (var sequence in sequences)
        Console.WriteLine(string.Join(", ", sequence.ToArray()));
}
于 2010-10-02T17:49:38.700 回答
1

如何对数组进行排序然后创建另一个数组,该数组是每个元素与前一个元素之间的差异


sortedArray = 8, 9, 10, 21, 22, 23, 24, 27, 30, 31, 32
diffArray   =    1,  1, 11,  1,  1,  1,  3,  3,  1,  1
现在遍历差异数组;如果差等于 1,则将变量 sequenceLength 的计数增加 1。如果差 > 1,请检查 sequenceLength,如果它 >=2,那么您有一个至少包含 3 个连续元素的序列。然后将 sequenceLenght 重置为 0 并在差异数组上继续循环。

于 2010-10-02T18:44:41.613 回答
1

这是我的镜头:

public static class SequenceDetector
{
    public static IEnumerable<IEnumerable<T>> DetectSequenceWhere<T>(this IEnumerable<T> sequence, Func<T, T, bool> inSequenceSelector)
    {
        List<T> subsequence = null;
        // We can only have a sequence with 2 or more items
        T last = sequence.FirstOrDefault();
        foreach (var item in sequence.Skip(1))
        {
            if (inSequenceSelector(last, item))
            {
                // These form part of a sequence
                if (subsequence == null)
                {
                    subsequence = new List<T>();
                    subsequence.Add(last);
                }
                subsequence.Add(item);
            }
            else if (subsequence != null)
            {
                // We have a previous seq to return
                yield return subsequence;
                subsequence = null;
            }
            last = item;
        }
        if (subsequence != null)
        {
            // Return any trailing seq
            yield return subsequence;
        }
    }
}

public class test
{
    public static void run()
    {
        var list = new List<int> { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
        foreach (var subsequence in list
            .OrderBy(i => i)
            .Distinct()
            .DetectSequenceWhere((first, second) => first + 1 == second)
            .Where(seq => seq.Count() >= 3))
        {
            Console.WriteLine("Found subsequence {0}", 
                string.Join(", ", subsequence.Select(i => i.ToString()).ToArray()));
        }
    }
}

这将返回形成子序列的特定项目,并允许任何类型的项目和任何标准定义,只要它可以通过比较相邻项目来确定。

于 2010-10-03T00:04:45.437 回答
0

这是我在 F# 中找到的解决方案,将其转换为 C# LINQ 查询应该相当容易,因为 fold 几乎等同于 LINQ 聚合运算符。

#light

let nums = [21;4;7;9;12;22;17;8;2;20;23]

let scanFunc (mainSeqLength, mainCounter, lastNum:int, subSequenceCounter:int, subSequence:'a list, foundSequences:'a list list) (num:'a) =
        (mainSeqLength, mainCounter + 1,
         num,
         (if num <> lastNum + 1 then 1 else subSequenceCounter+1),
         (if num <> lastNum + 1 then [num] else subSequence@[num]),
         if subSequenceCounter >= 3 then
            if mainSeqLength = mainCounter+1
                then foundSequences @ [subSequence@[num]]
            elif num <> lastNum + 1
                then foundSequences @ [subSequence]
            else foundSequences
         else foundSequences)

let subSequences = nums |> Seq.sort |> Seq.fold scanFunc (nums |> Seq.length, 0, 0, 0, [], []) |> fun (_,_,_,_,_,results) -> results
于 2010-10-02T19:46:43.193 回答
0

Linq 并不是万能的解决方案,有时你最好使用一个简单的循环。这是一个解决方案,只需一点 Linq 即可对原始序列进行排序并过滤结果

void Main()
{
    var numbers = new[] { 21,4,7,9,12,22,17,8,2,20,23 };
    var sequences =
        GetSequences(numbers, (prev, curr) => curr == prev + 1);
        .Where(s => s.Count() >= 3);
    sequences.Dump();
}

public static IEnumerable<IEnumerable<T>> GetSequences<T>(
    IEnumerable<T> source,
    Func<T, T, bool> areConsecutive)
{
    bool first = true;
    T prev = default(T);
    List<T> seq = new List<T>();
    foreach (var i in source.OrderBy(i => i))
    {
        if (!first && !areConsecutive(prev, i))
        {
            yield return seq.ToArray();
            seq.Clear();
        }
        first = false;
        seq.Add(i);
        prev = i;
    }
    if (seq.Any())
        yield return seq.ToArray();
}
于 2010-10-02T19:49:39.210 回答
0

我和Jon想到了同样的事情:要表示一系列连续整数,你真正需要的只是两个微不足道的整数!所以我会从那里开始:

struct Range : IEnumerable<int>
{
    readonly int _start;
    readonly int _count;

    public Range(int start, int count)
    {
        _start = start;
        _count = count;
    }

    public int Start
    {
        get { return _start; }
    }

    public int Count
    {
        get { return _count; }
    }

    public int End
    {
        get { return _start + _count - 1; }
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < _count; ++i)
        {
            yield return _start + i;
        }
    }

    // Heck, why not?
    public static Range operator +(Range x, int y)
    {
        return new Range(x.Start, x.Count + y);
    }

    // skipping the explicit IEnumerable.GetEnumerator implementation
}

Range从那里,您可以编写一个静态方法来返回与序列的连续数字相对应的一堆这些值。

static IEnumerable<Range> FindRanges(IEnumerable<int> source, int minCount)
{
    // throw exceptions on invalid arguments, maybe...

    var ordered = source.OrderBy(x => x);

    Range r = default(Range);

    foreach (int value in ordered)
    {
        // In "real" code I would've overridden the Equals method
        // and overloaded the == operator to write something like
        // if (r == Range.Empty) here... but this works well enough
        // for now, since the only time r.Count will be 0 is on the
        // first item.
        if (r.Count == 0)
        {
            r = new Range(value, 1);
            continue;
        }

        if (value == r.End)
        {
            // skip duplicates
            continue;
        }
        else if (value == r.End + 1)
        {
            // "append" consecutive values to the range
            r += 1;
        }
        else
        {
            // return what we've got so far
            if (r.Count >= minCount)
            {
                yield return r;
            }

            // start over
            r = new Range(value, 1);
        }
    }

    // return whatever we ended up with
    if (r.Count >= minCount)
    {
        yield return r;
    }
}

演示:

int[] numbers = new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };

foreach (Range r in FindConsecutiveRanges(numbers, 3))
{
    // Using .NET 3.5 here, don't have the much nicer string.Join overloads.
    Console.WriteLine(string.Join(", ", r.Select(x => x.ToString()).ToArray()));
}

输出:

7、8、9
20、21、22、23
于 2010-10-02T20:03:03.107 回答
0

这是我的 LINQ-y 对这个问题的看法:

static IEnumerable<IEnumerable<int>>
    ConsecutiveSequences(this IEnumerable<int> input, int minLength = 3)
{
    int order = 0;
    var inorder = new SortedSet<int>(input);
    return from item in new[] { new { order = 0, val = inorder.First() } }
               .Concat(
                 inorder.Zip(inorder.Skip(1), (x, val) =>
                         new { order = x + 1 == val ? order : ++order, val }))
           group item.val by item.order into list
           where list.Count() >= minLength
           select list;
}
  • 不使用显式循环,但仍应为 O(n lg n)
  • 使用SortedSet而不是.OrderBy().Distinct()
  • 将连续元素与list.Zip(list.Skip(1))
于 2010-10-02T20:57:24.107 回答
0

这是一个使用字典而不是排序的解决方案......它将项目添加到字典中,然后为每个值递增上下以找到最长的序列。
它不是严格意义上的 LINQ,尽管它确实使用了一些 LINQ 函数,而且我认为它比纯 LINQ 解决方案更具可读性。

static void Main(string[] args)
    {
        var items = new[] { -1, 0, 1, 21, -2, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
        IEnumerable<IEnumerable<int>> sequences = FindSequences(items, 3);

        foreach (var sequence in sequences)
        {   //print results to consol
            Console.Out.WriteLine(sequence.Select(num => num.ToString()).Aggregate((a, b) => a + "," + b));
        }
        Console.ReadLine();
    }

    private static IEnumerable<IEnumerable<int>> FindSequences(IEnumerable<int> items, int minSequenceLength)
    {
        //Convert item list to dictionary
        var itemDict = new Dictionary<int, int>();
        foreach (int val in items)
        {
            itemDict[val] = val;
        }
        var allSequences = new List<List<int>>();
        //for each val in items, find longest sequence including that value
        foreach (var item in items)
        {
            var sequence = FindLongestSequenceIncludingValue(itemDict, item);
            allSequences.Add(sequence);
            //remove items from dict to prevent duplicate sequences
            sequence.ForEach(i => itemDict.Remove(i));
        }
        //return only sequences longer than 3
        return allSequences.Where(sequence => sequence.Count >= minSequenceLength).ToList();
    }

    //Find sequence around start param value
    private static List<int> FindLongestSequenceIncludingValue(Dictionary<int, int> itemDict, int value)
    {
        var result = new List<int>();
        //check if num exists in dictionary
        if (!itemDict.ContainsKey(value))
            return result;

        //initialize sequence list
        result.Add(value);

        //find values greater than starting value
        //and add to end of sequence
        var indexUp = value + 1;
        while (itemDict.ContainsKey(indexUp))
        {
            result.Add(itemDict[indexUp]);
            indexUp++;
        }

        //find values lower than starting value 
        //and add to start of sequence
        var indexDown = value - 1;
        while (itemDict.ContainsKey(indexDown))
        {
            result.Insert(0, itemDict[indexDown]);
            indexDown--;
        }
        return result;
    }
于 2010-10-02T21:01:17.717 回答