2

这是我在 List 中随机排序第一个 N 元素的代码:

int upper = 1;
if (myList.Count > 1)
{
    Random r = new Random();
    upper = Math.Min(maxNumberOfPackets, myList.Count);
    for (int i = 0; i < upper; i++)
    {
        int randInd = r.Next(i, myList.Count);
        var temp = myList[i];
        myList[i] = myList[randInd];
        myList[randInd] = temp;
    }
}

好吧,现在我有“必要性”只使用 Enumerable(出于某种原因;所以我不想将它转换为 Enumerable)。

据您所知,我可以对 Enumerable 做同样的事情吗?我认为痛苦是在 X 位置访问元素......

我很好奇...

4

6 回答 6

6

对于IEnumerable[<T>]源,您通常应该只假设它们可以被读取一次,当然在您使用它之前您无法可靠地知道长度。随机化它通常需要缓冲。你也可以这样做:

var list = source.ToList();

并且只需使用您现有的列表随机代码。IEnumerable[<T>]如果您想保持输入和输出相似,您当然可以再次返回该列表。

如果你真的想,你可以从头开始:

static IEnumerable<T> ScrambleStart(this IEnumerable<T> source,
        int numberToScramble)
{
  using(var iter = source.GetEnumerator()) {
    List<T> list = new List<T>();
    while(iter.MoveNext()) {
        list.Add(iter.Current);
        if(list.Count == numberToScramble) break;
    }
    ScrambleList(list); // your existing code
    foreach(var item in list) yield return item;
    while(iter.MoveNext()) {
        yield return iter.Current;
    }
  }
}
于 2012-06-11T09:27:55.007 回答
3

IEnumerable表示“仅向前游标”,它可能返回流式传输的数据(例如来自数据库)。因此,为了能够订购一个可枚举(无论它是否是随机的),确实意味着您需要缓存它,仅仅是因为您需要拥有所有值才能确定排序。也就是说,您可以Enumerable.OrderBy为此使用运算符,但同样,它会在幕后为您进行缓存。

var r = new Random();

IEnumerable sorted = myList.OrderBy(i => r.Next());
于 2012-06-11T09:28:38.707 回答
2

您应该使用Reservoir sampling

于 2012-06-11T09:35:10.963 回答
1

下面的实现将从原始输入产生一个(伪)随机序列。(除了 Random 是伪随机之外)伪性是基于猜测序列的长度,这在大多数情况下当然是错误的。对于长度小于 maxJump 长度的序列,它将是随机的。如果事实证明元素的数量是 maxJump 值的一半或更多,它将增加以允许更高程度的随机性。

public IEnumerable<T> Randomize(this IEnumerable<T> source,int maxJump = 1000){
    var cache = new List<T>();
    var r = new Random();
    var enumerator = source.GetEnumerator();
    var totalCount = 1;
    while(enumerator.MoveNext()){
       var next = r.Next(0,maxJump);
       if(next < cache.Count){
          //the element we are searching for is in the cache
          yield return cache[next];
          cache.RemoveAt(next);
       } else {
         next = next - cache.Count;
         do{
          totalCount++;
          if(next == 0){
             //we've found the next element so yield
             yield return enumerator.Current;
          } else {
             //not the element we are looking for so cache
             cache.Insert(r.Next(0,cache.Count),enumerator.Current);
          }
          --next;
        }while(next > 0 && enumerator.MoveNext());
        //if we've reached half of the maxJump length
        //increase the max to allow for higher randomness
        if("*totalCount == maxJump){
            maxJump *= 2;
        }
      }
    }
    //Yield the elements in the cache
    //they are already randomized
    foreach(var elem in cache){
          yield return elem;
    }
}
于 2012-06-11T10:01:55.380 回答
1

您总是必须以IEnumerable某种方式消耗您的随机化它,因此您可以调用.ToList()它并使用您的排序方法。

考虑一个IEnumerable可以有无限数量的项目的事实。

于 2012-06-11T09:30:03.560 回答
0

这可行,但对于要从中随机选择相对少量元素的大型序列来说效率极低(因为它遍历源序列的每个元素)。

/// <summary>Randomly selects items from a sequence.</summary>
/// <typeparam name="T">The type of the items in the sequence.</typeparam>
/// <param name="sequence">The sequence from which to randomly select items.</param>
/// <param name="count">The number of items to randomly select from the sequence.</param>
/// <param name="sequenceLength">The number of items in the sequence among which to randomly select.</param>
/// <param name="rng">The random number generator to use.</param>
/// <returns>A sequence of randomly selected items.</returns>
/// <remarks>This is an O(N) algorithm (N is the sequence length).</remarks>

public static IEnumerable<T> RandomlySelectedItems<T>(IEnumerable<T> sequence, int count, int sequenceLength, System.Random rng)
{
    if (sequence == null)
    {
        throw new ArgumentNullException("sequence");
    }

    if (count < 0 || count > sequenceLength)
    {
        throw new ArgumentOutOfRangeException("count", count, "count must be between 0 and sequenceLength");
    }

    if (rng == null)
    {
        throw new ArgumentNullException("rng");
    }

    int available = sequenceLength;
    int remaining = count;
    var iterator  = sequence.GetEnumerator();

    for (int current = 0; current < sequenceLength; ++current)
    {
        iterator.MoveNext();

        if (rng.NextDouble() < remaining/(double)available)
        {
            yield return iterator.Current;
            --remaining;
        }

        --available;
    }
}
于 2012-06-11T09:45:35.890 回答