如果您事先知道要从中随机选择的项目总数,则无需先转换为列表即可。
以下方法将为您完成:
/// <summary>Randomly selects items from a sequence.</summary>
/// <typeparam name="T">The type of the items in the sequence.</typeparam>
/// <param name="sequence">The sequence from which to randomly select items.</param>
/// <param name="count">The number of items to randomly select from the sequence.</param>
/// <param name="sequenceLength">The number of items in the sequence among which to randomly select.</param>
/// <param name="rng">The random number generator to use.</param>
/// <returns>A sequence of randomly selected items.</returns>
/// <remarks>This is an O(N) algorithm (N is the sequence length).</remarks>
public static IEnumerable<T> RandomlySelectedItems<T>(IEnumerable<T> sequence, int count, int sequenceLength, System.Random rng)
{
if (sequence == null)
{
throw new ArgumentNullException("sequence");
}
if (count < 0 || count > sequenceLength)
{
throw new ArgumentOutOfRangeException("count", count, "count must be between 0 and sequenceLength");
}
if (rng == null)
{
throw new ArgumentNullException("rng");
}
int available = sequenceLength;
int remaining = count;
var iterator = sequence.GetEnumerator();
for (int current = 0; current < sequenceLength; ++current)
{
iterator.MoveNext();
if (rng.NextDouble() < remaining/(double)available)
{
yield return iterator.Current;
--remaining;
}
--available;
}
}
(这里的关键是在开始时需要知道可供选择的项目数量;这确实会在一定程度上降低实用性。但是如果快速获取计数并且缓冲所有项目会占用太多内存,这是一个有用的解决方案.)
这是另一种使用水库采样的方法
这种方法不需要知道可供选择的项目总数,但它确实需要缓冲输出。当然,它还需要枚举整个输入集合。
因此,这仅在您事先不知道可供选择的项目数量(或可供选择的项目数量非常多)时才有用。
我建议按照 Henk 的回答对列表进行改组,而不是这样做,但为了感兴趣,我将其包括在此处:
// n is the number of items to randomly choose.
public static List<T> RandomlyChooseItems<T>(IEnumerable<T> items, int n, Random rng)
{
var result = new List<T>(n);
int index = 0;
foreach (var item in items)
{
if (index < n)
{
result.Add(item);
}
else
{
int r = rng.Next(0, index + 1);
if (r < n)
result[r] = item;
}
++index;
}
return result;
}
作为 Henk 答案的附录,这是他提到的 Shuffle 算法的规范实现。在此,_rng
是一个实例Random
:
/// <summary>Shuffles the specified array.</summary>
/// <typeparam name="T">The type of the array elements.</typeparam>
/// <param name="array">The array to shuffle.</param>
public void Shuffle<T>(IList<T> array)
{
for (int n = array.Count; n > 1;)
{
int k = _rng.Next(n);
--n;
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}