我已经开始尝试创建以下内容:
public static IEnumerable<List<T>> OptimizedBatches<T>(this IEnumerable<T> items)
那么这个扩展方法的客户端会像这样使用它:
foreach (var list in extracter.EnumerateAll().OptimizedBatches())
{
// at some unknown batch size, process time starts to
// increase at an exponential rate
}
这是一个例子:
batch length time
1 100ms
2 102ms
4 110ms
8 111ms
16 118ms
32 119ms
64 134ms
128 500ms <-- doubled length but time it took more than doubled
256 1100ms <-- oh no!!
从上面可以看出,最好的批长度是 64,因为 64/134 是最好的长度/时间比。
所以问题是使用什么算法来根据迭代器步骤之间的连续时间自动选择最佳批次长度?
这是我到目前为止所拥有的 - 它还没有完成......
class LengthOptimizer
{
private Stopwatch sw;
private int length = 1;
private List<RateRecord> rateRecords = new List<RateRecord>();
public int Length
{
get
{
if (sw == null)
{
length = 1;
sw = new Stopwatch();
}
else
{
sw.Stop();
rateRecords.Add(new RateRecord { Length = length, ElapsedMilliseconds = sw.ElapsedMilliseconds });
length = rateRecords.OrderByDescending(c => c.Rate).First().Length;
}
sw.Start();
return length;
}
}
}
struct RateRecord
{
public int Length { get; set; }
public long ElapsedMilliseconds { get; set; }
public float Rate { get { return ((float)Length) / ElapsedMilliseconds; } }
}