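I'm writing a program in Python that compares sorting algorithms, and I want to measure the average sorting time. The problem is that the first measurement is always off.
This is the timing loop: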
for i in xrange(self.repeats):
    # random list generator
    data_orig = [random.randint(0, self.size - 1) for x in xrange(self.size)]
    sorter = self.class_()
    data = data_orig[:]
    debug("%s for data size: %d, try #%d" % (sorter.__class__.__name__, self.size, i+1))
    t1 = time.clock()
    sorter.sort(data)
    t2 = time.clock()
    debug("Took: %0.4fms, shifts: %d, comparisons: %d" % ((t2-t1)*1000.0, sorter.shifts, sorter.comps))
Here class_ is a reference to the InsertionSort class. With size = 1000 and 5 repeats, I get the following results:
InsertionSort for data size: 1000, try #1
Took: 39.5341ms, shifts: 254340, comparisons: 255331
InsertionSort for data size: 1000, try #2
Took: 6.0765ms, shifts: 250778, comparisons: 251772
InsertionSort for data size: 1000, try #3
Took: 6.9946ms, shifts: 254189, comparisons: 255180
InsertionSort for data size: 1000, try #4
Took: 6.7421ms, shifts: 252162, comparisons: 253156
InsertionSort for data size: 1000, try #5
Took: 5.9584ms, shifts: 241412, comparisons: 242404
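For every sorting algorithm, and every time I run the program, the first result is much larger than the others. I'm running it with PyPy (with regular Python the results look fine, but it is much slower).
I know I could simply omit the first result, but that solution doesn't satisfy me :-)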
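For reference, the workaround I mean would look roughly like the sketch below. It is not my actual code: average_time_ms, sort_fn and warmup are hypothetical names used only for illustration, and timeit.default_timer stands in for time.clock() so the snippet runs on both Python 2 and Python 3.

```python
import random
from timeit import default_timer  # available on both Python 2 and Python 3


def average_time_ms(sort_fn, size, repeats, warmup=1):
    """Time sort_fn on fresh random lists and average every run
    except the first `warmup` ones (hypothetical helper)."""
    timings = []
    for _ in range(warmup + repeats):
        # a new random list for every run, as in the original loop
        data = [random.randint(0, size - 1) for _ in range(size)]
        t1 = default_timer()
        sort_fn(data)
        t2 = default_timer()
        timings.append((t2 - t1) * 1000.0)  # seconds -> milliseconds
    measured = timings[warmup:]  # throw away the warm-up runs
    return sum(measured) / len(measured)


if __name__ == "__main__":
    # built-in sorted() is used here only as a stand-in for sorter.sort
    print("%0.4fms" % average_time_ms(sorted, 1000, 5))
```

This only hides the slow first run from the average; it does not explain where the extra time comes from.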
Any ideas?