让我们做一些适当的基准测试。这是一些代码,您的 shuffle 重命名为shuffle1
,而我个人最喜欢的变体为shuffle2
.
import System.Random
import Control.Monad
import Control.Monad.ST.Strict
import Data.STRef.Strict
import Data.Vector.Mutable
import Prelude as P
import Criterion.Main
shuffle1 :: RandomGen g => [a] -> g -> ([a], g)
shuffle1 [] g0 = ([],g0)
shuffle1 [x] g0 = ([x],g0)
shuffle1 xs g0 = (x:newtail,g2)
where (i,g1) = randomR (0, P.length $ P.tail xs) g0
(xs1,x:xs2) = P.splitAt i xs
(newtail,g2) = shuffle1 (xs1++xs2) g1
shuffle2 :: RandomGen g => [a] -> g -> ([a], g)
shuffle2 xs g0 = runST $ do
let l = P.length xs
v <- new l
sequence_ $ zipWith (unsafeWrite v) [0..] xs
let loop g i | i <= 1 = return g
| otherwise = do
let i' = i - 1
(j, g') = randomR (0, i') g
unsafeSwap v i' j
loop g' i'
gFinal <- loop g0 l
shuffled <- mapM (unsafeRead v) [0 .. l - 1]
return (shuffled, gFinal)
main = do
let s1 x = fst $ shuffle1 x g0
s2 x = fst $ shuffle2 x g0
arr = [0..1000] :: [Int]
g0 = mkStdGen 0
-- make sure these values are evaluated before the benchmark starts
print (g0, arr)
defaultMain [bench "shuffle1" $ nf s1 arr, bench "shuffle2" $ nf s2 arr]
所以,让我们看看一些结果:
carl@ubuntu:~/hask$ ghc -O2 shuffle.hs
[1 of 1] Compiling Main ( shuffle.hs, shuffle.o )
Linking shuffle ...
carl@ubuntu:~/hask$ ./shuffle
(1 1,[0, .. <redacted for brevity>])
warming up
estimating clock resolution...
mean is 5.762060 us (160001 iterations)
found 4887 outliers among 159999 samples (3.1%)
4751 (3.0%) high severe
estimating cost of a clock call...
mean is 42.13314 ns (43 iterations)
benchmarking shuffle1
mean: 10.95922 ms, lb 10.92317 ms, ub 10.99903 ms, ci 0.950
std dev: 193.8795 us, lb 168.6842 us, ub 244.6648 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 10.396%
variance is moderately inflated by outliers
benchmarking shuffle2
mean: 256.9394 us, lb 255.5414 us, ub 258.7409 us, ci 0.950
std dev: 8.042766 us, lb 6.460785 us, ub 12.28447 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
1 (1.0%) high severe
variance introduced by outliers: 26.750%
variance is moderately inflated by outliers
好的,我的系统真的很吵,不应该用于对具有相似数字的事物进行严肃的基准测试。但这在这里几乎不重要。shuffle2
比包含 1001 个元素的列表快大约 40 倍shuffle1
。由于 O(n) 和 O(n^2) 之间的差异,这只会随着更大的列表而增加。我敢肯定,无论您的测试代码是什么时间,它都不是随机播放算法。
其实,我有一个猜测。您的版本足够懒惰,可以逐步返回结果。5 秒是获得前几个结果的合理时间段,如果您在调用生成器后从未接触过它。也许这就是你的时间正在发生的事情。