在阅读了 Stack Overflow 问题Using vector for performance Improvement in Haskell描述了 Haskell 中的快速就地快速排序后,我为自己设定了两个目标:
实现中位数为 3 的相同算法,以避免在预排序向量上表现不佳;
制作并行版本。
这是结果(为简单起见,留下了一些小片段):
import qualified Data.Vector.Unboxed.Mutable as MV
import qualified Data.Vector.Generic.Mutable as GM
type Vector = MV.IOVector Int
type Sort = Vector -> IO ()
medianofthreepartition :: Vector -> Int -> IO Int
medianofthreepartition uv li = do
p1 <- MV.unsafeRead uv li
p2 <- MV.unsafeRead uv $ li `div` 2
p3 <- MV.unsafeRead uv 0
let p = median p1 p2 p3
GM.unstablePartition (< p) uv
vquicksort :: Sort
vquicksort uv = do
let li = MV.length uv - 1
j <- medianofthreepartition uv li
when (j > 1) (vquicksort (MV.unsafeSlice 0 j uv))
when (j + 1 < li) (vquicksort (MV.unsafeSlice (j+1) (li-j) uv))
vparquicksort :: Sort
vparquicksort uv = do
let li = MV.length uv - 1
j <- medianofthreepartition uv li
t1 <- tryfork (j > 1) (vparquicksort (MV.unsafeSlice 0 j uv))
t2 <- tryfork (j + 1 < li) (vparquicksort (MV.unsafeSlice (j+1) (li-j) uv))
wait t1
wait t2
tryfork :: Bool -> IO () -> IO (Maybe (MVar ()))
tryfork False _ = return Nothing
tryfork True action = do
done <- newEmptyMVar :: IO (MVar ())
_ <- forkFinally action (\_ -> putMVar done ())
return $ Just done
wait :: Maybe (MVar ()) -> IO ()
wait Nothing = return ()
wait (Just done) = swapMVar done ()
median :: Int -> Int -> Int -> Int
median a b c
| a > b =
if b > c then b
else if a > c then c
else a
| otherwise =
if a > c then a
else if b > c then c
else b
对于具有 1,000,000 个元素的向量,我得到以下结果:
"Number of threads: 4"
"**** Parallel ****"
"Testing sort with length: 1000000"
"Creating vector"
"Printing vector"
"Sorting random vector"
CPU time: 12.30 s
"Sorting ordered vector"
CPU time: 9.44 s
"**** Single thread ****"
"Testing sort with length: 1000000"
"Creating vector"
"Printing vector"
"Sorting random vector"
CPU time: 0.27 s
"Sorting ordered vector"
CPU time: 0.39 s
我的问题是:
- 为什么预排序向量的性能仍然下降?
- 为什么使用 forkIO 和四线程无法提高性能?