这是sclv提出的两遍解决方案的实现。
import qualified Data.ByteString.Lazy as L
import Data.Binary.Put
import Data.Word
import Data.List (foldl')
data Tree = Node [Tree] | Leaf [Word32] deriving Show
makeTree 0 = Leaf $ replicate 100 0xdeadbeef
makeTree n = Node $ replicate 4 $ makeTree $ n-1
SizeTree 模仿原始树,它不包含数据,但在每个节点它存储相应子树的大小。
我们需要在内存中拥有 SizeTree,因此值得让它更紧凑(例如用 uboxed 单词替换 Ints)。
data SizeTree
= SNode {sz :: Int, chld :: [SizeTree]}
| SLeaf {sz :: Int}
deriving Show
在内存中使用 SizeTree 可以以流方式序列化原始 Tree。
putTree :: Tree -> SizeTree -> Put
putTree (Node xs) (SNode _ ys) = do
putWord8 $ fromIntegral $ length xs -- number of children
mapM_ (putWord32be . fromIntegral . sz) ys -- sizes of children
sequence_ [putTree x y | (x,y) <- zip xs ys] -- children data
putTree (Leaf xs) _ = do
putWord8 0 -- zero means 'leaf'
putWord32be $ fromIntegral $ length xs -- data length
mapM_ putWord32be xs -- leaf data
mkSizeTree :: Tree -> SizeTree
mkSizeTree (Leaf xs) = SLeaf (1 + 4 + 4 * length xs)
mkSizeTree (Node xs) = SNode (1 + 4 * length xs + sum' (map sz ys)) ys
where
ys = map mkSizeTree xs
sum' = foldl' (+) 0
防止 GHC 将两个通道合并为一个非常重要(在这种情况下,它将把树保存在内存中)。在这里,它不是通过向函数提供树而是树生成器来完成的。
serialize mkTree size = runPut $ putTree (mkTree size) treeSize
where
treeSize = mkSizeTree $ mkTree size
main = L.writeFile "dump.bin" $ serialize makeTree 10