4

我的应用程序涉及繁重的数组操作(例如log(1)索引),因此Data.VectorData.Vector.Unboxed优于Data.List。它还涉及许多集合操作(​​例如 intersectBy),但是 Data.Vector 不提供这些操作。

这些功能中的每一个都可以像在 Data.List 中一样在 3-4 行中实现。有什么理由他们都没有用 Data.Vector 实现吗?我只能推测。也许出于性能原因不鼓励在 Data.Vector 中设置操作,即 intersectBy 会首先通过列表理解产生交集,然后将列表转换为 Data.Vector?

4

1 回答 1

7

I assume it's missing because intersection of unsorted, immutable arrays must have a worst-case run time of Ω(n*m) without using additional space and Data.Vector is optimized for performance. If you want, you can write that function yourself, though:

import Data.Vector as V

intersect :: Eq a => V.Vector a -> V.Vector a -> V.Vector a
intersect x = V.filter (`V.elem` x)

Or by using a temporary set data structure to achieve an expected O(n + m) complexity:

import Data.HashSet as HS

intersect :: (Hashable a, Eq a) => V.Vector a -> V.Vector a -> V.Vector a
intersect x = V.filter (`HS.member` set)
    where set = HS.fromList $ V.toList x

If you can afford the extra memory usage, maybe you can use some kind of aggregate type for your data, for example an array for fast random access and a hash trie like Data.HashSet for fast membership checks and always keep both containers up to date. That way you can reduce the asymptotic complexity for intersection to something like O(min(n, m))

于 2013-03-29T19:05:18.243 回答