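In my kdtree project, I just replaced the depth counter: instead of being Int-based, it is now an explicit counter based on Key a, where a is the point type in KDTree v a. This is the diff.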
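To make the change concrete, here is a minimal sketch of what a Key a-based depth counter can look like. It is not the actual code from the repository: the KDCompare class name, the V3X/V3Y/V3Z constructors and the dimDistance signature are taken from the core output and the error message further down, while rootKey, nextKey and the local V3 stand-in (replacing the V3 type from linear) are assumptions made for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Stand-in for linear's V3, so the sketch needs no extra packages.
data V3 a = V3 a a a

class KDCompare a where
  -- Associated data family: every point type gets its own key type.
  data Key a :: *
  -- Key used at the root of the tree (hypothetical name).
  rootKey :: Key a
  -- Key used one level further down (hypothetical name).
  nextKey :: Key a -> Key a
  -- Signed distance between two points along the dimension named by the key.
  dimDistance :: Key a -> a -> a -> Double

instance Real a => KDCompare (V3 a) where
  data Key (V3 a) = V3X | V3Y | V3Z

  rootKey = V3X

  -- Cycle through the three dimensions instead of computing depth `mod` 3.
  nextKey V3X = V3Y
  nextKey V3Y = V3Z
  nextKey V3Z = V3X

  dimDistance k (V3 ax ay az) (V3 bx by bz) = realToFrac $ case k of
    V3X -> ax - bx
    V3Y -> ay - by
    V3Z -> az - bz
```

The idea is that the tree code threads a Key a through the recursion and calls nextKey at every level, where it previously incremented an Int.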
Now, while I think this should be a purely type-level change, my benchmarks show a drastic performance drop:
Before:
benchmarking nr/kdtree_nr
mean: 60.19084 us, lb 59.87414 us, ub 60.57270 us, ci 0.950
std dev: 1.777527 us, lb 1.494657 us, ub 2.120168 us, ci 0.950
After:
benchmarking nr/kdtree_nr
mean: 556.9518 us, lb 554.0586 us, ub 560.6128 us, ci 0.950
std dev: 16.70620 us, lb 13.58185 us, ub 20.63450 us, ci 0.950
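Before I dive into the core... does anybody have an idea what is going on here?
Edit 1
As suggested by Thomas (and userxyz), I replaced data Key a :: * with type Key a :: * and changed the implementation accordingly. This did not have any significant impact on the result: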
benchmarking nr/kdtree_nr
mean: 538.2789 us, lb 537.5128 us, ub 539.4408 us, ci 0.950
std dev: 4.745118 us, lb 3.454081 us, ub 6.969091 us, ci 0.950
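For reference, a minimal sketch of the synonym-family variant from Edit 1. Again this is an illustration rather than the repository code: the KeyV3 name comes from the error message in Edit 3, the local V3 is a stand-in for the one from linear, and the class is reduced to the single dimDistance method.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Stand-in for linear's V3, so the sketch needs no extra packages.
data V3 a = V3 a a a

-- With a synonym family the key is an ordinary top-level data type.
data KeyV3 = V3X | V3Y | V3Z
  deriving (Eq, Show, Enum, Bounded)

class KDCompare a where
  -- Associated type synonym: maps a point type to an existing key type.
  type Key a :: *
  -- Signed distance between two points along the dimension named by the key.
  dimDistance :: Key a -> a -> a -> Double

instance Real a => KDCompare (V3 a) where
  type Key (V3 a) = KeyV3

  dimDistance k (V3 ax ay az) (V3 bx by bz) = realToFrac $ case k of
    V3X -> ax - bx
    V3Y -> ay - by
    V3Z -> az - bz
```

The only difference from the data-family version is where the key constructors live; the class dictionary is passed around in exactly the same way, which fits the unchanged benchmark numbers.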
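Edit 2
Just had a quick look at the core output. Apparently the change prevents the function that depends on the class from being specialized, right?
Before: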
lvl20 :: KDTree Vector (V3 Double) -> [V3 Double]
lvl20 =
  \ (w4 :: KDTree Vector (V3 Double)) ->
    $wpointsAround $fKDCompareV3_$s$fKDCompareV3 lvl2 lvl4 nrRadius q w4
After:
lvl18 :: KDTree Vector (V3 Double) -> [V3 Double]
lvl18 =
  \ (w4 :: KDTree Vector (V3 Double)) ->
    $wpointsAround $dKDCompare lvl1 lvl3 nrRadius q w4
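Small update to Edit 2: going crazy with INLINE pragmas does not change anything.
Edit 3
Quickly implemented userxyz's suggestion: http://lpaste.net/104457
I have been there before and could not get it to work: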
src/Data/KDTree.hs:48:49:
    Could not deduce (k ~ KeyV3)
    from the context (Real a, Floating a)
      bound by the instance declaration at src/Data/KDTree.hs:45:10-49
    or from (Key k)
      bound by the type signature for
                 dimDistance :: Key k => k -> V3 a -> V3 a -> Double
      at src/Data/KDTree.hs:47:3-13
      ‘k’ is a rigid type variable bound by
          the type signature for
            dimDistance :: Key k => k -> V3 a -> V3 a -> Double
          at src/Data/KDTree.hs:47:3
    Relevant bindings include
      k :: k (bound at src/Data/KDTree.hs:47:15)
      dimDistance :: k -> V3 a -> V3 a -> Double
        (bound at src/Data/KDTree.hs:47:3)
    In the pattern: V3X
    In a case alternative: V3X -> ax - bx
    In the second argument of ‘($)’, namely
      ‘case k of {
         V3X -> ax - bx
         V3Y -> ay - by
         V3Z -> az - bz }’
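Edit 4
Hmm... I think I just "solved" the problem by throwing a SPECIALIZE pragma at the function. This actually causes everything to be inlined and removes the explicit dictionary passing.
I am not too happy with this solution, because it means I have to put a big "please specialize your calls for good performance" warning in the documentation.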
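For readers who have not used the pragma, here is a minimal sketch of what specializing a class-polymorphic function looks like. The function withinAlong is a hypothetical, much simplified stand-in for pointsAround, and V3 is again a local stand-in for the type from linear; only the pragma itself is the point of the example.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Stand-in for linear's V3, so the sketch needs no extra packages.
data V3 a = V3 a a a

class KDCompare a where
  data Key a :: *
  dimDistance :: Key a -> a -> a -> Double

instance Real a => KDCompare (V3 a) where
  data Key (V3 a) = V3X | V3Y | V3Z

  dimDistance k (V3 ax ay az) (V3 bx by bz) = realToFrac $ case k of
    V3X -> ax - bx
    V3Y -> ay - by
    V3Z -> az - bz

-- Hypothetical stand-in for pointsAround: keep the points that lie within
-- radius r of q along the single dimension named by k.
withinAlong :: KDCompare a => Key a -> Double -> a -> [a] -> [a]
withinAlong k r q = filter (\p -> abs (dimDistance k q p) <= r)

-- Ask GHC for a copy of withinAlong at this concrete type, so that calls at
-- V3 Double can use it instead of passing a KDCompare dictionary at runtime.
{-# SPECIALIZE withinAlong
      :: Key (V3 Double) -> Double -> V3 Double -> [V3 Double] -> [V3 Double] #-}
```

GHC also accepts the spelling SPECIALISE. The drawback is the one described above: the pragma has to name every concrete point type that should be fast.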