11

我需要能够给出 SHA512 哈希的十六进制表示。也许我只是看起来不够努力,但我可以在 Hackage 上找到任何功能来做到这一点。所以我写了一个使用unfoldrN. 对于我的目的来说,它绝对足够快,但我想知道是否有人知道更快的方法。

我已将我的实现放在 Github 上作为要点:https ://gist.github.com/2356925 。该文件还包括一个基于Numeric.showHexQuickCheck 测试和标准基准的简单实现。我目前的简单版本与unfoldrN版本的结果是:

benchmarking simple
mean: 4.677296 ms, lb 4.656011 ms, ub 4.696684 ms, ci 0.950
std dev: 104.2791 us, lb 87.77023 us, ub 128.1627 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) low mild
variance introduced by outliers: 15.195%
variance is moderately inflated by outliers

benchmarking unfoldrN_MS1
mean: 370.0101 us, lb 365.9819 us, ub 373.8619 us, ci 0.950
std dev: 20.17016 us, lb 16.92772 us, ub 24.08982 us, ci 0.950
found 14 outliers among 100 samples (14.0%)
  7 (7.0%) low mild
  7 (7.0%) high mild
variance introduced by outliers: 52.467%
variance is severely inflated by outliers

有人想尝试改进它吗?

4

3 回答 3

8

往下走,

import Data.ByteString.Internal
import Foreign.Ptr
import Foreign.Storable
import qualified Data.ByteString as B
import Data.ByteString.Unsafe
import Data.Bits
import Data.Word

maxLen :: Int
maxLen = maxBound `quot` 2

hexDig :: Word8 -> Word8
hexDig d
    | d < 10    = d + 48
    | otherwise = d + 87

toHex :: ByteString -> ByteString
toHex bs
    | len > maxLen = error "too long to convert"
    | otherwise    = unsafeCreate nl (go 0)
      where
        len = B.length bs
        nl  = 2*len
        go i p
            | i == len  = return ()
            | otherwise = case unsafeIndex bs i of
                            w -> do poke p (hexDig $ w `shiftR` 4)
                                    poke (p `plusPtr` 1) (hexDig $ w .&. 0xF)
                                    go (i+1) (p `plusPtr` 2)

我可以将盒子上的转换时间再减少大约 3.5 倍。再sample长一点(25000),我得到了

benchmarking simple
mean: 13.76532 ms, lb 13.64184 ms, ub 13.88680 ms, ci 0.950
std dev: 633.2413 us, lb 582.6342 us, ub 687.9701 us, ci 0.950
variance introduced by outliers: 44.438%
variance is moderately inflated by outliers

benchmarking unfoldrN_MS1
mean: 430.5705 us, lb 424.9206 us, ub 438.5689 us, ci 0.950
std dev: 33.85429 us, lb 26.25623 us, ub 45.74915 us, ci 0.950
found 4 outliers among 100 samples (4.0%)
  3 (3.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 69.726%
variance is severely inflated by outliers

benchmarking LowHex
mean: 123.6000 us, lb 123.0551 us, ub 124.7084 us, ci 0.950
std dev: 3.837497 us, lb 1.869370 us, ub 6.470112 us, ci 0.950
found 6 outliers among 100 samples (6.0%)
  4 (4.0%) high mild
  2 (2.0%) high severe
variance introduced by outliers: 25.818%
variance is moderately inflated by outliers

对于最初的 500 长sample,它是

benchmarking simple
mean: 2.603306 ms, lb 2.583054 ms, ub 2.629212 ms, ci 0.950
std dev: 116.5341 us, lb 81.61409 us, ub 191.3293 us, ci 0.950
found 7 outliers among 100 samples (7.0%)
  2 (2.0%) low severe
  3 (3.0%) low mild
  1 (1.0%) high severe
variance introduced by outliers: 42.490%
variance is moderately inflated by outliers

benchmarking unfoldrN_MS1
mean: 83.19349 us, lb 82.88474 us, ub 83.58283 us, ci 0.950
std dev: 1.771460 us, lb 1.486104 us, ub 2.174729 us, ci 0.950
found 14 outliers among 100 samples (14.0%)
  12 (12.0%) high mild
  2 (2.0%) high severe
variance introduced by outliers: 14.225%
variance is moderately inflated by outliers

benchmarking LowHex
mean: 24.50564 us, lb 24.41683 us, ub 24.61241 us, ci 0.950
std dev: 497.1908 ns, lb 415.6366 ns, ub 609.7594 ns, ci 0.950
found 5 outliers among 100 samples (5.0%)
  5 (5.0%) high mild
variance introduced by outliers: 13.256%
variance is moderately inflated by outliers
于 2012-04-11T09:55:57.477 回答
6

您正在寻找的函数是Data.ByteString.Builder.byteStringHex(或其用于 Lazy ByteStrings 的孪生函数),它由新的字节串生成器提供。我扩展了您的基准测试并在我的机器上获得了以下结果:

benchmarking size 5000/simple
mean: 2.469847 ms, lb 2.440422 ms, ub 2.522850 ms, ci 0.950
std dev: 196.5903 us, lb 116.8811 us, ub 318.4720 us, ci 0.950
found 16 outliers among 100 samples (16.0%)
  3 (3.0%) low severe
  2 (2.0%) low mild
  10 (10.0%) high severe
variance introduced by outliers: 70.721%
variance is severely inflated by outliers

benchmarking size 5000/unfoldrN_MS1
mean: 102.6075 us, lb 101.7695 us, ub 104.0159 us, ci 0.950
std dev: 5.468574 us, lb 3.681120 us, ub 8.080665 us, ci 0.950
found 16 outliers among 100 samples (16.0%)
  6 (6.0%) high mild
  10 (10.0%) high severe
variance introduced by outliers: 51.455%
variance is severely inflated by outliers

benchmarking size 5000/byteStringHexFixed
mean: 5.675204 us, lb 5.636296 us, ub 5.750211 us, ci 0.950
std dev: 264.3726 ns, lb 140.9738 ns, ub 398.8494 ns, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) high severe
variance introduced by outliers: 44.476%
variance is moderately inflated by outliers

我喜欢这个数字。太糟糕了,我的字节串库补丁仍在 Duncan Coutts 的审查中。最迟将在下一个 GHC 版本中提供新的字节串构建器。

于 2012-04-25T08:13:15.270 回答
2

好像刚用过

hex :: B.ByteString -> String
hex = concatMap (printf "%02x") . B.unpack

上次我想这样做。这与 Crypto.Hash 库 iirc 结合使用。我怀疑性能有那么好,但与(慢)sha512 函数本身相比,为什么十六进制转换会成为问题?

于 2012-04-19T05:35:50.237 回答