
I have a function PI (input 0 or 1) that gives PI[0] = -1, PI[1] = 1.

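Given a byte B, I want a function that computes the minimum excess of PI over B from left to right, that is, the smallest prefix sum of PI applied to the bits of B, starting at the most significant bit. Likewise, I need a function that computes the maximum excess of PI over B from left to right. Examples: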

PI_MIN[0] = -8, PI_MAX[0] = -1

PI_MIN[1] = -7, PI_MAX[1] = -1

PI_MIN[2] = -6, PI_MAX[2] = -1

PI_MIN[3] = -6, PI_MAX[3] = -1
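To make the definition concrete, here is a minimal reference sketch of the naive computation. The names `PrefixExtrema` and `prefix_extrema` are illustrative and not part of the question; the most-significant-bit-first order is inferred from the examples above (for instance PI_MIN[1] = -7).

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative result type (not from the question).
struct PrefixExtrema {
    int min;  // smallest prefix sum
    int max;  // largest prefix sum
};

// Scan the bits from most significant (left) to least significant (right),
// adding +1 for a set bit and -1 for a clear bit, and track the extrema
// of the running sum.
constexpr PrefixExtrema prefix_extrema(std::uint8_t byte) {
    int sum = 0;
    int lo = 9;   // prefix sums of 8 bits always lie in [-8, 8]
    int hi = -9;
    for (int i = 7; i >= 0; --i) {
        sum += ((byte >> i) & 1) ? 1 : -1;
        lo = std::min(lo, sum);
        hi = std::max(hi, sum);
    }
    return {lo, hi};
}
```

With this sketch, `prefix_extrema(1)` yields a minimum of -7 and a maximum of -1, matching the second example.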

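Currently I precompute the function values, store them in a generic table, and look them up at runtime. Alternatively, I compute the result naively (a for loop over the bits). The tables for PI_MIN and PI_MAX are: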

static constexpr int8_t PI_MIN[] { -8, -7, -6, -6, -6, -5, -5, -5, -6, -5, -4, -4, -4, -4, -4, -4, -6, -5, -4, -4, -4, -3, -3, -3, -4, -3, -3, -3, -3, -3, -3, -3, -6, -5, -4, -4, -4, -3, -3, -3, -4, -3, -2, -2, -2, -2, -2, -2, -4, -3, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -6, -5, -4, -4, -4, -3, -3, -3, -4, -3, -2, -2, -2, -2, -2, -2, -4, -3, -2, -2, -2, -1, -1, -1, -2, -1, -1, -1, -1, -1, -1, -1, -4, -3, -2, -2, -2, -1, -1, -1, -2, -1, -1, -1, -1, -1, -1, -1, -2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -6, -5, -4, -4, -4, -3, -3, -3, -4, -3, -2, -2, -2, -2, -2, -2, -4, -3, -2, -2, -2, -1, -1, -1, -2, -1, -1, -1, -1, -1, -1, -1, -4, -3, -2, -2, -2, -1, -1, -1, -2, -1, 0, 0, 0, 0, 0, 0, -2, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -4, -3, -2, -2, -2, -1, -1, -1, -2, -1, 0, 0, 0, 0, 0, 0, -2, -1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, -2, -1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };

static constexpr int8_t PI_MAX[] { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, -1, -1, -1, -1, -1, -1, -1, 0, -1, -1, -1, 0, 0, 0, 1, 2, -1, -1, -1, -1, -1, -1, -1, 0, -1, -1, -1, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 1, 2, 1, 1, 1, 2, 2, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 1, 2, 1, 1, 1, 2, 2, 2, 3, 4, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 3, 4, 2, 2, 2, 2, 2, 2, 3, 4, 3, 3, 3, 4, 4, 4, 5, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 3, 4, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 3, 4, 2, 2, 2, 2, 2, 2, 3, 4, 3, 3, 3, 4, 4, 4, 5, 6, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 2, 2, 2, 2, 2, 2, 3, 4, 3, 3, 3, 4, 4, 4, 5, 6, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 4, 4, 5, 6, 4, 4, 4, 4, 4, 4, 5, 6, 5, 5, 5, 6, 6, 6, 7, 8 };
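Instead of pasting literal tables, the same values could be generated at compile time. The following is a sketch that assumes C++17; `make_prefix_table`, `PI_MIN_GEN`, and `PI_MAX_GEN` are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: build all 256 entries at compile time.
// want_min selects the minimum (true) or maximum (false) prefix sum.
constexpr std::array<std::int8_t, 256> make_prefix_table(bool want_min) {
    std::array<std::int8_t, 256> table{};
    for (int value = 0; value < 256; ++value) {
        int sum = 0;
        int best = want_min ? 9 : -9;  // outside the reachable range [-8, 8]
        for (int i = 7; i >= 0; --i) {
            sum += ((value >> i) & 1) ? 1 : -1;
            if (want_min ? (sum < best) : (sum > best)) best = sum;
        }
        table[static_cast<std::size_t>(value)] = static_cast<std::int8_t>(best);
    }
    return table;
}

constexpr auto PI_MIN_GEN = make_prefix_table(true);
constexpr auto PI_MAX_GEN = make_prefix_table(false);

// Spot checks against values listed in the question.
static_assert(PI_MIN_GEN[0] == -8 && PI_MAX_GEN[0] == -1, "first example");
static_assert(PI_MIN_GEN[255] == 1 && PI_MAX_GEN[255] == 8, "last table entries");
```

Generating the tables this way keeps them in sync with the definition and makes it cheap to add further tables for related functions.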

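Unfortunately, I cannot find a pattern for all the functions I need (for example PI_MIN and PI_MAX, but there are more). The question is: how can I find out whether such a function can be computed in a non-naive way, i.e. without a left-to-right for loop over the bits of the input byte? My goal is maximum performance, because this function sits in the innermost loop of a larger program.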

I appreciate any hints!
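One structural property that may help when looking for patterns: prefix extrema compose. If a byte is split into a high and a low nibble, the minimum prefix sum is min(min_hi, sum_hi + min_lo), and analogously for the maximum. The sketch below uses that identity with a 16-entry table. It is an illustration only, all names are hypothetical, and whether it beats a 256-entry lookup or a loop has not been measured here.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical per-nibble summary (names are illustrative).
struct NibbleInfo {
    std::int8_t sum;  // total of the four +1/-1 terms
    std::int8_t min;  // smallest prefix sum inside the nibble
    std::int8_t max;  // largest prefix sum inside the nibble
};

constexpr std::array<NibbleInfo, 16> make_nibble_table() {
    std::array<NibbleInfo, 16> table{};
    for (int v = 0; v < 16; ++v) {
        int sum = 0;
        int lo = 5;   // outside the reachable range [-4, 4]
        int hi = -5;
        for (int i = 3; i >= 0; --i) {
            sum += ((v >> i) & 1) ? 1 : -1;
            lo = std::min(lo, sum);
            hi = std::max(hi, sum);
        }
        table[static_cast<std::size_t>(v)] = NibbleInfo{
            static_cast<std::int8_t>(sum),
            static_cast<std::int8_t>(lo),
            static_cast<std::int8_t>(hi)};
    }
    return table;
}

constexpr auto NIBBLE = make_nibble_table();

// The high nibble is scanned first, so the low nibble's prefix sums
// are offset by the high nibble's total.
inline int pi_min_nibbles(std::uint8_t byte) {
    const NibbleInfo& hi = NIBBLE[byte >> 4];
    const NibbleInfo& lo = NIBBLE[byte & 0x0F];
    return std::min<int>(hi.min, hi.sum + lo.min);
}

inline int pi_max_nibbles(std::uint8_t byte) {
    const NibbleInfo& hi = NIBBLE[byte >> 4];
    const NibbleInfo& lo = NIBBLE[byte & 0x0F];
    return std::max<int>(hi.max, hi.sum + lo.max);
}
```

The same composition rule extends to wider words, since each extra chunk only needs its own sum, minimum, and maximum.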


2 Answers


A branch-free version of pi_min (assuming the loop gets unrolled).

#include <algorithm>  // std::min
#include <cstdint>    // uint8_t

/*
  Calculate:
    min(
      pi(b7),
      pi(b7)+pi(b6),
      pi(b7)+pi(b6)+pi(b5),
      pi(b7)+pi(b6)+pi(b5)+pi(b4),
      pi(b7)+pi(b6)+pi(b5)+pi(b4)+pi(b3),
      pi(b7)+pi(b6)+pi(b5)+pi(b4)+pi(b3)+pi(b2),
      pi(b7)+pi(b6)+pi(b5)+pi(b4)+pi(b3)+pi(b2)+pi(b1),
      pi(b7)+pi(b6)+pi(b5)+pi(b4)+pi(b3)+pi(b2)+pi(b1)+pi(b0))

  Where,
    pi(b) = b ? 1 : -1
  and bits in byte b are numbered with the least significant bit (LSB) as 0.

  This problem is essentially one of counting leading zeros, where a run
  of leading zeros may be interrupted by a one as long as that one is later
  cancelled by a zero. If the first bit is a one, the count starts at -1.

  The algorithm uses two stacks, "c0" and "c1". c0 is the leading zero count
  and c1 is a stack of potentially intervening 1's.

  foreach bit (following 4 cases are mutually exclusive, only 1 will execute)
    0: if the '1' stack is empty => push a '0' onto the '0' stack
    0: if the '1' stack is not empty => pop a '1' 
    1: if the first bit is a '1' => put the '0' stack in underflow state
    1: if it is not the first bit => push a '1' onto the '1' stack
  return -c0 because zeros actually count as -1
*/
int pi_min(uint8_t byte) {
  int c0 = 0;
  int c1 = 0;

  for (int i = 0; i < 8; ++i) {
    uint8_t b = !!(byte & (1 << (7-i)));
    c0 -= (b & (i == 0));
    c0 += ((!b) & (0 >= c1));
    c1 -= ((!b) & (0 < c1));
    c1 += (b & (i != 0));
  }
  return -c0;
}

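// The maximum prefix sum of a byte is the negated minimum of its bitwise complement.
int pi_max(uint8_t byte) { return -pi_min(static_cast<uint8_t>(~byte)); }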

// The obvious version for comparison.
int pi(uint8_t bit) { return bit ? 1 : -1; }

int pi_min_simple(uint8_t byte) {
  int sum = 0;
  int m = 9;

  for (int i = 0; i < 8; ++i) {
    uint8_t b = byte & (1 << (7-i));
    sum += pi(b);
    m = std::min(m, sum);
  }
  return m;
}
Answered 2013-10-31T05:08:22.087

Sorry for the delay. I have now measured the performance of the different approaches.

http://s12.postimg.org/v400xibxp/prefix_Sums.png

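I am glad to see that the solution proposed by Adam Burry performs very well (yellow line). As you can see, for both the min and the max prefix-sum computation, even the simple algorithm is slightly faster than the table lookup (green and brown lines), and the two are indeed very close. The most surprising result (at least to me) is the poor performance of maxExcess, which, as Adam Burry suggested, simply returns -pi_min(~byte), where pi_min is the function behind the yellow line. I suspect this comes from the extra cost of computing the bitwise complement of every byte analyzed, so I will switch to the original algorithm (pi_min) and return -c1 to implement pi_max.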
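One possible way to avoid the complement is to mirror the roles of 0 and 1 inside the loop itself. The sketch below is derived by substituting the complemented bit into Adam Burry's pi_min; it is not the measured code, `pi_max_direct` is a hypothetical name, and it returns the mirrored counter directly rather than -c1.

```cpp
#include <cstdint>

// Hypothetical mirrored variant of pi_min: "ones" tracks the maximum
// prefix sum so far, "zeros" counts pending 0 bits that must be
// cancelled by 1 bits before the maximum can grow again.
int pi_max_direct(std::uint8_t byte) {
    int ones = 0;
    int zeros = 0;
    for (int i = 0; i < 8; ++i) {
        const int b = (byte >> (7 - i)) & 1;
        ones  -= (!b) & (i == 0);   // leading 0: the maximum starts at -1
        ones  += b & (zeros <= 0);  // 1 with nothing pending: new maximum
        zeros -= b & (zeros > 0);   // 1 cancels one pending 0
        zeros += (!b) & (i != 0);   // later 0: one more pending
    }
    return ones;
}
```

The update order matters: `ones` must be updated before `zeros` is decremented, mirroring the order in pi_min.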

Answered 2013-11-06T23:29:27.903