c++ - How large is the branch prediction buffer for a typical modern CPU?

Question

The application I'm dealing with has a large number of if-statements with the characteristic that, in any one execution, only one of the branches is executed 90% of the time.

Now, I can test the impact of branch prediction on a single if-statement for a specific CPU by doing something like this:

#include <iostream>
#include <stdlib.h>

using namespace std;

int main() {
  int a;
  cin>>a;
  srand(a);
  int b;

  long long count = 0; // 64-bit: the total exceeds the 32-bit range where long is 32 bits

  for (int i=0; i<10000; i++) {
    for (int j=0; j<65535; j++) {
      b = rand() % 30 + 1;
      if (b > 15) // This can be changed to get statistics for different %-ages
        count += (b+10);
    }
  }

  cout << count <<"\n";
}

My question is: is there a way of testing the scalability and impact of branch prediction with multiple if-statements in an actual large application for a given CPU?

Basically, I want to be able to figure out how much branch mispredictions cost on various CPUs and how they affect the application.

score 4 · Accepted Answer

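You need to take the complexity of your branches into account: the compiler may remove a branch entirely by using architecture-specific opcodes such as CMOV (conditional move).

Your simple example code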

if (b > 15)
    count += (b+10);

Here is how it might be compiled to machine code, with and without a branch:

;; assembly x86 FASM/NASM syntax

;; WITH branching
MOV ebx, [b] ;; b
MOV ecx, [count] ;; count
CMP ebx, 15 ;; if condition to set flags
JLE .skip ;; { branch/jump over the if body when less than or equal
LEA eax, [ecx + ebx + 10] ;; count + b+10
MOV [count], eax ;; store count
.skip: ;; } label after the if block

;; WITHOUT branching
MOV ebx, [b] ;; b
MOV ecx, [count] ;; count
LEA eax, [ecx + ebx + 10] ;; pre-calc avoiding the need to branch
CMP ebx, 15 ;; if condition to set flags
CMOVLE eax, ecx ;; make eax equal to ecx (current count) when less than or equal
            ;; avoiding the branch/jump
MOV [count], eax ;; store count
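At the source level, the two variants above correspond roughly to the following C++ sketch. The function names `add_branchy` and `add_branchless` are hypothetical, and the mask trick is only one way to express a branch-free update; an optimizing compiler may turn either form into either kind of machine code.

```cpp
#include <cstdint>

// Branching form: mirrors the original if-statement.
// The compiler is free to emit a conditional jump or a CMOV for this.
std::int64_t add_branchy(std::int64_t count, int b) {
    if (b > 15)
        count += b + 10;
    return count;
}

// Branch-free form: the comparison yields 0 or 1, which is negated into an
// all-zeros or all-ones mask, so the addend is either dropped or kept.
std::int64_t add_branchless(std::int64_t count, int b) {
    const std::int64_t mask = -static_cast<std::int64_t>(b > 15);
    return count + ((b + 10) & mask);
}
```

Both functions return the same value for every input, so swapping one for the other changes only how the work is done, not the result.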

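So unless you know how your optimizing compiler has optimized your code, it is somewhat difficult to profile branch prediction. If you are inspecting your machine code output and know that you have a lot of J[condition] instructions, then using the code profiling tool mentioned in the comments will suffice. Trying to roll your own branch prediction test without using the proper architecture debug registers will lead to the situation I demonstrated above.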
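To see whether a given build of the loop is sensitive to branch prediction at all, one option is to time the same loop over predictable and unpredictable input. The following is a minimal sketch under stated assumptions: `RunResult`, `run_loop` and `make_data` are hypothetical helpers, the seed and value range are arbitrary, and wall-clock time is only a rough stand-in for a hardware misprediction count.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

struct RunResult {
    std::int64_t count;  // checksum; returning it keeps the loop from being discarded
    double millis;       // wall-clock time for all passes
};

// Runs the question's loop body over `data` for `passes` passes and times it.
RunResult run_loop(const std::vector<int>& data, int passes) {
    const auto start = std::chrono::steady_clock::now();
    std::int64_t count = 0;
    for (int p = 0; p < passes; ++p)
        for (int b : data)
            if (b > 15)
                count += b + 10;
    const auto stop = std::chrono::steady_clock::now();
    return {count, std::chrono::duration<double, std::milli>(stop - start).count()};
}

// Builds n values in 1..30 from a fixed seed (an arbitrary choice).
// Sorting makes the branch outcome change only once per pass.
std::vector<int> make_data(std::size_t n, bool sorted) {
    std::mt19937 gen(12345);
    std::uniform_int_distribution<int> dist(1, 30);
    std::vector<int> v(n);
    for (int& x : v) x = dist(gen);
    if (sorted) std::sort(v.begin(), v.end());
    return v;
}
```

Calling `run_loop(make_data(n, false), passes)` and `run_loop(make_data(n, true), passes)` should give identical `count` values. If the two `millis` values are also close, the compiler has probably removed the branch (for example with CMOV or vectorization), which is exactly the situation described above. For actual misprediction counts, a profiler that reads the hardware performance counters is more reliable, for example `perf stat -e branches,branch-misses` on Linux.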
