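I currently run BOINC on a number of servers that have GPUs.

The servers run both GPU and CPU BOINC applications.

Because AVX and SSE lower the CPU frequency when CPU applications use them, I have to choose which CPU and GPU applications run together: some GPU applications get bottlenecked (they take longer to complete) while others do not.

Currently, some CPU applications are named so that it is clear whether they use AVX, but most are not.

So, is there any command I can run, or some way to check, whether any currently running CPU application is using AVX or SSE (any version)?

As a side note, should I treat any FMA usage the same way (for example, does it also slow down the CPU frequency because of the higher CPU temperature)?

Thanks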

1 Answer

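You can use perf top to watch, in real time, how many AVX and SSE instructions are being executed, together with the names of the executables and shared libraries that execute them: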

perf top -e fp_arith_inst_retired.128b_packed_single -e fp_arith_inst_retired.128b_packed_double -e fp_arith_inst_retired.256b_packed_single -e fp_arith_inst_retired.256b_packed_double
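To check one specific BOINC task rather than the whole machine, the same events can be counted for a single process with perf stat. The following is a minimal sketch, not a definitive recipe: the helper names count_simd and summarize are hypothetical, the events are assumed to exist on the CPU in question, and the CSV column order (count first, event name third) is assumed from the usual perf stat -x output and may differ between perf versions.

```shell
#!/bin/sh
# Events taken from the perf top command above; availability depends on the
# CPU model and the perf build (assumption: they are listed by "perf list").
EVENTS="fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.256b_packed_single,fp_arith_inst_retired.256b_packed_double"

# Hypothetical helper: count the events for one PID for a number of seconds.
# "sleep" only acts as a timer here; -p attaches the counters to the PID.
# perf stat writes its results to stderr, hence the redirection.
count_simd() {
    pid=$1
    secs=${2:-10}
    perf stat -x, -e "$EVENTS" -p "$pid" -- sleep "$secs" 2>&1
}

# Hypothetical helper: reduce CSV lines (count,unit,event,...) to
# "event count" pairs, ignoring any line that is not one of our events.
summarize() {
    awk -F, '$3 ~ /^fp_arith_inst_retired/ { print $3, $1 }'
}
```

A call such as count_simd 12345 10 | summarize (12345 being a placeholder PID) would then print one line per event. Nonzero 256-bit counts indicate packed AVX floating-point work; the 128-bit events cover both SSE and 128-bit AVX instructions, so they cannot separate the two.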

Counter descriptions (from the perf list output on an Intel Coffee Lake CPU):

floating point:
  fp_arith_inst_retired.128b_packed_double          
       [Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired. Each count represents 2 computations. Applies to SSE* and AVX*
        packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
        multiple calculations per element]
  fp_arith_inst_retired.128b_packed_single          
       [Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
        packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
        perform multiple calculations per element]
  fp_arith_inst_retired.256b_packed_double          
       [Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
        packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
        multiple calculations per element]
  fp_arith_inst_retired.256b_packed_single          
       [Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired. Each count represents 8 computations. Applies to SSE* and AVX*
        packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
        perform multiple calculations per element]
  fp_arith_inst_retired.scalar_double               
       [Number of SSE/AVX computational scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double
        precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element]
  fp_arith_inst_retired.scalar_single               
       [Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single
        precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations
        per element]
  fp_assist.any                                     
       [Cycles with any input/output SSE or FP assist]
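The "Each count represents N computations" notes above suggest a simple weighting if you want to compare how much of a task's floating-point work is packed (SSE/AVX vector) rather than scalar. The sketch below assumes a hypothetical input format of one "event count" pair per line, and the helper name flops_estimate is made up for illustration; the weights are the per-count computation figures quoted in the descriptions.

```shell
#!/bin/sh
# Hypothetical helper: read "event count" lines on stdin and print the
# estimated number of FP computations in total and in packed instructions.
# Weights follow the perf list text: scalar = 1, 128b double = 2,
# 128b single = 4, 256b double = 4, 256b single = 8.
flops_estimate() {
    awk '
    {
        w = 0
        if      ($1 ~ /\.scalar_(single|double)/) w = 1
        else if ($1 ~ /\.128b_packed_double/)     w = 2
        else if ($1 ~ /\.128b_packed_single/)     w = 4
        else if ($1 ~ /\.256b_packed_double/)     w = 4
        else if ($1 ~ /\.256b_packed_single/)     w = 8
        total += w * $2               # every recognized event
        if (w > 1) packed += w * $2   # packed (vector) events only
    }
    END { printf "total=%.0f packed=%.0f\n", total, packed }'
}
```

For example, 100 scalar double counts, 50 256-bit packed double counts and 10 128-bit packed single counts give total=340 and packed=240, so roughly 70% of the counted computations were packed. Lines with unrecognized event names get weight 0 and do not affect the result.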
Answered 2020-02-20T23:01:09.583