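I am trying to evaluate the performance of the Rocket core using the existing benchmarks and some other applications.
After running the emulator with the benchmark MT-MATMUL and looking at mt-matmul.riscv.out, I noticed a lot of stalls. Can someone explain how I can determine the cause of these stalls? I would expect the branch in this simple loop to be predicted correctly, so there should not be stalls that slow the processor down this much.
See the following log: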
C0: 1956 [1] pc=[0000000628] W[r 0=0000000000001418][0] R[r 5=0000000000001418] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)
C0: 1957 [1] pc=[000000062c] W[r 0=0000000000007730][0] R[r16=0000000000007730] R[r 0=0000000000000000] inst=[00083007] fld ft0, 0(a6)
C0: 1958 [1] pc=[0000000630] W[r 0=0000000000003380][0] R[r 6=0000000000003380] R[r 0=0000000000000000] inst=[00033087] fld ft1, 0(t1)
**C0: 1959 [1] pc=[0000000634] W[r16=0000000000007738][1] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8
C0: 1960 [0] pc=[0000000634] W[r 0=0000000000007738][0] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8
C0: 1961 [0] pc=[0000000634] W[r 0=0000000000007738][0] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8**
C0: 1962 [1] pc=[0000000638] W[r17=0000000000000014][1] R[r17=0000000000000013] R[r 1=0000000000000013] inst=[0018889b] addiw a7, a7, 1
C0: 1963 [1] pc=[000000063c] W[r 0=0000000000000001][0] R[r 1=0000000000000013] R[r 2=0000000000000013] inst=[0220f043] fmadd.d ft0, ft1, ft2, ft0
C0: 1964 [1] pc=[0000000640] W[r 5=0000000000001420][1] R[r 5=0000000000001418] R[r 8=0000000000000013] inst=[00828293] addi t0, t0, 8
C0: 1965 [0] pc=[0000000640] W[r 0=0000000000001420][0] R[r 5=0000000000001418] R[r 8=0000000000000013] inst=[00828293] addi t0, t0, 8
C0: 1966 [0] pc=[0000000640] W[r 0=0000000000001420][0] R[r 5=0000000000001418] R[r 8=0000000000000013] inst=[00828293] addi t0, t0, 8
C0: 1967 [1] pc=[0000000644] W[r 0=0000000000007730][0] R[r16=0000000000007738] R[r 0=0000000000000000] inst=[fe083c27] fsd ft0, -8(a6)
C0: 1968 [1] pc=[0000000648] W[r 0=0000000000000001][0] R[r12=0000000000000020] R[r17=0000000000000014] inst=[ff1610e3] bne a2, a7, pc - 32
C0: 1969 [1] pc=[0000000628] W[r 0=0000000000001420][0] R[r 5=0000000000001420] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)
C0: 1970 [1] pc=[000000062c] W[r 0=0000000000007738][0] R[r16=0000000000007738] R[r 0=0000000000000000] inst=[00083007] fld ft0, 0(a6)
C0: 1971 [1] pc=[0000000630] W[r 0=0000000000003380][0] R[r 6=0000000000003380] R[r 0=0000000000000000] inst=[00033087] fld ft1, 0(t1)
C0: 1972 [1] pc=[0000000634] W[r16=0000000000007740][1] R[r16=0000000000007738] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1973 [0] pc=[0000000634] W[r 0=0000000000007740][0] R[r16=0000000000007738] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1974 [0] pc=[0000000634] W[r 0=0000000000007740][0] R[r16=0000000000007738] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1975 [1] pc=[0000000638] W[r17=0000000000000015][1] R[r17=0000000000000014] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
**C0: 1982 [1] pc=[0000000628] W[r 0=0000000000001428][0] R[r 5=0000000000001428] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)
C0: 1983 [1] pc=[000000062c] W[r 0=0000000000007740][0] R[r16=0000000000007740] R[r 0=0000000000000000] inst=[00083007] fld ft0, 0(a6)
C0: 1984 [1] pc=[0000000630] W[r 0=0000000000003380][0] R[r 6=0000000000003380] R[r 0=0000000000000000] inst=[00033087] fld ft1, 0(t1)
C0: 1985 [1] pc=[0000000634] W[r16=0000000000007748][1] R[r16=0000000000007740] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1986 [0] pc=[0000000634] W[r 0=0000000000007748][0] R[r16=0000000000007740] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1987 [0] pc=[0000000634] W[r 0=0000000000007748][0] R[r16=0000000000007740] R[r 8=0000000000000017] inst=[00880813] addi a6, a6, 8
C0: 1988 [1] pc=[0000000638] W[r17=0000000000000016][1] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1989 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1990 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1991 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1992 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1993 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1994 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1995 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1996 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1997 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1998 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 1999 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2000 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2001 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2002 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2003 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2004 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2005 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2006 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2007 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2008 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2009 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2010 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2011 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2012 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2013 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2014 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2015 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2016 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2017 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2018 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2019 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2020 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1
C0: 2021 [0] pc=[0000000638] W[r 0=0000000000000016][0] R[r17=0000000000000015] R[r 1=0000000000000017] inst=[0018889b] addiw a7, a7, 1**
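
One way to put numbers on the stalls in a log like the one above is to count the cycles in which nothing commits. The following is a minimal sketch, not a Rocket tool: it assumes the `[1]`/`[0]` field right after the cycle count is the write-back valid flag, and the names `LINE_RE` and `stall_histogram` are made up for illustration. Note that the pc printed on a `[0]` line may be the last instruction that reached write-back rather than the instruction that is actually waiting, so the histogram only shows where the bubbles appear.

```python
import re
from collections import Counter

# Assumed line layout (taken from the log above):
#   C0: <cycle> [<valid>] pc=[<hex>] W[...][.] R[...] R[...] inst=[<hex>] <disasm>
LINE_RE = re.compile(
    r"C\d+:\s+(\d+)\s+\[([01])\]\s+pc=\[([0-9a-fA-F]+)\]"
    r".*?inst=\[([0-9a-fA-F]+)\]\s*(.*)"
)

def stall_histogram(lines):
    """Count cycles whose valid flag is 0, grouped by (pc, disassembly)."""
    stalls = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a trace line
        _cycle, valid, pc, _inst, disasm = m.groups()
        if valid == "0":
            # Strip whitespace and any stray markup such as "**".
            stalls[(int(pc, 16), disasm.strip(" *"))] += 1
    return stalls

# Four lines copied from the log above (cycles 1959 to 1962).
SAMPLE = [
    "C0: 1959 [1] pc=[0000000634] W[r16=0000000000007738][1] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8",
    "C0: 1960 [0] pc=[0000000634] W[r 0=0000000000007738][0] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8",
    "C0: 1961 [0] pc=[0000000634] W[r 0=0000000000007738][0] R[r16=0000000000007730] R[r 8=0000000000000013] inst=[00880813] addi a6, a6, 8",
    "C0: 1962 [1] pc=[0000000638] W[r17=0000000000000014][1] R[r17=0000000000000013] R[r 1=0000000000000013] inst=[0018889b] addiw a7, a7, 1",
]

for (pc, disasm), count in stall_histogram(SAMPLE).most_common():
    print(f"{pc:#x}  {count:4d} stalled cycles  {disasm}")
```

Feeding the whole mt-matmul.riscv.out file through `stall_histogram` (for example with `open(path)` as the `lines` argument) would show which program counters accumulate the most dead cycles.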
The assembly code is simple:
5e8: 02b645bb divw a1,a2,a1
5ec: 02a5853b mulw a0,a1,a0
5f0: 00a585bb addw a1,a1,a0
5f4: 06b55e63 ble a1,a0,670 <matmul+0x88>
5f8: 02a6083b mulw a6,a2,a0
5fc: 00361e93 slli t4,a2,0x3
600: 00381813 slli a6,a6,0x3
604: 010787b3 add a5,a5,a6
608: 010686b3 add a3,a3,a6
60c: 04c05863 blez a2,65c <matmul+0x74>
610: 00070e13 mv t3,a4
614: 00068313 mv t1,a3
618: 00000393 li t2,0
61c: 000e0293 mv t0,t3
620: 00078813 mv a6,a5
624: 00000893 li a7,0
628: 0002b107 fld ft2,0(t0)
62c: 00083007 fld ft0,0(a6)
630: 00033087 fld ft1,0(t1) # 3000 <input2_data+0x1c80>
634: 00880813 addi a6,a6,8
638: 0018889b addiw a7,a7,1
63c: 0220f043 fmadd.d ft0,ft1,ft2,ft0
640: 00828293 addi t0,t0,8
644: fe083c27 fsd ft0,-8(a6)
648: ff1610e3 bne a2,a7,628 <matmul+0x40>
64c: 0013839b addiw t2,t2,1
650: 01de0e33 add t3,t3,t4
654: 00830313 addi t1,t1,8
658: fc7612e3 bne a2,t2,61c <matmul+0x34>
65c: 0015051b addiw a0,a0,1
660: 01d787b3 add a5,a5,t4
664: 01d686b3 add a3,a3,t4
668: fab512e3 bne a0,a1,60c <matmul+0x24>
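
The inner loop in this listing runs from 0x628 to the `bne` at 0x648, which is nine instructions. A rough way to see how many cycles each iteration really takes is to look at the cycle numbers at which the loop head commits. This is a minimal sketch under the same assumption that `[1]` marks a committed instruction; `COMMIT_RE` and `cycles_per_iteration` are hypothetical names, and the three sample lines are copied from the log.

```python
import re

# Matches only committed lines ("[1]") and captures the cycle and the pc.
COMMIT_RE = re.compile(r"C\d+:\s+(\d+)\s+\[1\]\s+pc=\[([0-9a-fA-F]+)\]")

def cycles_per_iteration(lines, loop_head_pc):
    """Return the cycle distance between successive commits of loop_head_pc."""
    head_cycles = []
    for line in lines:
        m = COMMIT_RE.search(line)
        if m and int(m.group(2), 16) == loop_head_pc:
            head_cycles.append(int(m.group(1)))
    return [later - earlier for earlier, later in zip(head_cycles, head_cycles[1:])]

# The three commits of the loop head (pc 0x628) visible in the log.
SAMPLE = [
    "C0: 1956 [1] pc=[0000000628] W[r 0=0000000000001418][0] R[r 5=0000000000001418] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)",
    "C0: 1969 [1] pc=[0000000628] W[r 0=0000000000001420][0] R[r 5=0000000000001420] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)",
    "C0: 1982 [1] pc=[0000000628] W[r 0=0000000000001428][0] R[r 5=0000000000001428] R[r 0=0000000000000000] inst=[0002b107] fld ft2, 0(t0)",
]

print(cycles_per_iteration(SAMPLE, 0x628))  # prints [13, 13]
```

For the excerpt shown, that is 13 cycles for 9 instructions in each of the first two iterations, so about 4 cycles per iteration are bubbles even before the long stall that starts at cycle 1989.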