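I am trying to adapt the following ScaLAPACK example program for use as a coarse-grained parallel benchmark in my experiments.
I added the lines marked "<- added this" to the code: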
      START_TIME = MPI_WTIME()                    ! <- added this
      CALL PDGESV( N, NRHS, MEM( IPA ), 1, 1, DESCA, MEM( IPPIV ),
     $             MEM( IPB ), 1, 1, DESCB, INFO )
*
      IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN
         WRITE( NOUT, FMT = * )
         WRITE( NOUT, FMT = * ) 'INFO code returned by PDGESV = ', INFO
         WRITE( NOUT, FMT = * )
         WRITE( NOUT, FMT = * ) 'Matrix X = A^{-1} * B'
         WRITE( NOUT, FMT = * )
      END IF
      CALL PDLAPRNT( N, NRHS, MEM( IPB ), 1, 1, DESCB, 0, 0, 'X', NOUT,
     $               MEM( IPW ) )
      CALL PDLAWRITE( 'SCAEXSOL.dat', N, NRHS, MEM( IPB ), 1, 1, DESCB,
     $                0, 0, MEM( IPW ) )
*
*     Compute residual ||A * X - B|| / ( ||X|| * ||A|| * eps * N )
      EPS = PDLAMCH( ICTXT, 'Epsilon' )
      ANORM = PDLANGE( 'I', N, N, MEM( IPA ), 1, 1, DESCA, MEM( IPW ) )
      BNORM = PDLANGE( 'I', N, NRHS, MEM( IPB ), 1, 1, DESCB,
     $        MEM( IPW ) )
      CALL PDGEMM( 'No transpose', 'No transpose', N, NRHS, N, ONE,
     $             MEM( IPACPY ), 1, 1, DESCA, MEM( IPB ), 1, 1, DESCB,
     $             -ONE, MEM( IPX ), 1, 1, DESCX )
      XNORM = PDLANGE( 'I', N, NRHS, MEM( IPX ), 1, 1, DESCX,
     $        MEM( IPW ) )
      RESID = XNORM / ( ANORM * BNORM * EPS * DBLE( N ) )
      ELAPSED_TIME = MPI_WTIME() - START_TIME     ! <- added this
*
      IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN
         WRITE( NOUT, FMT = * )
         WRITE( NOUT, FMT = * )
     $      '||A * X - B|| / ( ||X|| * ||A|| * eps * N ) = ', RESID
         WRITE( NOUT, FMT = * )
         IF( RESID.LT.10.0D+0 ) THEN
            WRITE( NOUT, FMT = * ) 'The answer is correct.'
            WRITE( NOUT, FMT = * ) 1000.0*ELAPSED_TIME   ! <- added this
         ELSE
            WRITE( NOUT, FMT = * ) 'The answer is suspicious.'
            WRITE( NOUT, FMT = * ) 1000.0*ELAPSED_TIME   ! <- added this
         END IF
      END IF
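Now the elapsed times I get are not consistent at all: repeated runs give completely different execution times.
I am running this as a cluster job submitted with qsub. Is there a way to obtain the execution time from the batch system itself, without changing the code?
For my experiments I need a small number of large blocks. When I try to increase the block size in SCAEX.dat,
for example from: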
'ScaLAPACK Example Program 2'
'May 1997'
'SCAEX.out' output file name (if any)
400 device out
400 value of N
400 value of NRHS
200 values of NB
2 values of NPROW
2 values of NPCOL
to:
'ScaLAPACK Example Program 2'
'May 1997'
'SCAEX.out' output file name (if any)
400 device out
400 value of N
400 value of NRHS
400 values of NB
1 values of NPROW
1 values of NPCOL
I get:
Unable to perform test: need TOTMEM of at least 5126408
Bad MEMORY parameters: going on to next test case.
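For reference, the figure in that message can be reproduced by hand. The count below is only a sketch and rests on assumptions that are not shown in this post: that the program keeps four local double-precision arrays (A, a copy of A, B and X), a pivot vector and a workspace of NB words in one MEM array, with 8-byte doubles and 4-byte integers, and that the check is done on the 1-based index past the workspace. With N = NRHS = NB = 400 on a 1 x 1 grid, a single process holds everything:

```latex
\begin{align*}
\text{four local arrays } (A,\ A_{\text{copy}},\ B,\ X) &: 4 \times 400 \times 400 = 640000 \text{ words} \\
\text{pivot vector} &: \max\!\left(400,\ \left\lceil \tfrac{4\,(400 + 400)}{8} \right\rceil\right) = 400 \text{ words} \\
\text{workspace } (NB) &: 400 \text{ words} \\
\text{total, 1-based} &: 1 + 640000 + 400 + 400 = 640801 \text{ words} \\
\text{in bytes} &: 640801 \times 8 = 5126408
\end{align*}
```

This matches the reported 5126408 exactly. Under the same assumptions, the 2 x 2 grid with NB = 200 needs only (1 + 4 x 200 x 200 + 200 + 200) x 8 = 1283208 bytes per process, about a quarter as much, which would explain why the first configuration runs and the second does not. TOTMEM appears to be a compile-time PARAMETER in the example source, so raising it would presumably require editing the source and recompiling.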