python - 在 Python 中使用 matix 向量乘法进行速度优化

Question

我想在python中优化以下代码：

for imode in N.arange(3*natom): #Loop on perturbation (6 for 2 atoms)
  for ikpt in N.arange(nkpt):
    for iband in N.arange(nband):    
      for iatom1 in N.arange(natom):
        for iatom2 in N.arange(natom):
          for idir1 in N.arange(0,3):
            for idir2 in N.arange(0,3):    
              fan_corrQ[imode,ikpt,iband] += EIG2D[ikpt,iband,idir1,iatom1,idir2,iatom2]*\
                  displ_red_FAN2[imode,iatom1,iatom2,idir1,idir2]
              ddw_corrQ[imode,ikpt,iband] += ddw_save[ikpt,iband,idir1,iatom1,idir2,iatom2]*\
                  displ_red_DDW2[imode,iatom1,iatom2,idir1,idir2]

如您所见，我想对我的多空间 python 数组的一些索引求和。我想要类似的东西：

for imode in N.arange(3*natom): #Loop on perturbation (6 for 2 atoms)
  for ikpt in N.arange(nkpt):
    for iband in N.arange(nband):
      fan_corrQ[imode,ikpt,iband] = N.dot(EIG2D[ikpt,iband,:,:,:,:],displ_red_FAN2.T[imode,:,:,:,:])
      ddw_corrQ[imode,ikpt,iband] = N.dot(ddw_save[ikpt,iband,:,:,:,:],displ_red_DDW2.T[imode,:,:,:,:])

当然，我有一个问题是不能乘以相同的索引，所以我重新定义了它。我还必须指出，我正在处理复数：

displ_red_DDW2 = N.zeros((3*natom,3,natom,3,natom),dtype=complex)

我还在 ipython 中尝试了一个小的虚拟程序来测试它：

import numpy as N
atom =2
displ_red_FAN2 = N.zeros((3*natom,3,natom,3,natom),dtype=complex)
EIG2D = N.zeros((216,12,3,2,3,2))

所以我有 displ_red_FAN2.shape = (6, 3, 2, 3, 2) 和 EIG2D.shape = (216, 12, 3, 2, 3, 2)

因此，如果我执行以下操作：

N.dot(EIG2D[1,1,:,:,:,:],displ_red_FAN2[1,:,:,:,:].T).shape

它应该给出 (3,2,3,2) 但它给出的是 (3, 2, 3, 2, 3, 3) ??? 然后当乘法完成时，我想我将不得不做某种求和来降低维数。

任何帮助都会很棒！

干杯!

塞缪尔

score 4 · Accepted Answer

I think you could massively simplify this by using Einstein summation (np.einsum). The syntax can be a bit tricky to get your head around, so I've simplified the names of your variables and indices a bit:

# arrays
EIG2D           --> A
displ_red_FAN2  --> B
fan_corrQ       --> C

# indices
ikpt    --> i
iband   --> j
idir1   --> k
iatom1  --> l
idir2   --> m
iatom2  --> n
imode   --> o

np.einsum takes a comma-separated list of subscripts, each of which refers to a dimension of the corresponding input array. Whenever an index is repeated, it is summed in the output. You can also specify the order of summation in the output by giving the output indices as well.

In your case, I think that this:

...
fan_corrQ[imode,ikpt,iband] += EIG2D[ikpt,iband,idir1,iatom1,idir2,iatom2]*\
    displ_red_FAN2[imode,iatom1,iatom2,idir1,idir2]
...

should simplify to this:

C = np.einsum('ijklmn,olnkm->oij',A,B)

You should have a play around with this and make sure I haven't made any mistakes! As with sgpc's answer, similar caveats apply in terms of memory requirements.

score 3 · Accepted Answer

您可以使用np.tensordot()

fan_corrQ = np.tensordot(displ_red_FAN2, EIG2D, axes = ([3,1,4,2],[2,3,4,5]))
ddw_corrQ = np.tensordot(displ_red_DDW2, ddw_save, axes = ([3,1,4,2],[2,3,4,5]))

这给出了与您当前的方法相同的结果，并且9 times速度更快。

关于你的另一个问题。np.dot()for a对第一个数组的最后一个轴和第二个数组的倒数第二个轴ND-array求和。

python - 在 Python 中使用 matix 向量乘法进行速度优化

2 回答 2

Related

Reference