python - numpy 更高效的子矩阵

Question

目前我正在使用下面的代码来获取删除第 i 行和第 j 列的子矩阵，但是在分析我的代码之后，它似乎是我代码中的主要瓶颈之一。有没有更有效的方法？

def submatrix(A, i, j):
    logger.debug('submatrix(%r, %r, %r)', A, i, j)
    B = empty(shape=tuple(x - 1 for x in A.shape), dtype=int)
    B[:i, :j] = A[:i, :j]
    B[i:, :j] = A[i+1:, :j]
    B[:i, j:] = A[:i, j+1:]
    B[i:, j:] = A[i+1:, j+1:]
    return B

         25015049 function calls (24599369 primitive calls) in 44.587 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  3983040   15.541    0.000   20.719    0.000 defmatrix.py:301(__getitem__)
   415680   10.216    0.000   33.069    0.000 hill.py:127(submatrix)
 415686/6    3.232    0.000   44.578    7.430 hill.py:112(det)

编辑：Jaime 提供了一种使用正则逆和行列式来近似模逆的好方法，但是对于大基数（在我的情况下为模 256），不准确性足以使整个事情变得毫无意义。主要时间接收器似乎实际上是numpy中的 getitem ，但我相信这是由以下几行引起的：

    B[:i, :j] = A[:i, :j]
    B[i:, :j] = A[i+1:, :j]
    B[:i, j:] = A[:i, j+1:]
    B[i:, j:] = A[i+1:, j+1:]

瓶颈可能不是在内存中复制矩阵，而是矩阵条目访问。

score 1 · Accepted Answer

无论您如何处理子矩阵，使用行列式计算矩阵的逆是一种非常缓慢的方法。让我们举一个愚蠢的例子：

a = np.array([[3, 0, 2],
              [2, 0, -2],
              [0, 1, 1]])

您可以快速计算逆：

>>> np.linalg.inv(a)
array([[ 0.2,  0.2,  0. ],
       [-0.2,  0.3,  1. ],
       [ 0.2, -0.3, -0. ]])

但是要计算模逆，您需要将其作为整数矩阵除以整数因子。该整数因子当然是决定因素，因此您可以执行以下操作：

>>> np.linalg.inv(a) * np.linalg.det(a)
array([[  2.,   2.,   0.],
       [ -2.,   3.,  10.],
       [  2.,  -3.,  -0.]])

的倒数a是这个整数矩阵，除以的行列式a。作为一个函数，你可以这样做：

def extended_euclidean_algorithm(a, b) :
    """
    Computes a solution to a x + b y = gcd(a,b), as well as gcd(a,b),
    using the extended Euclidean algorithm.
    """
    if b == 0 :
        return 1, 0, a
    else :
        x, y, gcd = extended_euclidean_algorithm(b, a % b)
        return y, x - y * (a // b), gcd

def mmi(a, m) :
    """
    Computes the modular multiplicative inverse of a modulo m, using the
    extended Euclidean algorithm.
    """
    x, y, gcd = extended_euclidean_algorithm(a, m)
    if gcd == 1 :
        return x % m
    else :
        return None

def modular_inv(a, m):
    det_a = np.linalg.det(a)
    inv_a = np.linalg.inv(a) * det_a
    det_a = np.rint(det_a).astype(int)
    inv_a = np.rint(inv_a).astype(int)
    return ((inv_a % m) * mmi(det_a, m)) % m

现在：

>>> a = np.random.randint(10, size=(10, 10))
>>> b = modular_inv(a, 7)
>>> a.dot(b) % 7
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])

score 1 · Accepted Answer

据我所知，submatrix只需删除第ith 行和j第 th 列。你可以这样做np.delete

i = 3
j = 4
a = np.arange(100).reshape(10,10)
b = np.delete(np.delete(a, i, 0), j, 1)

但是，对于@Jaime 引用的理由，这实际上更慢:-/

timeit submatrix(a, i, j)
#10000 loops, best of 3: 23.2 us per loop

timeit subdel(a, i, j)
#10000 loops, best of 3: 42.6 us per loop

但我暂时把它留在这里。

score 1 · Accepted Answer

嗯...您只是在复制矩阵，因此可能很难加快速度，但是您可以尝试的一件事是验证 A 是否位于连续的内存块中，这可以加快 C 代码的访问速度。看看numpy.ascontiguousarray()。

python - numpy 更高效的子矩阵

3 回答 3

Related

Reference