python - Ignoring a dimension when using np.einsum

Question

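I use np.einsum to calculate the flow of material in a graph (from 1 node to 4 nodes in this example). The amount of flow is given by amount (amount.shape == (1, 1, 2); the dimensions define certain criteria, let's call them a, b, c).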

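The boolean matrix route determines the permissible flow into y based on the a, b, c criteria (route.shape == (4, 1, 1, 2); yabc). I label the dimensions y, a, b, c. The dimensions abc are equivalent to the abc dimensions of amount, and y is the direction of the flow (0, 1, 2 or 3). To determine the amount of material in y, I calculate np.einsum('abc,yabc->y', amount, route) and get a y-dim vector with the flows into y.

There is also an implicit prioritisation of the routes. For instance, wherever route[0, ...] == True, the routes for y=1..3 are False; wherever route[1, ...] == True, the next higher y-dim routes are False, and so on. route[3, ...] (the last y-index) defines the catch-all route, that is, its values are True when the previous y-index values were False ((route[0] ^ route[1] ^ route[2] ^ route[3]).all() == True).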
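A minimal sketch of this base case (no x dimension yet) may help. The array values below are illustrative assumptions chosen to match the shapes described above; they are not taken from the question.

```python
import numpy as np

# Assumed example data: one amount per (a, b, c) cell, shape (1, 1, 2).
amount = np.array([[[5000.0, 0.0]]])

# route[y, a, b, c] is True where material in cell (a, b, c) may flow to y.
route = np.zeros((4, 1, 1, 2), dtype=bool)
route[1, 0, 0, 0] = True    # cell c=0 is routed to y=1
route[0, 0, 0, 1] = True    # cell c=1 is routed to y=0

# Here every cell is claimed by exactly one y, so the XOR check passes.
exclusive = (route[0] ^ route[1] ^ route[2] ^ route[3]).all()

# Sum amount over a, b, c for each destination y.
flows = np.einsum('abc,yabc->y', amount, route)
# flows == [0, 5000, 0, 0] and flows.sum() == amount.sum()
```

Because each cell has a single True along y, the total flow equals the total amount; that property is what breaks once an extra dimension is added to route.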

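This works fine. However, when I introduce another criterion (dimension) x that exists only in route and not in amount, this logic seems to break. The code below demonstrates the problem: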

>>> import numpy as np

>>> amount = np.asarray([[[5000.0, 0.0]]]) 

>>> route = np.asarray([[[[[False, True]]], [[[False, True]]], [[[False, True]]]], [[[[True, False]]], [[[False, False]]], [[[False, False]]]], [[[[False, False]]], [[[True, False]]], [[[False, False]]]], [[[[False, False]]], [[[False, False]]], [[[True, False]]]]], dtype=bool)

>>> amount.shape
(1, 1, 2)

>>> # Added dimension `x`
>>> # y,x,a,b,c
>>> route.shape
(4, 3, 1, 1, 2)

>>> # Attempt 1: `5000` can flow into y=1, 2 or 3. I expect
>>> # `flows1.sum() == amount.sum()` as it would be without `x`.
>>> # Correct solution would be `[0, 5000, 0, 0]` because material is routed
>>> # to y=1, and is not available for y=2 and y=3 as they are lower
>>> # priority (higher index)
>>> flows1 = np.einsum('abc,yxabc->y', amount, route)
>>> flows1
array([   0., 5000., 5000., 5000.])

>>> # Attempt 2: try to collapse `x` => not much different, duplication
>>> np.einsum('abc,yabc->y', amount, route.any(1))
array([   0., 5000., 5000., 5000.])

>>> # This is the flow by `y` and `x`. I'd only expect a `5000` in the
>>> # 2nd row (`[5000.,    0.,    0.]`) not the others.
>>> np.einsum('abc,yxabc->yx', amount, route)
array([[   0.,    0.,    0.],
       [5000.,    0.,    0.],
       [   0., 5000.,    0.],
       [   0.,    0., 5000.]])

Is there any feasible operation that can be applied to route (.all(1) doesn't work either) to ignore the x dimension?

Another example:

>>> amount2 = np.asarray([[[5000.0, 1000.0]]])
>>> np.einsum('abc,yabc->y', amount2, route.any(1))
array([1000., 5000., 5000., 5000.])

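This can be interpreted as 1000.0 being routed to y=0 (and to no other y destination) and 5000.0 being compatible with destinations y=1, y=2 and y=3. Ideally, though, I only want 5000.0 to show up in y=1 (as that is the lowest index and the highest destination priority).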

Solution attempt

The code below works, but it is not very numpy-ish. It would be great if the loop could be eliminated.

# Initialise destination
result = np.zeros((route.shape[0]))
# Calculate flow by maintaining all dimensions (this will cause
# double ups because `x` is not part of `amount2`)
temp = np.einsum('abc,yxabc->yxabc', amount2, route)
temp_ixs = np.asarray(np.where(temp))

# For each original amount, find the destination (`y`)
for a, b, c in zip(*np.where(amount2)):
    # Find where dimensions `abc` are equal in the destination.
    # Take the first vector which contains `yxabc` (we get `yx` as result)
    ix = np.where((temp_ixs[2:].T == [a, b, c]).all(axis=1))[0][0]
    y_ix = temp_ixs.T[ix][0]
    # ignored
    x_ix = temp_ixs.T[ix][1]
    v = amount2[a, b, c]
    # build resulting destination
    result[y_ix] += v

# result == array([1000., 5000.,    0.,    0.])

In other words, for each value in amount2, I am looking for the lowest indices yx in temp so that the value can be written to result[y] = value (x is ignored).

>>> temp = np.einsum('abc,yxabc->yx', amount2, route)
>>> temp
        #  +--- value=1000 at y=0 => result[0] += 1000
        # /
array([[1000., 1000., 1000.],
        #  +--- value=5000 at y=1 => result[1] += 5000
        # /
       [5000.,    0.,    0.],
       [   0., 5000.,    0.],
       [   0.,    0., 5000.]])
>>> result
array([1000., 5000.,    0.,    0.])
>>> amount2
array([[[5000., 1000.]]])

Another attempt at reducing the dimensionality of route is:

>>> r = route.any(1)
>>> for x in range(1, route.shape[0]):
        r[x] = r[x] & (r[:x] == False).all(axis=0)

>>> np.einsum('abc,yabc->y', amount2, r)
array([1000., 5000.,    0.,    0.])

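This basically preserves the above-mentioned priority given by the first dimension of route. Any lower-priority (higher-index) array cannot contain a True value where a higher-priority array already has a True value at that sub-index. Although this is much better than my explicit approach, it would be great if the for loop could be expressed as a numpy vector operation.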
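One possible way to express that loop without Python-level iteration is a cumulative count along y. This is a sketch under the assumption that "first True along y wins" is the intended rule; the name first is introduced here for illustration, and the arrays are rebuilt to match the values shown in the question.

```python
import numpy as np

amount2 = np.array([[[5000.0, 1000.0]]])       # shape (a, b, c) = (1, 1, 2)

# Routing table with the question's values, shape (y, x, a, b, c) = (4, 3, 1, 1, 2).
route = np.zeros((4, 3, 1, 1, 2), dtype=bool)
route[0, :, 0, 0, 1] = True
route[1, 0, 0, 0, 0] = True
route[2, 1, 0, 0, 0] = True
route[3, 2, 0, 0, 0] = True

r = route.any(axis=1)                          # collapse x, shape (y, a, b, c)

# Running count of True values along y. The first True in each (a, b, c)
# cell is the only True position where the running count equals 1.
first = r & (np.cumsum(r, axis=0) == 1)

flows = np.einsum('abc,yabc->y', amount2, first)
# flows == [1000, 5000, 0, 0]
```

The mask first has at most one True along y per cell, so each amount is counted once and the total flow matches amount2.sum().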

score 0 · Accepted Answer

I haven't tried to follow your "flow" interpretation of the multiplication problem. I'm just focusing on the calculation options.

Stripped of the unnecessary dimensions, your arrays are:

In [194]: amount                                                                                       
Out[194]: array([5000.,    0.])
In [195]: route                                                                                        
Out[195]: 
array([[[0, 1],
        [0, 1],
        [0, 1]],

       [[1, 0],
        [0, 0],
        [0, 0]],

       [[0, 0],
        [1, 0],
        [0, 0]],

       [[0, 0],
        [0, 0],
        [1, 0]]])

The yx calculation is:

In [197]: np.einsum('a,yxa->yx',amount, route)                                                         
Out[197]: 
array([[   0.,    0.,    0.],
       [5000.,    0.,    0.],
       [   0., 5000.,    0.],
       [   0.,    0., 5000.]])

This is just 5000 times a slice of route:

In [198]: route[:,:,0]                                                                                 
Out[198]: 
array([[0, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

Omitting x from the RHS of the einsum results in a summation across that dimension.

Equivalently, we can multiply (using broadcasting):

In [200]: (amount*route).sum(axis=2)                                                                   
Out[200]: 
array([[   0.,    0.,    0.],
       [5000.,    0.,    0.],
       [   0., 5000.,    0.],
       [   0.,    0., 5000.]])
In [201]: (amount*route).sum(axis=(1,2))                                                               
Out[201]: array([   0., 5000., 5000., 5000.])
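The equivalence between the einsum forms and the broadcast-and-sum forms can be checked directly. The sketch below rebuilds the squeezed arrays with the values shown above; the names by_yx and by_y are introduced here for illustration.

```python
import numpy as np

amount = np.array([5000.0, 0.0])               # shape (a,)

# Squeezed routing table, shape (y, x, a) = (4, 3, 2).
route = np.zeros((4, 3, 2), dtype=int)
route[0, :, 1] = 1
route[1, 0, 0] = 1
route[2, 1, 0] = 1
route[3, 2, 0] = 1

by_yx = np.einsum('a,yxa->yx', amount, route)  # x kept in the output
by_y = np.einsum('a,yxa->y', amount, route)    # x omitted, so it is summed over

# Dropping x from the output is the same as summing the yx result over x,
# and both match the broadcast product summed over the same axes.
same_as_sum = np.array_equal(by_y, by_yx.sum(axis=1))
same_as_broadcast = np.array_equal(by_y, (amount * route).sum(axis=(1, 2)))
```

Both comparisons evaluate to True, which is why the 5000 appears once per x column rather than once overall.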

Maybe looking at amount*route will help visualize the problem. You could also use max, min, argmax, etc. on one or more axes, instead of sum or along with it.
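As one hedged illustration of the argmax suggestion: on a boolean array, argmax along y returns the index of the first True, which matches the "lowest y wins" priority described in the question. This sketch assumes that priority rule and uses the second example's amounts with the dimensions squeezed; the names first_y and routed are hypothetical.

```python
import numpy as np

amount = np.array([5000.0, 1000.0])            # the question's amount2, squeezed

# Squeezed routing table, shape (y, x, a) = (4, 3, 2).
route = np.zeros((4, 3, 2), dtype=bool)
route[0, :, 1] = True
route[1, 0, 0] = True
route[2, 1, 0] = True
route[3, 2, 0] = True

r = route.any(axis=1)                          # collapse x, shape (y, a)
first_y = r.argmax(axis=0)                     # index of the first True along y
routed = r.any(axis=0)                         # False where a cell has no route at all

# Accumulate each routed amount into its highest-priority destination.
flows = np.bincount(first_y[routed], weights=amount[routed],
                    minlength=r.shape[0])
# flows == [1000, 5000, 0, 0]
```

The routed mask matters because argmax also returns 0 for a column that is entirely False; without it, unroutable amounts would be credited to y=0.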
