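I use `np.einsum` to calculate the flow of material in a graph (from 1 node to 4 nodes in this example). The amount of flow is given by `amount` (`amount.shape == (1, 1, 2)`; the dimensions define certain criteria, let's call them `a`, `b`, `c`).
The boolean matrix `route` determines the permissible flow into `y` based on the `a`, `b`, `c` criteria (`route.shape == (4, 1, 1, 2)`; `yabc`). I label the dimensions `y`, `a`, `b`, `c`. `abc` are equivalent to the `abc` dimensions of `amount`, and `y` is the direction of the flow (0, 1, 2 or 3). To determine the amount of material in `y`, I calculate `np.einsum('abc,yabc->y', amount, route)` and get a y-dim vector with the flows into `y`. There is also an implicit prioritisation of the routes. For instance, any `route[0, ...] == True` is `False` for any `y=1..3`, any `route[1, ...] == True` is `False` for the next higher y-dim routes, and so on. `route[3, ...]` (last y-index) defines the catch-all route, that is, its values are `True` when the previous y-index values were `False` (`(route[0] ^ route[1] ^ route[2] ^ route[3]).all() == True`).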
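For reference, a minimal sketch of the setup described above, without any extra dimension. The `route` values here are invented for illustration (they are an assumption, not my real data); they are chosen so that every `abc` cell has exactly one destination `y`.

```python
import numpy as np

# Flow amounts, indexed by the criteria a, b, c -> shape (1, 1, 2)
amount = np.asarray([[[5000.0, 0.0]]])

# Hypothetical routing table (illustrative values), indexed y, a, b, c
route = np.zeros((4, 1, 1, 2), dtype=bool)
route[1, 0, 0, 0] = True   # cell c=0 is routed to y=1
route[0, 0, 0, 1] = True   # cell c=1 is routed to y=0

# Every (a, b, c) cell has exactly one destination along y
one_route_per_cell = bool((route.sum(axis=0) == 1).all())

# Sum the amounts over a, b, c for each destination y
flows = np.einsum('abc,yabc->y', amount, route)
```

Because each cell is routed exactly once, `flows.sum()` equals `amount.sum()` here.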
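This works fine. However, when I introduce another criterion (dimension) `x` that exists only in `route` and not in `amount`, this logic seems to break. The code below demonstrates the problem: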
>>> import numpy as np
>>> amount = np.asarray([[[5000.0, 0.0]]])
>>> route = np.asarray([[[[[False, True]]], [[[False, True]]], [[[False, True]]]], [[[[True, False]]], [[[False, False]]], [[[False, False]]]], [[[[False, False]]], [[[True, False]]], [[[False, False]]]], [[[[False, False]]], [[[False, False]]], [[[True, False]]]]], dtype=bool)
>>> amount.shape
(1, 1, 2)
>>> # Added dimension `x`
>>> # y,x,a,b,c
>>> route.shape
(4, 3, 1, 1, 2)
>>> # Attempt 1: `5000` can flow into y=1, 2 or 3. I expect
>>> # `flows1.sum() == amount.sum()` as it would be without `x`.
>>> # Correct solution would be `[0, 5000, 0, 0]` because material is routed
>>> # to y=1, and is not available for y=2 and y=3 as they are lower
>>> # priority (higher index)
>>> flows1 = np.einsum('abc,yxabc->y', amount, route)
>>> flows1
array([ 0., 5000., 5000., 5000.])
>>> # Attempt 2: try to collapse `x` => not much different, duplication
>>> np.einsum('abc,yabc->y', amount, route.any(1))
array([ 0., 5000., 5000., 5000.])
>>> # This is the flow by `y` and `x`. I'd only expect a `5000` in the
>>> # 2nd row (`[5000., 0., 0.]`) not the others.
>>> np.einsum('abc,yxabc->yx', amount, route)
array([[ 0., 0., 0.],
[5000., 0., 0.],
[ 0., 5000., 0.],
[ 0., 0., 5000.]])
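As far as I can tell, the duplication comes from the fact that `einsum` sums over every `(y, x)` pair that is `True` for a given `abc` cell. A small sketch, rebuilding the same `route` as above with index assignments (the name `hits` is just for illustration):

```python
import numpy as np

amount = np.asarray([[[5000.0, 0.0]]])

# Same content as the `route` literal above, indexed y, x, a, b, c
route = np.zeros((4, 3, 1, 1, 2), dtype=bool)
route[0, :, 0, 0, 1] = True   # c=1 may go to y=0 for every x
route[1, 0, 0, 0, 0] = True   # c=0 may go to y=1 (x=0)
route[2, 1, 0, 0, 0] = True   # c=0 may go to y=2 (x=1)
route[3, 2, 0, 0, 0] = True   # c=0 may go to y=3 (x=2)

# Number of (y, x) pairs that accept each (a, b, c) cell
hits = route.sum(axis=(0, 1))

# Each amount is counted once per accepting (y, x) pair
flows1 = np.einsum('abc,yxabc->y', amount, route)
total = flows1.sum()
```

Every cell is accepted by three `(y, x)` pairs, so the `5000` is counted three times and the total no longer matches `amount.sum()`.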
Is there any feasible operation that can be applied to `route` to ignore the `x` dimension? (`.all(1)` does not work either.)
Another example:
>>> amount2 = np.asarray([[[5000.0, 1000.0]]])
>>> np.einsum('abc,yabc->y', amount2, route.any(1))
array([1000., 5000., 5000., 5000.])
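This can be interpreted as `1000.0` being routed to `y=0` (and no other y destination) and `5000.0` being compatible with destinations `y=1`, `y=2` and `y=3`. Ideally, though, I only want `5000.0` to show up in `y=1`, as that is the lowest index and therefore the highest destination priority.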
Solution attempts
The code below works, but it is not very NumPy-ish. It would be great if the loop could be eliminated.
# Initialise destination
result = np.zeros((route.shape[0]))
# Calculate flow by maintaining all dimensions (this will cause
# double ups because `x` is not part of `amount2`)
temp = np.einsum('abc,yxabc->yxabc', amount2, route)
temp_ixs = np.asarray(np.where(temp))
# For each original amount, find the destination (`y`)
for a, b, c in zip(*np.where(amount2)):
    # Find where dimensions `abc` are equal in the destination.
    # Take the first vector which contains `yxabc` (we get `yx` as result)
    ix = np.where((temp_ixs[2:].T == [a, b, c]).all(axis=1))[0][0]
    y_ix = temp_ixs.T[ix][0]
    # ignored
    x_ix = temp_ixs.T[ix][1]
    v = amount2[a, b, c]
    # build resulting destination
    result[y_ix] += v
# result == array([1000., 5000., 0., 0.])
In other words, for each value in `amount2`, I am looking for the lowest `yx` indices in `temp` so that the value can be written to `result[y] = value` (`x` is ignored).
>>> temp = np.einsum('abc,yxabc->yx', amount2, route)
>>> temp
# +--- value=1000 at y=0 => result[0] += 1000
# /
array([[1000., 1000., 1000.],
# +--- value=5000 at y=1 => result[1] += 5000
# /
[5000., 0., 0.],
[ 0., 5000., 0.],
[ 0., 0., 5000.]])
>>> result
array([1000., 5000., 0., 0.])
>>> amount2
array([[[5000., 1000.]]])
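The "lowest `yx` index per `abc` cell" idea might be expressible without a Python loop. The following is only a sketch under my own assumptions: it relies on `argmax` returning the first `True` along an axis of a boolean array and on C-order flattening of `(y, x)`; the names `flat`, `first`, `has_route` and `result_vec` are made up for illustration.

```python
import numpy as np

amount2 = np.asarray([[[5000.0, 1000.0]]])

# Same content as the `route` literal above, indexed y, x, a, b, c
route = np.zeros((4, 3, 1, 1, 2), dtype=bool)
route[0, :, 0, 0, 1] = True
route[1, 0, 0, 0, 0] = True
route[2, 1, 0, 0, 0] = True
route[3, 2, 0, 0, 0] = True

n_y, n_x = route.shape[:2]

# Merge y and x into one axis; flat index = y * n_x + x (C order)
flat = route.reshape(n_y * n_x, *route.shape[2:])

# For a boolean array, argmax gives the first True along the axis
first = flat.argmax(axis=0)
y_ix = first // n_x                 # recover y, ignore x

# Cells without any True would also yield argmax == 0, so mask them out
has_route = flat.any(axis=0)

# Unbuffered add, so repeated destinations accumulate correctly
result_vec = np.zeros(n_y)
np.add.at(result_vec, y_ix[has_route], amount2[has_route])
```

With the data above this should reproduce the `result` of the explicit loop.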
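Another attempt at reducing the dimensions of `route` is:
>>> r = route.any(1)
>>> for x in range(1, route.shape[0]):
...     r[x] = r[x] & (r[:x] == False).all(axis=0)
...
>>> np.einsum('abc,yabc->y', amount2, r)
array([1000., 5000., 0., 0.])
This basically preserves the above-mentioned priority given by the first dimension of `route`. Any lower-priority (higher-index) array cannot contain a `True` value where a higher-priority array already has a `True` value at that sub-index. While this is much better than my explicit approach, it would be great if the `for x in range...` loop could be expressed as a NumPy vector operation.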
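One possible direction, sketched here under the assumption that "a higher-priority array already has a `True`" can be computed as a running logical OR along `y`, is `np.logical_or.accumulate`. The names `seen`, `taken` and `first_only` are illustrative only.

```python
import numpy as np

amount2 = np.asarray([[[5000.0, 1000.0]]])

# Same content as the `route` literal above, indexed y, x, a, b, c
route = np.zeros((4, 3, 1, 1, 2), dtype=bool)
route[0, :, 0, 0, 1] = True
route[1, 0, 0, 0, 0] = True
route[2, 1, 0, 0, 0] = True
route[3, 2, 0, 0, 0] = True

r = route.any(axis=1)                         # collapse x -> (y, a, b, c)

# seen[y] is True where r is True for some y' <= y
seen = np.logical_or.accumulate(r, axis=0)

# taken[y] is True where r is True for some strictly lower y' < y
taken = np.zeros_like(seen)
taken[1:] = seen[:-1]

# Keep only the first (highest-priority) True along y for each cell
first_only = r & ~taken

flows = np.einsum('abc,yabc->y', amount2, first_only)
```

On the example data this gives the same vector as the loop version, but I have not checked how it behaves on larger `route` arrays.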