python - Eliminating for loops in a numpy implementation

Question

I have the following dataset in numpy:

indices | real data (X)    |targets (y)
        |                  |
0   0   | 43.25 665.32 ... |2.4      } 1st block
0   0   | 11.234           |-4.5     }
0   1     ...               ...      } 2nd block
0   1                                } 
0   2                                } 3rd block
0   2                                }
1   0                                } 4th block
1   0                                }
1   0                                }
1   1                       ...
1   1                       
1   2
1   2
2   0
2   0 
2   1
2   1
2   1
...

These are my variables:

idx1 = data[:,0]
idx2 = data[:,1]
X = data[:,2:-1]
y = data[:,-1]

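I also have a variable W, which is a 3D array.

What I need to do in the code is loop through all the blocks in the dataset, compute a scalar for each block, then sum all the scalars and store the result in a variable called cost. The problem is that the loop implementation is very slow, so I am trying to vectorize it if possible. This is my current code. Is it possible to do this in numpy without for loops?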

IDX1 = 0
IDX2 = 1

# get unique indices
idx1s = np.arange(len(np.unique(data[:,IDX1])))
idx2s = np.arange(len(np.unique(data[:,IDX2])))

# initialize global sum variable to 0
cost = 0
for i1 in idx1s:
    for i2 in idx2s:

        # for each block in the dataset
        mask = np.nonzero((data[:,IDX1] == i1) & (data[:,IDX2] == i2))

        # get variables for that block
        curr_X = X[mask,:]
        curr_y = y[mask]
        curr_W = W[:,i2,i1]

        # calculate a scalar  
        pred = np.dot(curr_X,curr_W)
        sigm = 1.0 / (1.0 + np.exp(-pred))
        loss = np.sum((sigm- (0.5)) * curr_y)

        # add result to global cost
        cost += loss

Here is some example data:

data = np.array([[0,0,5,5,7],
                 [0,0,5,5,7],
                 [0,1,5,5,7],
                 [0,1,5,5,7],
                 [1,0,5,5,7],
                 [1,1,5,5,7]])
W = np.zeros((2,2,2))
idx1 = data[:,0]
idx2 = data[:,1]
X = data[:,2:-1]
y = data[:,-1]

score 3 · Accepted Answer

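That W is tricky... Actually, your blocks are pretty much irrelevant, apart from getting the right slice of W to np.dot with the corresponding rows of X, so I took the easy route of creating an aligned_W array as follows: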

aligned_W = W[:, idx2, idx1]

This is an array of shape (2, rows), where rows is the number of rows in your dataset. You can now carry out the whole calculation without any for loops:

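import numpy as np

# row-wise dot products; equivalent to inner1d(X, aligned_W.T) from the
# deprecated numpy.core.umath_tests module
pred = np.einsum('ij,ij->i', X, aligned_W.T)
sigm = 1.0 / (1.0 + np.exp(-pred))
loss = (sigm - 0.5) * y  # the full y: there is no per-block curr_y here
cost = np.sum(loss)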
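As a sanity check, the sketch below compares a block-by-block loop with the aligned_W approach on the sample data from the question. The random W, the seed, the function names cost_loop and cost_vectorized, and the cast of the index columns to int are assumptions added for illustration. The cast matters because fancy indexing needs integer indices, and data is a float array as soon as X holds real values.

```python
import numpy as np

data = np.array([[0, 0, 5, 5, 7],
                 [0, 0, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [1, 0, 5, 5, 7],
                 [1, 1, 5, 5, 7]], dtype=float)

# Assumption: a random W instead of zeros, so the comparison is not trivial.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(2, 2, 2))

idx1 = data[:, 0].astype(int)
idx2 = data[:, 1].astype(int)
X = data[:, 2:-1]
y = data[:, -1]

def cost_loop(X, y, W, idx1, idx2):
    """Reference implementation: one pass per (i1, i2) block."""
    cost = 0.0
    for i1 in np.unique(idx1):
        for i2 in np.unique(idx2):
            mask = (idx1 == i1) & (idx2 == i2)
            pred = X[mask] @ W[:, i2, i1]
            sigm = 1.0 / (1.0 + np.exp(-pred))
            cost += np.sum((sigm - 0.5) * y[mask])
    return cost

def cost_vectorized(X, y, W, idx1, idx2):
    """Loop-free version: pick the matching W column for every row."""
    aligned_W = W[:, idx2, idx1]                 # shape (n_features, n_rows)
    pred = np.einsum('ij,ji->i', X, aligned_W)   # row-wise dot products
    sigm = 1.0 / (1.0 + np.exp(-pred))
    return np.sum((sigm - 0.5) * y)

loop_cost = cost_loop(X, y, W, idx1, idx2)
vec_cost = cost_vectorized(X, y, W, idx1, idx2)
print(np.isclose(loop_cost, vec_cost))
```

Because the final cost is a plain sum over all rows, the block structure only decides which slice of W each row is multiplied with, which is exactly what aligned_W encodes.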

score 2 · Accepted Answer

My guess is that the main reason your code is slow is the following line:

mask = np.nonzero((data[:,IDX1] == i1) & (data[:,IDX2] == i2))

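because for every block you rescan the whole input array to find the few rows of interest. Instead, build a sorted key once, before the loop: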

ni1 = len(np.unique(data[:,IDX1]))
ni2 = len(np.unique(data[:,IDX2]))
idx1s = np.arange(ni1)                         
idx2s = np.arange(ni2)

key = data[:,IDX1] * ni2 + data[:,IDX2] # 1D key to the rows

sortids = np.argsort(key) #indices to the sorted key

Then, inside the loop, instead of

mask=np.nonzero(...)

you need to do

curid = i1 * ni2 + i2
left = np.searchsorted(key, curid, 'left', sorter=sortids)
right = np.searchsorted(key, curid, 'right', sorter=sortids)
mask = sortids[left:right]
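Put together, the loop with the sorted key could look like the sketch below. It assumes the sample data from the question, a random W, and index columns that can be cast to int; the function name cost_sorted is hypothetical.

```python
import numpy as np

def cost_sorted(data, W, IDX1=0, IDX2=1):
    idx1 = data[:, IDX1].astype(int)
    idx2 = data[:, IDX2].astype(int)
    X = data[:, 2:-1]
    y = data[:, -1]

    ni1 = len(np.unique(idx1))
    ni2 = len(np.unique(idx2))

    # One integer key per row; sorting it once groups each block together.
    key = idx1 * ni2 + idx2
    sortids = np.argsort(key)

    cost = 0.0
    for i1 in range(ni1):
        for i2 in range(ni2):
            curid = i1 * ni2 + i2
            # Binary search for the block boundaries instead of a full scan.
            left = np.searchsorted(key, curid, 'left', sorter=sortids)
            right = np.searchsorted(key, curid, 'right', sorter=sortids)
            rows = sortids[left:right]   # row numbers belonging to this block
            pred = X[rows] @ W[:, i2, i1]
            sigm = 1.0 / (1.0 + np.exp(-pred))
            cost += np.sum((sigm - 0.5) * y[rows])
    return cost

data = np.array([[0, 0, 5, 5, 7],
                 [0, 0, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [1, 0, 5, 5, 7],
                 [1, 1, 5, 5, 7]], dtype=float)

# Assumption: a random W, since the all-zero W from the question gives cost 0.
W = np.random.default_rng(0).normal(scale=0.1, size=(2, 2, 2))
total = cost_sorted(data, W)
print(total)
```

The loops are still there, but each iteration now costs two binary searches plus the work on its own rows, rather than a comparison over the entire dataset.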

score 1 · Accepted Answer

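I don't think there is a way to compare numpy arrays of different sizes without using for loops. It is hard to say what the meaning and shape of the output of the following comparison should be: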

[0,1,2,3,4] == [3,4,2]

The only suggestion I can give you is to get rid of one of the for loops by using itertools.product:

import itertools as it

[...]

idx1s = np.unique(data[:,IDX1])
idx2s = np.unique(data[:,IDX2])

# initialize global sum variable to 0
cost = 0
for i1, i2 in it.product(idx1s, idx2s):

    # for each block in the dataset
    mask = np.nonzero((data[:,IDX1] == i1) & (data[:,IDX2] == i2))

    # get variables for that block
    curr_X = X[mask,:]
    curr_y = y[mask]
    [...]

You can also keep mask as a boolean array:

mask = (data[:,IDX1] == i1) & (data[:,IDX2] == i2)

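The output is the same, and you have to spend memory on creating the bool array anyway. Skipping np.nonzero saves you some memory and a function evaluation.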
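A small illustration of the two masks, using the sample data from the question (the chosen block i1, i2 = 0, 1 is arbitrary). Indexing with the tuple returned by np.nonzero adds a leading axis of length 1 to the selected rows of X, while the boolean mask selects the rows directly; the selected values are the same either way.

```python
import numpy as np

data = np.array([[0, 0, 5, 5, 7],
                 [0, 0, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [0, 1, 5, 5, 7],
                 [1, 0, 5, 5, 7],
                 [1, 1, 5, 5, 7]])
X = data[:, 2:-1]
y = data[:, -1]
i1, i2 = 0, 1

bool_mask = (data[:, 0] == i1) & (data[:, 1] == i2)
nz_mask = np.nonzero(bool_mask)   # a tuple holding one array of row numbers

X_bool = X[bool_mask, :]          # rows of the block, 2D
X_nz = X[nz_mask, :]              # same rows, with an extra leading axis
print(X_bool.shape, X_nz.shape)
print(np.array_equal(y[bool_mask], y[nz_mask]))
```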

EDIT

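If you know that the indices have no holes, or only a few, it may be worth removing the part where you define idx1s and idx2s and changing the for loop to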

max1, max2 = data[:,[IDX1, IDX2]].max(axis=0)
# + 1 so that the largest index is included; int() in case data is a float array
for i1, i2 in it.product(xrange(int(max1) + 1), xrange(int(max2) + 1)):
    [...]

Both xrange and it.product are evaluated lazily, so i1 and i2 are created only when you need them.

ps: if you are on python3.x, use range instead of xrange
