python - 使用组合或其他技巧来迭代 3 个不同的数组？

Question

考虑我的代码

a,b,c = np.loadtxt ('test.dat', dtype='double', unpack=True)

a、b 和 c 是相同的数组长度。

for i in range(len(a)):

   q[i] = 3*10**5*c[i]/100
   x[i] = q[i]*math.sin(a)*math.cos(b)
   y[i] = q[i]*math.sin(a)*math.sin(b)
   z[i] = q[i]*math.cos(a)

我试图找到 x,y,z 中 2 个点之间差异的所有组合来迭代这个方程 (xi-xj)+(yi-yj)+(zi-zj) = r

我用这个组合码

for combinations in it.combinations(x,2):
   xdist =  (combinations[0] - combinations[1])
for combinations in it.combinations(y,2):
   ydist =  (combinations[0] - combinations[1])
for combinations in it.combinations(z,2):
   zdist =  (combinations[0] - combinations[1])

r = (xdist + ydist +zdist)

对于我拥有的大文件，python 需要很长时间，我想知道是否有更快的方法来获取我的数组 for r，最好使用嵌套循环？

如

if i in range(?):
     if j in range(?):

score 3 · Accepted Answer

既然你显然在使用 numpy，那么让我们实际使用 numpy；它会快得多。如果您在使用 numpy 时完全避免 python 循环，并且使用它的向量化数组操作，它几乎总是更快并且通常更容易阅读。

a, b, c = np.loadtxt('test.dat', dtype='double', unpack=True)

q = 3e5 * c / 100  # why not just 3e3 * c?
x = q * np.sin(a) * np.cos(b)
y = q * np.sin(a) * np.sin(b)
z = q * np.cos(a)

现在，您的示例代码在此之后并没有做您可能希望它做的事情 - 请注意您xdist = ...每次都是怎么说的？您正在覆盖该变量，而不是对它做任何事情。不过，我将假设您想要每对点之间的平方欧几里得距离，并制作一个等于第 th 点和dists第th 点之间距离的矩阵。dists[i, j]ij

简单的方法，如果你有 scipy 可用：

# stack the points into a num_pts x 3 matrix
pts = np.hstack([thing.reshape((-1, 1)) for thing in (x, y, z)])

# get squared euclidean distances in a matrix
dists = scipy.spatial.squareform(scipy.spatial.pdist(pts, 'sqeuclidean'))

如果您的列表很大，那么不使用 squareform 会更节省内存，但是它是一种压缩格式，很难找到特定的距离对。

稍微难一点，如果你不能/不想使用 scipy：

pts = np.hstack([thing.reshape((-1, 1)) for thing in (x, y, z)])
sqnorms = np.sum(pts ** 2, axis=1)
dists = sqnorms.reshape((-1, 1)) - 2 * np.dot(pts, pts.T) + sqnorms

它基本上实现了公式 (a - b)^2 = a^2 - 2 ab + b^2，但都类似于矢量。

score 0 · Accepted Answer

好吧，您的计算的复杂性非常高。此外，如果要将所有r值存储在单个列表中，则需要大量内存。通常，您不需要列表，而生成器可能足以满足您想要对值执行的操作。

考虑这段代码：

def calculate(x, y, z):
    for xi, xj in combinations(x, 2):
        for yi, yj in combinations(y, 2):
            for zi, zj in combinations(z, 2):
                yield (xi - xj) + (yi - yj) + (zi - zj)

这将返回一个生成器，每次调用生成器的next()方法时该生成器只计算一个值。

gen = calculate(xrange(10), xrange(10, 20), xrange(20, 30))
gen.next() # returns -3
gen.next() # returns -4 and so on

score 0 · Accepted Answer

抱歉没有发布完整的解决方案，但您应该避免嵌套调用 range()，因为每次调用它都会创建一个新元组。您最好调用 range() 一次并存储结果，或者改用循环计数器。

例如，而不是：

max = 50

for number in range (0, 50):

    doSomething(number)

...你会这样做：

max = 50
current = 0

while current < max:

    doSomething(number)
    current += 1

python - 使用组合或其他技巧来迭代 3 个不同的数组？

3 回答 3

Related

Reference