python - x 和 y 数组点的笛卡尔积到二维点的单个数组中

Question

我有两个 numpy 数组，它们定义了网格的 x 轴和 y 轴。例如：

x = numpy.array([1,2,3])
y = numpy.array([4,5])

我想生成这些数组的笛卡尔积来生成：

array([[1,4],[2,4],[3,4],[1,5],[2,5],[3,5]])

在某种程度上，这并不是非常低效，因为我需要在循环中多次执行此操作。我假设将它们转换为 Python 列表并使用itertools.product并返回到 numpy 数组并不是最有效的形式。

score 179 · Accepted Answer

规范的`cartesian_product`（几乎）

有许多方法可以解决这个具有不同属性的问题。有些比其他更快，有些更通用。经过大量测试和调整，我发现以下计算 n 维的函数cartesian_product对于许多输入来说比大多数其他函数更快。对于稍微复杂一些但在许多情况下甚至更快的方法，请参阅Paul Panzer的答案。

鉴于这个答案，这不再是我所知道的笛卡尔积的最快实现。numpy但是，我认为它的简单性将继续使其成为未来改进的有用基准：

def cartesian_product(*arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

值得一提的是，这个功能的使用ix_方式不同寻常；虽然文档中的用途ix_是为数组生成索引，但碰巧具有相同形状的数组可用于广播分配。非常感谢mgilson，他激励我尝试使用这种方式，以及unutbu，他提供了一些关于这个答案的非常有用的反馈，包括使用的建议。ix_numpy.result_type

值得注意的替代品

有时以 Fortran 顺序写入连续的内存块会更快。这就是这种替代方案的基础，cartesian_product_transpose在某些硬件上证明它比cartesian_product（见下文）更快。但是，使用相同原理的 Paul Panzer 的答案更快。不过，我在这里为感兴趣的读者提供了这个：

def cartesian_product_transpose(*arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

在了解了 Panzer 的方法之后，我编写了一个几乎和他一样快的新版本，并且几乎就像这样简单cartesian_product：

def cartesian_product_simple_transpose(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([la] + [len(a) for a in arrays], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T

这似乎有一些恒定时间开销，使其在小输入时比 Panzer 运行得慢。但是对于更大的输入，在我运行的所有测试中，它的性能和他最快的实现一样好（cartesian_product_transpose_pp）。

在接下来的部分中，我包括了对其他替代方案的一些测试。这些现在有些过时了，但出于历史兴趣，我决定将它们留在这里，而不是重复努力。有关最新测试，请参阅 Panzer 的答案以及Nico Schlömer的。

针对替代品的测试

以下是一组测试，它们显示了其中一些功能相对于许多替代方案提供的性能提升。numpy此处显示的所有测试均在运行 Mac OS 10.12.5、Python 3.6.1 和1.12.1的四核机器上执行。众所周知，硬件和软件的变化会产生不同的结果，所以 YMMV。为自己运行这些测试以确保！

定义：

import numpy
import itertools
from functools import reduce

### Two-dimensional products ###

def repeat_product(x, y):
    return numpy.transpose([numpy.tile(x, len(y)), 
                            numpy.repeat(y, len(x))])

def dstack_product(x, y):
    return numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)

### Generalized N-dimensional products ###

def cartesian_product(*arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

def cartesian_product_transpose(*arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

# from https://stackoverflow.com/a/1235363/577088

def cartesian_product_recursive(*arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:,0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m,1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m,1:] = out[0:m,1:]
    return out

def cartesian_product_itertools(*arrays):
    return numpy.array(list(itertools.product(*arrays)))

### Test code ###

name_func = [('repeat_product',                                                 
              repeat_product),                                                  
             ('dstack_product',                                                 
              dstack_product),                                                  
             ('cartesian_product',                                              
              cartesian_product),                                               
             ('cartesian_product_transpose',                                    
              cartesian_product_transpose),                                     
             ('cartesian_product_recursive',                           
              cartesian_product_recursive),                            
             ('cartesian_product_itertools',                                    
              cartesian_product_itertools)]

def test(in_arrays, test_funcs):
    global func
    global arrays
    arrays = in_arrays
    for name, func in test_funcs:
        print('{}:'.format(name))
        %timeit func(*arrays)

def test_all(*in_arrays):
    test(in_arrays, name_func)

# `cartesian_product_recursive` throws an 
# unexpected error when used on more than
# two input arrays, so for now I've removed
# it from these tests.

def test_cartesian(*in_arrays):
    test(in_arrays, name_func[2:4] + name_func[-1:])

x10 = [numpy.arange(10)]
x50 = [numpy.arange(50)]
x100 = [numpy.arange(100)]
x500 = [numpy.arange(500)]
x1000 = [numpy.arange(1000)]

测试结果：

In [2]: test_all(*(x100 * 2))
repeat_product:
67.5 µs ± 633 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
dstack_product:
67.7 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product:
33.4 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product_transpose:
67.7 µs ± 932 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
cartesian_product_recursive:
215 µs ± 6.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_itertools:
3.65 ms ± 38.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: test_all(*(x500 * 2))
repeat_product:
1.31 ms ± 9.28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
dstack_product:
1.27 ms ± 7.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product:
375 µs ± 4.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_transpose:
488 µs ± 8.88 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cartesian_product_recursive:
2.21 ms ± 38.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
105 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: test_all(*(x1000 * 2))
repeat_product:
10.2 ms ± 132 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
dstack_product:
12 ms ± 120 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product:
4.75 ms ± 57.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_transpose:
7.76 ms ± 52.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_recursive:
13 ms ± 209 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
422 ms ± 7.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

在所有情况下，cartesian_product如本答案开头所定义的那样是最快的。

对于那些接受任意数量的输入数组的函数，也值得检查性能len(arrays) > 2。（直到我可以确定为什么cartesian_product_recursive在这种情况下会引发错误，我已将其从这些测试中删除。）

In [5]: test_cartesian(*(x100 * 3))
cartesian_product:
8.8 ms ± 138 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_transpose:
7.87 ms ± 91.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
518 ms ± 5.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: test_cartesian(*(x50 * 4))
cartesian_product:
169 ms ± 5.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_transpose:
184 ms ± 4.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_itertools:
3.69 s ± 73.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: test_cartesian(*(x10 * 6))
cartesian_product:
26.5 ms ± 449 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
cartesian_product_transpose:
16 ms ± 133 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
cartesian_product_itertools:
728 ms ± 16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: test_cartesian(*(x10 * 7))
cartesian_product:
650 ms ± 8.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
cartesian_product_transpose:
518 ms ± 7.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
cartesian_product_itertools:
8.13 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

正如这些测试所示，cartesian_product在输入数组的数量增加到（大约）四个以上之前，它仍然具有竞争力。在那之后，cartesian_product_transpose确实有轻微的优势。

值得重申的是，使用其他硬件和操作系统的用户可能会看到不同的结果。例如，unutbu 报告使用 Ubuntu 14.04、Python 3.4.3 和numpy1.14.0.dev0+b7050a9 进行这些测试的结果如下：

>>> %timeit cartesian_product_transpose(x500, y500) 
1000 loops, best of 3: 682 µs per loop
>>> %timeit cartesian_product(x500, y500)
1000 loops, best of 3: 1.55 ms per loop

下面，我将详细介绍我在这些方面运行的早期测试。对于不同的硬件和不同版本的 Python 和numpy. 虽然它对使用最新版本的人没有立即有用numpy，但它说明了自此答案的第一个版本以来情况发生了怎样的变化。

一个简单的选择：`meshgrid`+`dstack`

当前接受的答案使用tileandrepeat一起广播两个数组。但该meshgrid功能实际上做同样的事情。这是传递给转置之前的tile输出repeat：

In [1]: import numpy
In [2]: x = numpy.array([1,2,3])
   ...: y = numpy.array([4,5])
   ...: 

In [3]: [numpy.tile(x, len(y)), numpy.repeat(y, len(x))]
Out[3]: [array([1, 2, 3, 1, 2, 3]), array([4, 4, 4, 5, 5, 5])]

这是输出meshgrid：

In [4]: numpy.meshgrid(x, y)
Out[4]: 
[array([[1, 2, 3],
        [1, 2, 3]]), array([[4, 4, 4],
        [5, 5, 5]])]

如您所见，它几乎完全相同。我们只需要重塑结果即可获得完全相同的结果。

In [5]: xt, xr = numpy.meshgrid(x, y)
   ...: [xt.ravel(), xr.ravel()]
Out[5]: [array([1, 2, 3, 1, 2, 3]), array([4, 4, 4, 5, 5, 5])]

meshgrid不过，我们可以将 to 的输出传递给之后再进行整形，而不是此时进行整形，这样可以dstack节省一些工作：

In [6]: numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)
Out[6]: 
array([[1, 4],
       [2, 4],
       [3, 4],
       [1, 5],
       [2, 5],
       [3, 5]])

与此评论中的说法相反，我没有看到任何证据表明不同的输入会产生不同形状的输出，并且如上所示，它们做的事情非常相似，所以如果他们这样做会很奇怪。如果您找到反例，请告诉我。

测试`meshgrid`+`dstack`与`repeat`+`transpose`

随着时间的推移，这两种方法的相对性能发生了变化。在早期版本的 Python (2.7) 中，使用meshgrid+的结果dstack对于小输入明显更快。（请注意，这些测试来自此答案的旧版本。）定义：

>>> def repeat_product(x, y):
...     return numpy.transpose([numpy.tile(x, len(y)), 
                                numpy.repeat(y, len(x))])
...
>>> def dstack_product(x, y):
...     return numpy.dstack(numpy.meshgrid(x, y)).reshape(-1, 2)
...

对于中等大小的输入，我看到了显着的加速。但是我在较新的机器上使用更新版本的 Python (3.6.1) 和numpy(1.12.1) 重试了这些测试。这两种方法现在几乎相同。

旧测试

>>> x, y = numpy.arange(500), numpy.arange(500)
>>> %timeit repeat_product(x, y)
10 loops, best of 3: 62 ms per loop
>>> %timeit dstack_product(x, y)
100 loops, best of 3: 12.2 ms per loop

新测试

In [7]: x, y = numpy.arange(500), numpy.arange(500)
In [8]: %timeit repeat_product(x, y)
1.32 ms ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [9]: %timeit dstack_product(x, y)
1.26 ms ± 8.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

与往常一样，YMMV，但这表明在 Python 和 numpy 的最新版本中，它们是可以互换的。

广义产品功能

一般来说，我们可能期望使用内置函数对于小输入会更快，而对于大输入，专用函数可能会更快。此外，对于广义的 n 维产品，tile也repeat无济于事，因为它们没有明确的高维类似物。因此，也值得研究专用函数的行为。

大多数相关测试出现在此答案的开头，但这里有一些在早期版本的 Python 上执行的测试并numpy用于比较。

另一个答案中定义的cartesian函数曾经在较大的输入中表现得很好。（与上面调用的函数相同。）为了比较，我们只使用两个维度。cartesian_product_recursivecartesiandstack_prodct

同样，旧测试显示出显着差异，而新测试几乎没有。

旧测试

>>> x, y = numpy.arange(1000), numpy.arange(1000)
>>> %timeit cartesian([x, y])
10 loops, best of 3: 25.4 ms per loop
>>> %timeit dstack_product(x, y)
10 loops, best of 3: 66.6 ms per loop

新测试

In [10]: x, y = numpy.arange(1000), numpy.arange(1000)
In [11]: %timeit cartesian([x, y])
12.1 ms ± 199 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [12]: %timeit dstack_product(x, y)
12.7 ms ± 334 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

和以前一样，dstack_product仍然以较小的比例跳动cartesian。

新测试（未显示多余的旧测试）

In [13]: x, y = numpy.arange(100), numpy.arange(100)
In [14]: %timeit cartesian([x, y])
215 µs ± 4.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [15]: %timeit dstack_product(x, y)
65.7 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

我认为这些区别很有趣，值得记录。但他们最终是学术的。正如这个答案开头的测试所显示的那样，所有这些版本几乎总是比cartesian_product这个答案开头定义的要慢 - 这本身比这个问题的答案中最快的实现要慢一些。

score 110 · Accepted Answer

>>> numpy.transpose([numpy.tile(x, len(y)), numpy.repeat(y, len(x))])
array([[1, 4],
       [2, 4],
       [3, 4],
       [1, 5],
       [2, 5],
       [3, 5]])

有关计算 N 个数组的笛卡尔积的通用解决方案，请参阅使用 numpy 构建两个数组的所有组合的数组。

score 60 · Accepted Answer

你可以在 python 中做普通的列表理解

x = numpy.array([1,2,3])
y = numpy.array([4,5])
[[x0, y0] for x0 in x for y0 in y]

这应该给你

[[1, 4], [1, 5], [2, 4], [2, 5], [3, 4], [3, 5]]

score 40 · Accepted Answer

我对此也很感兴趣，并做了一些性能比较，可能比@senderle 的答案更清楚。

对于两个数组（经典案例）：

对于四个数组：

（请注意，这里的数组长度只有几十个条目。）

重现绘图的代码：

from functools import reduce
import itertools
import numpy
import perfplot


def dstack_product(arrays):
    return numpy.dstack(numpy.meshgrid(*arrays, indexing="ij")).reshape(-1, len(arrays))


# Generalized N-dimensional products
def cartesian_product(arrays):
    la = len(arrays)
    dtype = numpy.find_common_type([a.dtype for a in arrays], [])
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[..., i] = a
    return arr.reshape(-1, la)


def cartesian_product_transpose(arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = reduce(numpy.multiply, broadcasted[0].shape), len(broadcasted)
    dtype = numpy.find_common_type([a.dtype for a in arrays], [])

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T


# from https://stackoverflow.com/a/1235363/577088
def cartesian_product_recursive(arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:, 0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j * m : (j + 1) * m, 1:] = out[0:m, 1:]
    return out


def cartesian_product_itertools(arrays):
    return numpy.array(list(itertools.product(*arrays)))


perfplot.show(
    setup=lambda n: 2 * (numpy.arange(n, dtype=float),),
    n_range=[2 ** k for k in range(13)],
    # setup=lambda n: 4 * (numpy.arange(n, dtype=float),),
    # n_range=[2 ** k for k in range(6)],
    kernels=[
        dstack_product,
        cartesian_product,
        cartesian_product_transpose,
        cartesian_product_recursive,
        cartesian_product_itertools,
    ],
    logx=True,
    logy=True,
    xlabel="len(a), len(b)",
    equality_check=None,
)

score 20 · Accepted Answer

截至 2017 年 10 月，numpy 现在有一个np.stack接受轴参数的通用函数。使用它，我们可以使用“dstack and meshgrid”技术获得“广义笛卡尔积”：

import numpy as np

def cartesian_product(*arrays):
    ndim = len(arrays)
    return (np.stack(np.meshgrid(*arrays), axis=-1)
              .reshape(-1, ndim))

a = np.array([1,2])
b = np.array([10,20])
cartesian_product(a,b)

# output:
# array([[ 1, 10],
#        [ 2, 10],
#        [ 1, 20],
#        [ 2, 20]])

注意axis=-1参数。这是结果中的最后一个（最里面的）轴。它相当于使用axis=ndim.

另一种评论，由于笛卡尔积很快爆发，除非我们出于某种原因需要在内存中实现数组，否则如果产品非常大，我们可能希望itertools即时使用和使用这些值。

score 20 · Accepted Answer

在@senderle 的示范性基础工作的基础上，我提出了两个版本——一个用于 C，一个用于 Fortran 布局——通常更快一些。

cartesian_product_transpose_pp是 - 与 @senderlecartesian_product_transpose完全不同的策略不同 - 一个版本cartesion_product使用更有利的转置内存布局 + 一些非常小的优化。
cartesian_product_pp坚持原来的内存布局。让它快速的原因是它使用连续复制。结果证明，连续副本的速度要快得多，即使只复制一个完整的内存块，即使其中只有一部分包含有效数据，也比只复制有效位更可取。

一些性能图。我为 C 和 Fortran 布局制作了单独的布局，因为这些是 IMO 不同的任务。

以“pp”结尾的名字是我的方法。

1）许多微小的因素（每个2个元素）

2）许多小因素（每个4个元素）

3）等长的三个因子

4) 两个等长的因子

代码（需要为每个绘图 b/c 单独运行我不知道如何重置；还需要适当地编辑/注释/注释）：

import numpy
import numpy as np
from functools import reduce
import itertools
import timeit
import perfplot

def dstack_product(arrays):
    return numpy.dstack(
        numpy.meshgrid(*arrays, indexing='ij')
        ).reshape(-1, len(arrays))

def cartesian_product_transpose_pp(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty((la, *map(len, arrays)), dtype=dtype)
    idx = slice(None), *itertools.repeat(None, la)
    for i, a in enumerate(arrays):
        arr[i, ...] = a[idx[:la-i]]
    return arr.reshape(la, -1).T

def cartesian_product(arrays):
    la = len(arrays)
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(numpy.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

def cartesian_product_transpose(arrays):
    broadcastable = numpy.ix_(*arrays)
    broadcasted = numpy.broadcast_arrays(*broadcastable)
    rows, cols = numpy.prod(broadcasted[0].shape), len(broadcasted)
    dtype = numpy.result_type(*arrays)

    out = numpy.empty(rows * cols, dtype=dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(cols, rows).T

from itertools import accumulate, repeat, chain

def cartesian_product_pp(arrays, out=None):
    la = len(arrays)
    L = *map(len, arrays), la
    dtype = numpy.result_type(*arrays)
    arr = numpy.empty(L, dtype=dtype)
    arrs = *accumulate(chain((arr,), repeat(0, la-1)), np.ndarray.__getitem__),
    idx = slice(None), *itertools.repeat(None, la-1)
    for i in range(la-1, 0, -1):
        arrs[i][..., i] = arrays[i][idx[:la-i]]
        arrs[i-1][1:] = arrs[i]
    arr[..., 0] = arrays[0][idx]
    return arr.reshape(-1, la)

def cartesian_product_itertools(arrays):
    return numpy.array(list(itertools.product(*arrays)))


# from https://stackoverflow.com/a/1235363/577088
def cartesian_product_recursive(arrays, out=None):
    arrays = [numpy.asarray(x) for x in arrays]
    dtype = arrays[0].dtype

    n = numpy.prod([x.size for x in arrays])
    if out is None:
        out = numpy.zeros([n, len(arrays)], dtype=dtype)

    m = n // arrays[0].size
    out[:, 0] = numpy.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian_product_recursive(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
    return out

### Test code ###
if False:
  perfplot.save('cp_4el_high.png',
    setup=lambda n: n*(numpy.arange(4, dtype=float),),
                n_range=list(range(6, 11)),
    kernels=[
        dstack_product,
        cartesian_product_recursive,
        cartesian_product,
#        cartesian_product_transpose,
        cartesian_product_pp,
#        cartesian_product_transpose_pp,
        ],
    logx=False,
    logy=True,
    xlabel='#factors',
    equality_check=None
    )
else:
  perfplot.save('cp_2f_T.png',
    setup=lambda n: 2*(numpy.arange(n, dtype=float),),
    n_range=[2**k for k in range(5, 11)],
    kernels=[
#        dstack_product,
#        cartesian_product_recursive,
#        cartesian_product,
        cartesian_product_transpose,
#        cartesian_product_pp,
        cartesian_product_transpose_pp,
        ],
    logx=True,
    logy=True,
    xlabel='length of each factor',
    equality_check=None
    )

score 8 · Accepted Answer

我使用@kennytm回答了一段时间，但是当尝试在 TensorFlow 中做同样的事情时，我发现 TensorFlow 没有等效的numpy.repeat(). 经过一些实验，我想我找到了一个更通用的任意点向量的解决方案。

对于 numpy：

import numpy as np

def cartesian_product(*args: np.ndarray) -> np.ndarray:
    """
    Produce the cartesian product of arbitrary length vectors.

    Parameters
    ----------
    np.ndarray args
        vector of points of interest in each dimension

    Returns
    -------
    np.ndarray
        the cartesian product of size [m x n] wherein:
            m = prod([len(a) for a in args])
            n = len(args)
    """
    for i, a in enumerate(args):
        assert a.ndim == 1, "arg {:d} is not rank 1".format(i)
    return np.concatenate([np.reshape(xi, [-1, 1]) for xi in np.meshgrid(*args)], axis=1)

对于 TensorFlow：

import tensorflow as tf

def cartesian_product(*args: tf.Tensor) -> tf.Tensor:
    """
    Produce the cartesian product of arbitrary length vectors.

    Parameters
    ----------
    tf.Tensor args
        vector of points of interest in each dimension

    Returns
    -------
    tf.Tensor
        the cartesian product of size [m x n] wherein:
            m = prod([len(a) for a in args])
            n = len(args)
    """
    for i, a in enumerate(args):
        tf.assert_rank(a, 1, message="arg {:d} is not rank 1".format(i))
    return tf.concat([tf.reshape(xi, [-1, 1]) for xi in tf.meshgrid(*args)], axis=1)

score 8 · Accepted Answer

Scikit -learn包可以快速实现这一点：

from sklearn.utils.extmath import cartesian
product = cartesian((x,y))

请注意，如果您关心输出的顺序，则此实现的约定与您想要的不同。对于您的确切订购，您可以

product = cartesian((y,x))[:, ::-1]

score 4 · Accepted Answer

更一般地说，如果你有两个 2d numpy 数组 a 和 b，并且你想将 a 的每一行连接到 b 的每一行（行的笛卡尔积，有点像数据库中的连接），你可以使用这个方法：

import numpy
def join_2d(a, b):
    assert a.dtype == b.dtype
    a_part = numpy.tile(a, (len(b), 1))
    b_part = numpy.repeat(b, len(a), axis=0)
    return numpy.hstack((a_part, b_part))

score 3 · Accepted Answer

最快的方法是将生成器表达式与 map 函数结合使用：

import numpy
import datetime
a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = (item for sublist in [list(map(lambda x: (x,i),a)) for i in b] for item in sublist)

print (list(foo))

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

输出（实际上打印了整个结果列表）：

[(0, 0), (1, 0), ...,(998, 199), (999, 199)]
execution time: 1.253567 s

或使用双生成器表达式：

a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = ((x,y) for x in a for y in b)

print (list(foo))

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

输出（打印整个列表）：

[(0, 0), (1, 0), ...,(998, 199), (999, 199)]
execution time: 1.187415 s

考虑到大部分计算时间都用于打印命令。生成器计算在其他方面非常有效。不打印计算时间为：

execution time: 0.079208 s

对于生成器表达式 + 映射函数和：

execution time: 0.007093 s

对于双生成器表达式。

如果您真正想要的是计算每个坐标对的实际乘积，最快的方法是将其求解为 numpy 矩阵乘积：

a = np.arange(1000)
b = np.arange(200)

start = datetime.datetime.now()

foo = np.dot(np.asmatrix([[i,0] for i in a]), np.asmatrix([[i,0] for i in b]).T)

print (foo)

print ('execution time: {} s'.format((datetime.datetime.now() - start).total_seconds()))

输出：

 [[     0      0      0 ...,      0      0      0]
 [     0      1      2 ...,    197    198    199]
 [     0      2      4 ...,    394    396    398]
 ..., 
 [     0    997   1994 ..., 196409 197406 198403]
 [     0    998   1996 ..., 196606 197604 198602]
 [     0    999   1998 ..., 196803 197802 198801]]
execution time: 0.003869 s

并且没有打印（在这种情况下，它并没有节省太多，因为实际上只打印了一小块矩阵）：

execution time: 0.003083 s

score 2 · Accepted Answer

这也可以通过使用 itertools.product 方法轻松完成

from itertools import product
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5])
cart_prod = np.array(list(product(*[x, y])),dtype='int32')

结果：array([
[1, 4],
[1, 5],
[2, 4],
[2, 5],
[3, 4],
[3, 5]], dtype=int32)

执行时间：0.000155 s

score 1 · Accepted Answer

在需要对每一对执行简单操作（例如加法）的特定情况下，您可以引入一个额外的维度并让广播来完成这项工作：

>>> a, b = np.array([1,2,3]), np.array([10,20,30])
>>> a[None,:] + b[:,None]
array([[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])

我不确定是否有任何类似的方法可以实际获得这些对。

score 1 · Accepted Answer

我参加聚会有点晚了，但我遇到了这个问题的一个棘手的变体。假设我想要几个数组的笛卡尔积，但笛卡尔积最终比计算机的内存大得多（但是，使用该积完成的计算速度很快，或者至少是可并行的）。

显而易见的解决方案是将这个笛卡尔积分成块，并一个接一个地处理这些块（以一种“流式”方式）。您可以使用轻松做到这一点itertools.product，但速度非常慢。此外，这里提出的解决方案（尽可能快）都没有给我们这种可能性。我提出的解决方案使用 Numba，并且比 cartesian_product 这里提到的“规范”稍快。这很长，因为我试图在任何地方优化它。

import numba as nb
import numpy as np
from typing import List


@nb.njit(nb.types.Tuple((nb.int32[:, :],
                         nb.int32[:]))(nb.int32[:],
                                       nb.int32[:],
                                       nb.int64, nb.int64))
def cproduct(sizes: np.ndarray, current_tuple: np.ndarray, start_idx: int, end_idx: int):
    """Generates ids tuples from start_id to end_id"""
    assert len(sizes) >= 2
    assert start_idx < end_idx

    tuples = np.zeros((end_idx - start_idx, len(sizes)), dtype=np.int32)
    tuple_idx = 0
    # stores the current combination
    current_tuple = current_tuple.copy()
    while tuple_idx < end_idx - start_idx:
        tuples[tuple_idx] = current_tuple
        current_tuple[0] += 1
        # using a condition here instead of including this in the inner loop
        # to gain a bit of speed: this is going to be tested each iteration,
        # and starting a loop to have it end right away is a bit silly
        if current_tuple[0] == sizes[0]:
            # the reset to 0 and subsequent increment amount to carrying
            # the number to the higher "power"
            current_tuple[0] = 0
            current_tuple[1] += 1
            for i in range(1, len(sizes) - 1):
                if current_tuple[i] == sizes[i]:
                    # same as before, but in a loop, since this is going
                    # to get called less often
                    current_tuple[i + 1] += 1
                    current_tuple[i] = 0
                else:
                    break
        tuple_idx += 1
    return tuples, current_tuple


def chunked_cartesian_product_ids(sizes: List[int], chunk_size: int):
    """Just generates chunks of the cartesian product of the ids of each
    input arrays (thus, we just need their sizes here, not the actual arrays)"""
    prod = np.prod(sizes)

    # putting the largest number at the front to more efficiently make use
    # of the cproduct numba function
    sizes = np.array(sizes, dtype=np.int32)
    sorted_idx = np.argsort(sizes)[::-1]
    sizes = sizes[sorted_idx]
    if chunk_size > prod:
        chunk_bounds = (np.array([0, prod])).astype(np.int64)
    else:
        num_chunks = np.maximum(np.ceil(prod / chunk_size), 2).astype(np.int32)
        chunk_bounds = (np.arange(num_chunks + 1) * chunk_size).astype(np.int64)
        chunk_bounds[-1] = prod
    current_tuple = np.zeros(len(sizes), dtype=np.int32)
    for start_idx, end_idx in zip(chunk_bounds[:-1], chunk_bounds[1:]):
        tuples, current_tuple = cproduct(sizes, current_tuple, start_idx, end_idx)
        # re-arrange columns to match the original order of the sizes list
        # before yielding
        yield tuples[:, np.argsort(sorted_idx)]


def chunked_cartesian_product(*arrays, chunk_size=2 ** 25):
    """Returns chunks of the full cartesian product, with arrays of shape
    (chunk_size, n_arrays). The last chunk will obviously have the size of the
    remainder"""
    array_lengths = [len(array) for array in arrays]
    for array_ids_chunk in chunked_cartesian_product_ids(array_lengths, chunk_size):
        slices_lists = [arrays[i][array_ids_chunk[:, i]] for i in range(len(arrays))]
        yield np.vstack(slices_lists).swapaxes(0,1)


def cartesian_product(*arrays):
    """Actual cartesian product, not chunked, still fast"""
    total_prod = np.prod([len(array) for array in arrays])
    return next(chunked_cartesian_product(*arrays, total_prod))


a = np.arange(0, 3)
b = np.arange(8, 10)
c = np.arange(13, 16)
for cartesian_tuples in chunked_cartesian_product(*[a, b, c], chunk_size=5):
    print(cartesian_tuples)

这将以 5 个 3-uples 的块输出我们的笛卡尔积：

[[ 0  8 13]
 [ 0  8 14]
 [ 0  8 15]
 [ 1  8 13]
 [ 1  8 14]]
[[ 1  8 15]
 [ 2  8 13]
 [ 2  8 14]
 [ 2  8 15]
 [ 0  9 13]]
[[ 0  9 14]
 [ 0  9 15]
 [ 1  9 13]
 [ 1  9 14]
 [ 1  9 15]]
[[ 2  9 13]
 [ 2  9 14]
 [ 2  9 15]]

如果您愿意理解这里正在做什么，该njitted函数背后的直觉是在一个奇怪的数字基础中枚举每个“数字”，其元素将由输入数组的大小组成（而不是常规中的相同数字）二进制、十进制或十六进制基数）。

显然，这种解决方案对于大型产品来说很有趣。对于小型的，开销可能有点贵。

注意：由于 numba 仍在大量开发中，我正在使用 numba 0.50 和 python 3.6 来运行它。

score 0 · Accepted Answer

还有一个：

>>>x1, y1 = np.meshgrid(x, y)
>>>np.c_[x1.ravel(), y1.ravel()]
array([[1, 4],
       [2, 4],
       [3, 4],
       [1, 5],
       [2, 5],
       [3, 5]])

score 0 · Accepted Answer

受阿什坎回答的启发，您也可以尝试以下方法。

>>> x, y = np.meshgrid(x, y)
>>> np.concatenate([x.flatten().reshape(-1,1), y.flatten().reshape(-1,1)], axis=1)

这将为您提供所需的笛卡尔积！

python - x 和 y 数组点的笛卡尔积到二维点的单个数组中

15 回答 15

规范的cartesian_product（几乎）

值得注意的替代品

针对替代品的测试

一个简单的选择：meshgrid+dstack

测试meshgrid+dstack与repeat+transpose

广义产品功能

Related

Reference

规范的`cartesian_product`（几乎）

一个简单的选择：`meshgrid`+`dstack`

测试`meshgrid`+`dstack`与`repeat`+`transpose`