python - Transpose/Unzip function (inverse of zip)?

Question

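I have a list of 2-item tuples and I'd like to convert them to 2 lists, where the first contains the first item of each tuple and the second contains the second item.

For example: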

original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# and I want to become...
result = (['a', 'b', 'c', 'd'], [1, 2, 3, 4])

Is there a built-in function that does this?

score 856 · Accepted Answer

zip is its own inverse, provided you use the special * operator.

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

This works by calling zip with each tuple as a separate argument:

zip(('a', 1), ('b', 2), ('c', 3), ('d', 4))

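...except that the arguments are passed to zip directly (after being converted to a tuple), so there is no need to worry about the number of arguments getting too large.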
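To make the unpacking concrete, here is a minimal sketch for Python 3, where zip returns an iterator and has to be wrapped in list() before the result can be displayed. The variable names (pairs, unpacked, explicit, round_trip) are illustrative and not part of the original answer.

```python
pairs = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# The * operator spreads the list into separate positional arguments ...
unpacked = list(zip(*pairs))
# ... so the call above is equivalent to spelling each argument out.
explicit = list(zip(('a', 1), ('b', 2), ('c', 3), ('d', 4)))

# Zipping the columns back together restores the original pairs.
round_trip = list(zip(*unpacked))
```

Applying zip(*...) twice returns the original pairs, which is the sense in which zip is its own inverse.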

score 30 · Accepted Answer

You could also do

result = ([ a for a,b in original ], [ b for a,b in original ])

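It should scale better, especially if Python is good at not expanding the list comprehensions unless needed.

(Incidentally, it makes a 2-tuple (pair) of lists, rather than a list of tuples as zip does.)

If generators instead of actual lists are acceptable, this would do it: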

result = (( a for a,b in original ), ( b for a,b in original ))

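The generators don't walk through the list until you ask for each element, but on the other hand, they do keep references to the original list.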
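A small sketch of what holding a reference to the original list means in practice. The names firsts, seconds, first_list and second_list are made up for this illustration; the behaviour shown is standard Python 3.

```python
original = [('a', 1), ('b', 2)]

# Each generator expression grabs an iterator over `original` right away,
# but produces no values until it is consumed.
firsts = (a for a, b in original)
seconds = (b for a, b in original)

# Mutating the list before consumption is visible to the generators.
original.append(('c', 3))

first_list = list(firsts)    # includes 'c'
second_list = list(seconds)  # includes 3
```

If the source list may change before the generators are consumed, building real lists up front avoids this kind of surprise.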

score 22 · Accepted Answer

I like to use zip(*iterable) (which is the piece of code you're looking for) in my programs like so:

def unzip(iterable):
    return zip(*iterable)

I find unzip more readable.

score 21 · Accepted Answer

If your tuples are not all the same length, you may not want to use zip as in Patrick's answer. This works:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

But with inputs of different lengths, zip truncates the result to the length of the shortest one:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e')]

You can use map with no function (Python 2 only) to fill the missing results with None:

>>> map(None, *[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]

zip() is marginally faster, though.
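The map(None, ...) form does not work in Python 3. A rough equivalent there is itertools.zip_longest from the standard library; the sketch below uses illustrative variable names (rows, truncated, padded, custom).

```python
from itertools import zip_longest

rows = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e',)]

# Plain zip stops at the shortest input, so the second column is lost.
truncated = list(zip(*rows))

# zip_longest pads the missing positions, with None by default ...
padded = list(zip_longest(*rows))
# ... or with a value of your choice.
custom = list(zip_longest(*rows, fillvalue=0))
```

Whether padding is appropriate depends on the data: a fill value that can also occur as a real value makes padded and genuine entries indistinguishable.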

score 15 · Accepted Answer

>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple([list(tup) for tup in zip(*original)])
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])

This gives a tuple of lists, as in the question.

list1, list2 = [list(tup) for tup in zip(*original)]

This unpacks the result into two separate lists.

score 8 · Accepted Answer

Answered 2018-12-21T12:46:32.510

score 4 · Accepted Answer

This is just another way to do it, but it helped me a lot, so I'm writing it here:

Given this data structure:

X=[1,2,3,4]
Y=['a','b','c','d']
XY=zip(X,Y)

Resulting in:

In: XY
Out: [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]

In my opinion, the more Pythonic way to unzip it and get back the originals is:

x,y=zip(*XY)

But this returns tuples, so if you need lists you can use:

x,y=(list(x),list(y))
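One caveat with unpacking into two names: when the zipped input is empty, zip(*...) yields nothing and the assignment raises ValueError. The helper below, unzip_pairs, is a hypothetical name used only for this sketch, and it assumes the input really consists of pairs.

```python
def unzip_pairs(pairs):
    """Split an iterable of 2-tuples into two lists (hypothetical helper)."""
    pairs = list(pairs)          # also accepts one-shot iterators such as zip objects
    if not pairs:
        return [], []            # zip(*[]) would leave nothing to unpack
    x, y = zip(*pairs)
    return list(x), list(y)

x, y = unzip_pairs(zip([1, 2, 3, 4], ['a', 'b', 'c', 'd']))
empty_x, empty_y = unzip_pairs([])
```

Returning two empty lists for empty input is a design choice; raising an explicit error may suit other callers better.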

score 4 · Accepted Answer

Consider using more_itertools.unzip:

>>> from more_itertools import unzip
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> [list(x) for x in unzip(original)]
[['a', 'b', 'c', 'd'], [1, 2, 3, 4]]

score 2 · Accepted Answer

While numpy arrays and pandas may be preferable, this function imitates the behavior of zip(*args) when called as unzip(args).

It also accepts generators, such as the result of zip in Python 3, because it only iterates over the values.

def unzip(items, cls=list, ocls=tuple):
    """Zip function in reverse.

    :param items: Zipped-like iterable.
    :type  items: iterable

    :param cls: Container factory. Callable that returns iterable containers,
        with a callable append attribute, to store the unzipped items. Defaults
        to ``list``.
    :type  cls: callable, optional

    :param ocls: Outer container factory. Callable that builds a container
        from an iterable of the inner containers (see ``cls``). Defaults to
        ``tuple``.
    :type  ocls: callable, optional

    :returns: Unzipped items in instances returned from ``cls``, in an instance
        returned from ``ocls``.
    """
    # iter() will return the same iterator passed to it whenever possible.
    items = iter(items)

    try:
        i = next(items)
    except StopIteration:
        return ocls()

    unzipped = ocls(cls([v]) for v in i)

    for i in items:
        for c, v in zip(unzipped, i):
            c.append(v)

    return unzipped

To use list containers, simply run unzip(zipped), as

unzip(zip(["a","b","c"],[1,2,3])) == (["a","b","c"],[1,2,3])

To use deques, or any other container sporting append, pass a factory function.

from collections import deque

unzip([("a",1),("b",2)], deque, list) == [deque(["a","b"]),deque([1,2])]

(Decorate cls and/or ocls to micromanage container initialization, as briefly shown in the final example above.)

score 1 · Accepted Answer

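Since it returns tuples (and can use a lot of memory), the zip(*zipped) trick seems more clever than useful to me.

Here is a function that actually gives you the inverse of zip.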

def unzip(zipped):
    """Inverse of built-in zip function.
    Args:
        zipped: a list of tuples

    Returns:
        a tuple of lists

    Example:
        a = [1, 2, 3]
        b = [4, 5, 6]
        zipped = list(zip(a, b))

        assert zipped == [(1, 4), (2, 5), (3, 6)]

        unzipped = unzip(zipped)

        assert unzipped == ([1, 2, 3], [4, 5, 6])

    """

    unzipped = ()
    if len(zipped) == 0:
        return unzipped

    dim = len(zipped[0])

    for i in range(dim):
        unzipped = unzipped + ([tup[i] for tup in zipped], )

    return unzipped

score 1 · Accepted Answer

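None of the previous answers efficiently provides the required output, which is a tuple of lists rather than a list of tuples. For the former, you can use tuple with map. Here is the difference: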

res1 = list(zip(*original))              # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
res2 = tuple(map(list, zip(*original)))  # (['a', 'b', 'c', 'd'], [1, 2, 3, 4])

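In addition, most of the previous solutions assume Python 2.7, where zip returns a list rather than an iterator.

For Python 3.x, you need to pass the result to a function such as list or tuple to exhaust the iterator. For memory-efficient iterators, you can omit the outer list and tuple calls in the respective solutions.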
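A short Python 3 sketch of what exhausting the iterator means here. The names columns, first_pass and second_pass are illustrative only.

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

columns = zip(*original)                 # a lazy zip iterator in Python 3
first_pass = tuple(map(list, columns))   # consumes the iterator
second_pass = tuple(map(list, columns))  # iterator is already exhausted
```

Materializing the result once with tuple or list is the simplest way to keep it reusable; leaving it lazy saves memory but allows only a single pass.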

score 1 · Accepted Answer

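While zip(*seq) is very useful, it may be unsuitable for very long sequences, because it creates a tuple of all the values to be passed in. For example, I have been working with a coordinate system with over a million entries and found it significantly faster to create the sequences directly.

A generic approach would look something like this: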

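from collections import deque

# seq has the shape ((a1, b1, …), (a2, b2, …), …); sample data for illustration:
seq = ((1, 'a'), (2, 'b'), (3, 'c'))
width = len(seq[0])
# One independent deque per column. A deque cannot be preallocated, and
# [deque()] * width would repeat a single shared deque.
output = [deque() for _ in range(width)]
for element in seq:
    for s, item in zip(output, element):
        s.append(item)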

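However, depending on what you want to do with the result, the choice of collection can make a big difference. In my actual use case, using sets and no inner loop was noticeably faster than all the other approaches.

And, as others have noted, if you are doing this with datasets, it may make sense to use NumPy or pandas collections instead.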
