Is there a Pythonic equivalent to Ruby's #each_cons?
In Ruby, you can do this:
array = [1,2,3,4]
array.each_cons(2).to_a
=> [[1,2],[2,3],[3,4]]
I don't think there is one. I looked through the built-in module itertools, which is where I'd expect it to be. You can simply create one:
def each_cons(xs, n):
    return [xs[i:i+n] for i in range(len(xs)-n+1)]
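A quick check of this helper against the Ruby example may be useful. The sketch below simply repeats the function so it runs on its own; the extra inputs (a window longer than the list, and a string) are illustrative choices, not part of the original answer.

```python
def each_cons(xs, n):
    # Slice-based windows; works for any sequence that supports len() and slicing
    return [xs[i:i + n] for i in range(len(xs) - n + 1)]

print(each_cons([1, 2, 3, 4], 2))  # [[1, 2], [2, 3], [3, 4]]
print(each_cons([1, 2, 3, 4], 5))  # [] because the window is longer than the list
print(each_cons("abcd", 3))        # ['abc', 'bcd']
```

Because it relies on len() and slicing, this version does not accept generators.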
For this kind of thing, itertools is the module you should be looking at:
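# Python 2; on Python 3, drop izip and use the built-in zip instead
from itertools import tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return izip(a, b)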
Then:
>>> list(pairwise([1, 2, 3, 4]))
[(1, 2), (2, 3), (3, 4)]
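On Python 3.10 and later, the standard library provides this recipe directly as itertools.pairwise, so the helper does not have to be defined by hand. A minimal sketch, assuming a Python 3 interpreter; the fallback branch is only there for versions older than 3.10:

```python
from itertools import tee

try:
    from itertools import pairwise  # available in Python 3.10+
except ImportError:
    def pairwise(iterable):
        # Fallback: the tee-based recipe, using the built-in zip
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

pairs = list(pairwise([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```

Note that pairwise only covers windows of size 2; larger windows still need one of the other approaches in this thread.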
For a more general solution, consider the following:
import itertools

def split_subsequences(iterable, length=2, overlap=0):
    it = iter(iterable)
    results = list(itertools.islice(it, length))
    while len(results) == length:
        yield results
        # keep the last `overlap` items, then top the window back up
        results = results[length - overlap:]
        results.extend(itertools.islice(it, length - overlap))
    if results:
        # trailing window shorter than `length`
        yield results
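This allows subsequences of arbitrary length with arbitrary overlap (the overlap must be smaller than the length, otherwise the window never advances). Usage: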
>>> list(split_subsequences([1, 2, 3, 4], length=2))
[[1, 2], [3, 4]]
>>> list(split_subsequences([1, 2, 3, 4], length=2, overlap=1))
[[1, 2], [2, 3], [3, 4], [4]]
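The last example yields a trailing short window ([4]), which Ruby's each_cons would not produce. One way to get each_cons-like output from the same generator is to discard any window shorter than the requested length. This is only a sketch: the wrapper name each_cons_from_split is hypothetical, and the generator is repeated so the block runs on its own.

```python
import itertools

def split_subsequences(iterable, length=2, overlap=0):
    it = iter(iterable)
    results = list(itertools.islice(it, length))
    while len(results) == length:
        yield results
        results = results[length - overlap:]
        results.extend(itertools.islice(it, length - overlap))
    if results:
        yield results

def each_cons_from_split(iterable, n):
    # Hypothetical helper: windows of size n advancing by one element
    # (overlap = n - 1), with the short trailing window filtered out.
    return [w for w in split_subsequences(iterable, length=n, overlap=n - 1)
            if len(w) == n]

print(each_cons_from_split([1, 2, 3, 4], 2))  # [[1, 2], [2, 3], [3, 4]]
print(each_cons_from_split([1, 2, 3, 4], 5))  # []
```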
My solution for lists (Python 2):
import itertools

def each_cons(xs, n):
    return itertools.izip(*(xs[i:] for i in xrange(n)))
Edit: itertools.izip no longer exists in Python 3, so there you use plain zip:
def each_cons(xs, n):
    return zip(*(xs[i:] for i in range(n)))
A quick one-liner:
a = [1, 2, 3, 4]
out = [a[i:i + 2] for i in range(len(a) - 1)]
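Python can certainly do this. If you don't want it to be so eager, use islice and izip from itertools (plain zip in Python 3). It is also worth remembering that a normal slice creates a copy, so if memory usage matters you should consider the itertools equivalents as well.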
each_cons = lambda l: zip(l[:-1], l[1:])
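The copying concern above can be illustrated with a lazy variant. This is a minimal sketch rather than the answerer's code: it keeps only the current window in a collections.deque, so it accepts any iterable, including generators, and never slices the input. The name each_cons_lazy is an assumption chosen for illustration, and the sketch assumes n >= 1.

```python
from collections import deque
from itertools import islice

def each_cons_lazy(iterable, n):
    # Assumes n >= 1. Only n items are held in memory at any time.
    it = iter(iterable)
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        # Appending to a full deque drops the oldest item automatically
        window.append(item)
        yield tuple(window)

print(list(each_cons_lazy((x for x in range(5)), 2)))
# [(0, 1), (1, 2), (2, 3), (3, 4)]
```

If the input has fewer than n items, nothing is yielded, which matches the behavior of the Ruby method shown in the question.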
Update: never mind my answer below, just use toolz.itertoolz.sliding_window(); it will do the right thing.
For a truly lazy implementation that preserves the behavior of Ruby's each_cons when the sequence/generator is too short:
# Python 2
import itertools

def each_cons(sequence, n):
    return itertools.izip(*(itertools.islice(g, i, None)
                            for i, g in
                            enumerate(itertools.tee(sequence, n))))
Examples:
>>> print(list(each_cons(xrange(5), 2)))
[(0, 1), (1, 2), (2, 3), (3, 4)]
>>> print(list(each_cons(xrange(5), 5)))
[(0, 1, 2, 3, 4)]
>>> print(list(each_cons(xrange(5), 6)))
[]
>>> print(list(each_cons((a for a in xrange(5)), 2)))
[(0, 1), (1, 2), (2, 3), (3, 4)]
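Note that the tuple unpacking used on the arguments to izip is applied to the tuple of size n returned by itertools.tee(sequence, n) (n being the "window size"), not to the sequence we want to iterate over.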
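To make the staggering visible, the intermediate iterators can be materialized before zipping. This is a small sketch for illustration only, written for Python 3 (so it uses the built-in zip rather than izip); the variable names staggered and windows are assumptions, and turning the iterators into lists defeats the laziness on purpose.

```python
from itertools import islice, tee

sequence = range(5)
n = 3  # window size

# tee returns n independent iterators over the same data;
# islice(g, i, None) skips the first i items of the i-th iterator.
staggered = [list(islice(g, i, None))
             for i, g in enumerate(tee(sequence, n))]
print(staggered)  # [[0, 1, 2, 3, 4], [1, 2, 3, 4], [2, 3, 4]]

# zip stops at the shortest input, which is what trims incomplete windows.
windows = list(zip(*staggered))
print(windows)  # [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```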
Close to @Blender's solution, but with a fix:
a = [1, 2, 3, 4]
n = 2
out = [a[i:i + n] for i in range(len(a) - n + 1)]
# => [[1, 2], [2, 3], [3, 4]]
Or:
a = [1, 2, 3, 4]
n = 3
out = [a[i:i + n] for i in range(len(a) - n + 1)]
# => [[1, 2, 3], [2, 3, 4]]
from itertools import islice, tee

def each_cons(sequence, n):
    return zip(
        *(
            islice(g, i, None)
            for i, g in
            enumerate(tee(sequence, n))
        )
    )
$ ipython
...
In [2]: a_list = [1, 2, 3, 4, 5]
In [3]: list(each_cons(a_list, 2))
Out[3]: [(1, 2), (2, 3), (3, 4), (4, 5)]
In [4]: list(each_cons(a_list, 3))
Out[4]: [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
In [5]: list(each_cons(a_list, 5))
Out[5]: [(1, 2, 3, 4, 5)]
In [6]: list(each_cons(a_list, 6))
Out[6]: []