subset(("A","b","C","D"))
should produce:
("A","b","C"),
("b","C","D"),
("A","b"),
("b","C"),
("C","D"),
("A",),
("b",),
("C",),
("D",)
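Sliding windows can be tricky. Iterating over windows that shrink or grow is doubly so.
First list the steps needed to solve the problem, then write a function that follows them:
- Start with the largest window size (one less than the total length, going by the example code).
- Then calculate the number of windows of that size needed to cover the data.
- Then, for each window, reuse the window's ordinal number as its start index, and add the window size to the start index to determine where the window stops.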
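The index arithmetic in these steps can be checked by hand for a four-item input. Below is a minimal sketch; `window_bounds` is a hypothetical helper introduced only for illustration and is not part of the original answer.

```python
def window_bounds(total_length):
    """Map each window length to the (start, stop) index pairs it covers."""
    bounds = {}
    # Largest window first: one less than the total length, as in the example.
    for window_length in range(total_length - 1, 0, -1):
        # Number of windows of this size that fit in the data.
        n_windows = total_length - window_length + 1
        # The window's ordinal doubles as its start index.
        bounds[window_length] = [
            (start, start + window_length) for start in range(n_windows)
        ]
    return bounds


for length, pairs in window_bounds(4).items():
    print(length, pairs)
```

For four items this gives two windows of length 3, three of length 2, and four of length 1, which matches the nine tuples listed at the top.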
The resulting function:
def subset(data):
    total_length = len(data)
    for window_length in range(total_length - 1, 0, -1):  # biggest first
        n_windows = total_length - window_length + 1
        for each_window in range(n_windows):
            start = each_window
            stop = start + window_length
            yield data[start:stop]
Sample data:
data = ("A","b","C","D")
Now, calling subset on data returns a generator, and if we pass it to list, it materializes the results:
>>> subset(data)
<generator object subset at 0x7fbc3d7f3570>
>>> list(subset(data))
[('A', 'b', 'C'), ('b', 'C', 'D'), ('A', 'b'), ('b', 'C'), ('C', 'D'), ('A',), ('b',), ('C',), ('D',)]
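The text mentions windows that grow as well as shrink. Below is a sketch of the growing case, assuming the only change needed is the order in which window lengths are visited; `growing_subset` is a hypothetical name, not something from the original answer.

```python
def growing_subset(data):
    """Yield consecutive slices of `data`, smallest windows first.

    Like the shrinking version, this stops one short of the full length.
    """
    total_length = len(data)
    for window_length in range(1, total_length):  # smallest first
        n_windows = total_length - window_length + 1
        for start in range(n_windows):
            yield data[start:start + window_length]


print(list(growing_subset(("A", "b", "C", "D"))))
```

The slices are the same nine as before, only emitted in the opposite order of window size.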
Deque solution:
I was taken with the idea of using a deque (from the collections module) as a rolling window, and decided to demonstrate it:
import collections
import pprint

def shrinking_windows(iterable):
    '''
    Given an ordered iterable (meaningless for unordered ones)
    return a list of tuples representing each possible set
    of consecutive items from the original list. e.g.
    shrinking_windows(['A', 'b', 'c']) returns
    [('A', 'b', 'c'), ('A', 'b'), ('b', 'c') ...] but not ('A', 'c')
    '''
    window_generator = range(len(iterable), 0, -1)
    results = []
    for window in window_generator:
        d = collections.deque((), maxlen=window)
        for i in iterable:
            d.append(i)
            if len(d) == window:
                results.append(tuple(d))
    return results

pprint.pprint(shrinking_windows('AbCd'))
which nicely returns:
[('A', 'b', 'C', 'd'),
 ('A', 'b', 'C'),
 ('b', 'C', 'd'),
 ('A', 'b'),
 ('b', 'C'),
 ('C', 'd'),
 ('A',),
 ('b',),
 ('C',),
 ('d',)]
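Note that this output begins with the full-length tuple, which the output requested at the top omits. Below is a sketch of one way to skip it, by starting the window sizes one lower and yielding lazily instead of building a list. The name `proper_shrinking_windows` is hypothetical, and the sketch assumes the input is a sequence that supports `len`.

```python
import collections


def proper_shrinking_windows(sequence):
    """Yield consecutive runs of `sequence`, longest first, skipping the full-length run."""
    for window in range(len(sequence) - 1, 0, -1):
        # A bounded deque silently drops its oldest item once it is full,
        # so it always holds the most recent `window` items.
        d = collections.deque(maxlen=window)
        for item in sequence:
            d.append(item)
            if len(d) == window:
                yield tuple(d)


print(list(proper_shrinking_windows('AbCd')))
```

Starting the range at `len(sequence) - 1` is the only change to the window sizes; the deque logic is otherwise the same as in the function above.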