No imports needed; just pass in a list of objects or a string, or anything else that supports var[indexing] with slices. Tested on Python 3.6.
# Creates windows where neighbours overlap in all but one element
def ngrams_list(a_list, window_size=5, skip_step=1):
    return list(zip(*[a_list[i:] for i in range(0, window_size, skip_step)]))
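As a quick usage check, here is the function applied with the window of 2 that the OP asked about. The sample inputs are made up for illustration, and the function is repeated only so the snippet runs on its own.

```python
def ngrams_list(a_list, window_size=5, skip_step=1):
    # Same function as above, repeated so this snippet is self-contained.
    return list(zip(*[a_list[i:] for i in range(0, window_size, skip_step)]))

# Illustrative input (not from the question): a short string, window of 2.
pairs = ngrams_list("ABCD", window_size=2)
print(pairs)  # [('A', 'B'), ('B', 'C'), ('C', 'D')]

# Works the same way on a list of arbitrary objects.
print(ngrams_list([1, 2, 3, 4], window_size=3))  # [(1, 2, 3), (2, 3, 4)]
```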
The for loop (the list comprehension) on its own creates the following when a_list is the alphabet (shown with window_size=5; the OP wants a window of 2):
['ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'BCDEFGHIJKLMNOPQRSTUVWXYZ',
'CDEFGHIJKLMNOPQRSTUVWXYZ',
'DEFGHIJKLMNOPQRSTUVWXYZ',
'EFGHIJKLMNOPQRSTUVWXYZ']
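zip(*result_of_for_loop) then collects every complete vertical column as one result. Because zip stops at the shortest slice, incomplete windows at the end are dropped. If you want less overlap than all-but-one: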
# You can sample that output to get less overlap:
def sliding_windows_with_overlap(a_list, window_size=5, overlap=2):
    zip_output_as_list = ngrams_list(a_list, window_size)
    # Keep every (overlap + 1)-th window and drop the ones in between.
    return zip_output_as_list[::overlap + 1]
With overlap=2, this skips the columns that start with B and C and picks the one that starts with D, and so on:
[('A', 'B', 'C', 'D', 'E'),
('D', 'E', 'F', 'G', 'H'),
('G', 'H', 'I', 'J', 'K'),
('J', 'K', 'L', 'M', 'N'),
('M', 'N', 'O', 'P', 'Q'),
('P', 'Q', 'R', 'S', 'T'),
('S', 'T', 'U', 'V', 'W'),
('V', 'W', 'X', 'Y', 'Z')]
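One caveat worth spelling out: the [::overlap+1] step leaves exactly overlap shared items only when window_size == 2*overlap + 1, which happens to hold for the 5 and 2 used here. Below is a minimal sketch of a variant that derives the step from window_size - overlap instead. The name windows_with_exact_overlap is hypothetical and not part of the original answer, and it assumes the input supports len() as well as slicing.

```python
def windows_with_exact_overlap(a_list, window_size=5, overlap=2):
    # Hypothetical helper: consecutive windows share exactly `overlap` items.
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    step = window_size - overlap  # how far each window start advances
    last_start = len(a_list) - window_size  # last index where a full window fits
    return [tuple(a_list[i:i + window_size])
            for i in range(0, last_start + 1, step)]

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# The defaults give the same windows as the output above (A..E, D..H, ...).
print(windows_with_exact_overlap(alphabet)[:2])
# window_size=4, overlap=1: each window starts 3 items after the previous one.
print(windows_with_exact_overlap(alphabet, window_size=4, overlap=1)[:3])
```

Slicing each window directly also avoids building every all-but-one-overlap window first and then discarding most of them.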
Edit: it looks like this is similar to what @chmullig provided, with options.