I have a string that looks like this:
string = 'TTHHTHHTHHHHTTHHHTTT'
How can I count the number of runs in the string, so that I get
5 runs of T and 4 runs of H?
You can use a combination of itertools.groupby and collections.Counter:
>>> from itertools import groupby
>>> from collections import Counter
>>> strs = 'TTHHTHHTHHHHTTHHHTTT'
>>> Counter(k for k, g in groupby(strs))
Counter({'T': 5, 'H': 4})
itertools.groupby groups consecutive items that share the same key (by default, the key is the item itself):
>>> from pprint import pprint
>>> pprint([(k, list(g)) for k, g in groupby(strs)])
[('T', ['T', 'T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H', 'H', 'H']),
 ('T', ['T', 'T']),
 ('H', ['H', 'H', 'H']),
 ('T', ['T', 'T', 'T'])]
Here the first item of each tuple is the key (k) on which the items were grouped, and list(g) is the group belonging to that key. As we're only interested in the keys, we can pass each k to collections.Counter to get the desired answer.
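As a minimal sketch, the same idea can be wrapped in reusable functions. The names count_runs and run_lengths are hypothetical, not part of the answer above, and the second function assumes the length of each run is also of interest:

```python
from collections import Counter, defaultdict
from itertools import groupby


def count_runs(s):
    """Return a Counter mapping each character to its number of runs in s."""
    # groupby yields one (key, group) pair per run of equal adjacent items,
    # so counting the keys counts the runs.
    return Counter(k for k, _ in groupby(s))


def run_lengths(s):
    """Return a dict mapping each character to the lengths of its runs, in order."""
    lengths = defaultdict(list)
    for k, g in groupby(s):
        # g is a one-shot iterator, so count its items as we consume it.
        lengths[k].append(sum(1 for _ in g))
    return dict(lengths)


strs = 'TTHHTHHTHHHHTTHHHTTT'
print(count_runs(strs))
print(run_lengths(strs))
```

Each group is consumed inside the loop because groupby invalidates a group as soon as the iteration moves on to the next key.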
For variety, a re-based approach:
>>> import re
>>> letters = ['H', 'T']
>>> # Each match is one run; the group captures its letter. Other characters (the 'Z' here) are skipped.
>>> matches = re.findall(r'({})\1*'.format('|'.join(letters)), 'TTHHTHHZTHHHHTTHHHTTT')
>>> matches
['T', 'H', 'T', 'H', 'T', 'H', 'T', 'H', 'T']
>>> [(letter, matches.count(letter)) for letter in letters]
[('H', 4), ('T', 5)]
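A hedged variation on the regex approach, as a sketch only: the helper name count_runs_re is hypothetical, and it assumes letters is a non-empty list of single characters. It escapes each letter in case one is special in a pattern, and tallies with collections.Counter instead of calling list.count once per letter:

```python
import re
from collections import Counter


def count_runs_re(s, letters):
    """Count runs in s of each character in letters, ignoring other characters."""
    # re.escape guards against characters that are special in a pattern.
    alternatives = '|'.join(re.escape(ch) for ch in letters)
    # The group captures one letter; \1* extends the match over the rest of its run.
    pattern = re.compile(r'({})\1*'.format(alternatives))
    counts = Counter(m.group(1) for m in pattern.finditer(s))
    # Counter returns 0 for letters that never appear.
    return {ch: counts[ch] for ch in letters}


print(count_runs_re('TTHHTHHZTHHHHTTHHHTTT', ['H', 'T']))
```

Note that a character outside letters splits a run: under this sketch 'HHZHH' counts as two runs of H.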