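This algorithm has eluded me for a while. Suppose I am given the string "cccaatt". I am trying to generate every possible variation of each run of repeated letters. For example, "cccaatt" as input would return:
cat, catt, caat, caatt, ccat, ccatt, ccaat, ccaatt, cccat, cccatt, cccaat, cccaatt
The order of the results does not matter, as long as all of them are returned. In general, the input is a string made up of g groups of repeated letters, where group n is k_n letters long.
My intuition is that this is a recursive algorithm, but its exact structure has been hard to pin down.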
If you store the alphabet and the maximum number of occurrences of each letter (as mentioned in the comments), you can do this:
function variations(letter_type, current string) {
    if (letter_type is in the alphabet) {
        while (string has fewer than the max amount of that letter) {
            add one of that letter to current string
            variations(next letter, current string)
        }
    } else {
        print current string // since there are no more letters to add
    }
}
In Java:
public class Variations {
    static String[] alphabet = {"c","a","t"};
    static int[] maximums = {3, 2, 2};

    public static void main(String[] args) {
        variations(0, "");
    }

    public static void variations(int letter_type, String curr) {
        if (letter_type < alphabet.length) {
            for (int i = 1; i <= maximums[letter_type]; i++) {
                curr += alphabet[letter_type];
                variations(letter_type+1, curr);
            }
        } else {
            System.out.println(curr);
        }
    }
}
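The Java version hard-codes the alphabet and the maximums. As a rough sketch of how they could be derived from the input instead, here is the same recursion in Python. The names run_lengths and variations are made up for this illustration, and it collects the results in a list rather than printing them one by one:

```python
def run_lengths(s):
    """Collapse a string into its letters and their run lengths.

    Hypothetical helper: "cccaatt" becomes (['c', 'a', 't'], [3, 2, 2]).
    """
    letters, maximums = [], []
    for ch in s:
        if letters and letters[-1] == ch:
            maximums[-1] += 1      # same letter as before: extend the run
        else:
            letters.append(ch)     # new letter: start a run of length 1
            maximums.append(1)
    return letters, maximums


def variations(letters, maximums, index=0, curr=""):
    """Same recursion as the Java version, but returning a list."""
    if index == len(letters):
        return [curr]              # no more letters to add
    results = []
    for _ in range(maximums[index]):
        curr += letters[index]     # one more copy of the current letter
        results.extend(variations(letters, maximums, index + 1, curr))
    return results


letters, maximums = run_lengths("cccaatt")
print(variations(letters, maximums))
```

For "cccaatt" this should give the same twelve strings, in the same order as the list in the question.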
Break the string down into a list of letters and repeat counts, i.e. "cccaatt" => [(c,3), (a,2), (t,2)]. The problem can then be defined recursively.
Let xs = [(a_1, n_1), (a_2, n_2), (a_3, n_3), ... (a_k, n_k)]
define Perm(xs):
    if len(xs) == 1:
        return all length variations of xs[0]
    else:
        return every sequence in Perm(xs[:-1]) appended with one or more of xs[-1]
Here is a Python example:
> perm("cccaatt")
> ['cat', 'ccat', 'cccat', 'caat', 'ccaat', 'cccaat', 'catt', 'ccatt', 'cccatt', 'caatt', 'ccaatt', 'cccaatt']
And the code:
def perm(xs):
    if not xs:
        return []
    # group them into the correct format, probably should have used groupby + zip
    l = [(xs[0],1)]
    for x in xs[1:]:
        last, num = l[-1]
        if last == x:
            l[-1] = (last, num+1)
        else:
            l.append((x, 1))
    # print(l)
    print(recurse(l))

# this is where the real work is done.
def recurse(xs):
    if len(xs) == 1:
        return [ xs[0][0] * x for x in range(1, xs[0][1] + 1) ]
    prev = recurse(xs[:-1])
    char, num = xs[-1]
    return [ y + x * char for x in range(1,num + 1) for y in prev ]
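The comment in perm says the grouping probably should have used groupby. A minimal sketch of that, assuming Python 3 and using made-up names (group_runs, expand, perm_groupby), with the recursive step kept the same in spirit but returning the list instead of printing it:

```python
from itertools import groupby


def group_runs(s):
    """Return [(letter, run_length), ...] for consecutive runs in s."""
    return [(letter, len(list(run))) for letter, run in groupby(s)]


def expand(groups):
    """Extend every result for the earlier groups with 1..num copies
    of the last group's letter."""
    if not groups:
        return []
    if len(groups) == 1:
        char, num = groups[0]
        return [char * n for n in range(1, num + 1)]
    prev = expand(groups[:-1])
    char, num = groups[-1]
    return [head + char * n for n in range(1, num + 1) for head in prev]


def perm_groupby(s):
    """Hypothetical wrapper: group the string, then expand the groups."""
    return expand(group_runs(s))


print(group_runs("cccaatt"))
print(perm_groupby("cccaatt"))
```

groupby only merges adjacent equal letters, which is what the hand-written loop does too, so an input like "aba" is treated as three separate groups.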
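The Python itertools module has powerful tools for grouping and then iterating over the members of each group, which leads to the following program.
I have shown some intermediate results and used the pprint module to pretty-print the answer: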
Python 2.7.3 (default, Aug 1 2012, 05:16:07)
[GCC 4.6.3] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> import itertools
>>> instring = "cccaatt"
>>> [(x[0], list(x[1])) for x in itertools.groupby(instring)]
[('c', ['c', 'c', 'c']), ('a', ['a', 'a']), ('t', ['t', 't'])]
>>> xx = [list(x[0]*n for n in range(1, len(list(x[1]))+1)) for x in itertools.groupby(instring)]
>>> xx
[['c', 'cc', 'ccc'], ['a', 'aa'], ['t', 'tt']]
>>> list(itertools.product(*xx))
[('c', 'a', 't'), ('c', 'a', 'tt'), ('c', 'aa', 't'), ('c', 'aa', 'tt'), ('cc', 'a', 't'), ('cc', 'a', 'tt'), ('cc', 'aa', 't'), ('cc', 'aa', 'tt'), ('ccc', 'a', 't'), ('ccc', 'a', 'tt'), ('ccc', 'aa', 't'), ('ccc', 'aa', 'tt')]
>>> from pprint import pprint as pp
>>> pp(list(itertools.product(*xx)))
[('c', 'a', 't'),
 ('c', 'a', 'tt'),
 ('c', 'aa', 't'),
 ('c', 'aa', 'tt'),
 ('cc', 'a', 't'),
 ('cc', 'a', 'tt'),
 ('cc', 'aa', 't'),
 ('cc', 'aa', 'tt'),
 ('ccc', 'a', 't'),
 ('ccc', 'a', 'tt'),
 ('ccc', 'aa', 't'),
 ('ccc', 'aa', 'tt')]
>>>
Or as a function:
>>> def stringexpand(instring):
	xx = [list(x[0]*n for n in range(1, len(list(x[1]))+1)) for x in itertools.groupby(instring)]
	return list(itertools.product(*xx))

>>> pp(stringexpand("cccaatt"))
[('c', 'a', 't'),
 ('c', 'a', 'tt'),
 ('c', 'aa', 't'),
 ('c', 'aa', 'tt'),
 ('cc', 'a', 't'),
 ('cc', 'a', 'tt'),
 ('cc', 'aa', 't'),
 ('cc', 'aa', 'tt'),
 ('ccc', 'a', 't'),
 ('ccc', 'a', 'tt'),
 ('ccc', 'aa', 't'),
 ('ccc', 'aa', 'tt')]
>>>
You seem to need the strings joined together from their parts. That can be done with this slight modification:
import itertools

def stringexpand(instring):
    xx = [list(x[0]*n for n in range(1, len(list(x[1]))+1)) for x in itertools.groupby(instring)]
    return [''.join(parts) for parts in itertools.product(*xx)]
Which returns:
['cat',
 'catt',
 'caat',
 'caatt',
 'ccat',
 'ccatt',
 'ccaat',
 'ccaatt',
 'cccat',
 'cccatt',
 'cccaat',
 'cccaatt']
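If the groups are long, the full list can get big: the number of results is the product of the group lengths (3 * 2 * 2 = 12 here). A rough sketch of a lazy variant that yields one string at a time, plus a quick count check. The names iter_variations and count_variations are invented for this example, and math.prod assumes Python 3.8 or later:

```python
from itertools import groupby, product
from math import prod  # available from Python 3.8


def iter_variations(s):
    """Yield each variation lazily instead of building the whole list."""
    runs = [(letter, len(list(run))) for letter, run in groupby(s)]
    # For each run, the allowed pieces are 1..count copies of its letter.
    choices = [[letter * n for n in range(1, count + 1)]
               for letter, count in runs]
    for parts in product(*choices):
        yield "".join(parts)


def count_variations(s):
    """Number of variations: the product of the run lengths."""
    return prod(len(list(run)) for _, run in groupby(s))


for word in iter_variations("cccaatt"):
    print(word)
print(count_variations("cccaatt"))
```

The count is handy as a sanity check before materialising everything with list(iter_variations(...)).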