
Given two sorted vectors a and b, find all vectors that are the sum of a and some permutation of b, and that are unique once sorted.

You can create one of the sought vectors in the following way:

  • Take vector a and a permutation of vector b
  • Sum them together: c[i]=a[i]+b[i]
  • Sort c
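
The three steps above can be sketched in Python as follows. This is only an illustration: it assumes every element is a single character and that "adding" two elements means pairing them up, and the helper name `summed` is hypothetical, not taken from the question.

```python
def summed(a, perm_b):
    """Pair a[i] with perm_b[i], then sort the pairs to get a canonical c."""
    if len(a) != len(perm_b):
        raise ValueError("a and the permutation of b must have equal length")
    c = [x + y for x, y in zip(a, perm_b)]  # c[i] = a[i] + b[i]
    return sorted(c)                        # sorting removes the order of equal elements of a


print(summed("ccdd", "xyxy"))  # ['cx', 'cy', 'dx', 'dy']
print(summed("ccdd", "yxyx"))  # ['cx', 'cy', 'dx', 'dy']
```

Two permutations of b count as the same solution exactly when they produce the same sorted c, as the two calls show.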

I am interested in finding the set of b-permutations that yields the entire set of unique c vectors.

Example 0: a='ccdd' and b='xxyy'
gives the summed vectors: 'cycydxdx', 'cxcxdydy', 'cxcydxdy'.
Notice that the permutations of b 'xyxy' and 'yxyx' are equal, because in both cases "box c" and "box d" each get exactly one 'x' and one 'y'.

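I guess this is similar to putting M balls into M boxes (one in each), where some groups of balls and boxes are identical.
Update: Given the strings a='aabbbcdddd' and b='xxyyzzttqq', your problem will be 10 balls in 4 boxes. There are 4 distinct boxes of size 2, 3, 1 and 4. The balls are indistinguishable in pairs.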

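Example 1: Given strings are a='xyy' and b='kkd'
Possible solutions: 'kkd', 'dkk'.
Reason: The unique permutations of b are 'kkd', 'kdk' and 'dkk'. However, under our constraints the first two permutations are considered equal, because the indices at which they differ map to the same char 'y' in string a.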

Example 2: Given strings are a='xyy' and b='khd'
Possible solutions: 'khd', 'dkh', 'hkd'.

Example 3: Given strings are a='xxxx' and b='khhd'
Possible solution: 'khhd'.

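I can solve the problem of generating unique candidate b-permutations using Narayana Pandita's algorithm as described on Wikipedia/Permutation. The second part seems harder. My best shot is to join the two strings pairwise into a list, sort it, and use it as a key in a lookup set. (For 'xx' and 'hd': join → 'xh','xd'; sort → 'xd','xh'.)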
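
A minimal Python sketch of this generate-and-filter approach, for reference. It substitutes `itertools.permutations` for Narayana Pandita's algorithm, so it also visits duplicate permutations, and the function name `unique_b_permutations` is made up for this example. It walks all M! orderings, so it is only practical for small inputs.

```python
from itertools import permutations


def unique_b_permutations(a, b):
    """Brute force: keep one permutation of b per distinct sorted pairing with a."""
    seen = set()
    result = []
    for perm in permutations(b):
        key = tuple(sorted(zip(a, perm)))  # join pairwise, sort, use as lookup key
        if key not in seen:
            seen.add(key)
            result.append("".join(perm))
    return result


print(unique_b_permutations("xyy", "kkd"))    # ['kkd', 'dkk']
print(unique_b_permutations("xyy", "khd"))    # ['khd', 'hkd', 'dkh']
print(unique_b_permutations("xxxx", "khhd"))  # ['khhd']
```

The printed results agree with Examples 1 to 3, up to the order in which the solutions are listed.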

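As my M is often very big, and as similarities in the strings are common, I currently generate far more b-permutations than actually make it through the set filter. I would love to have an algorithm that generates the correct ones directly. Any improvement is welcome.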


2 Answers


To generate k-combinations of possibly repeated elements (multiset), the following could be useful: A Gray Code for Combinations of a Multiset (1995).

For a recursive solution you can try the following:

Count the number of times each character appears. Say they are x1 x2 ... xm, corresponding to m distinct characters.

Then you need to find all possible tuples (y1, y2, ..., ym) such that

0 <= yi <= xi

and Sum yi = k.

Here yi is the number of times character i appears in the combination.

The idea is, fix the number of times char 1 appears (y1). Then recursively generate all combinations of k-y1 from the remaining.

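pseudocode:

List Generate (int [] x /* counts x1..xm, array index starting at 1 */,
               int k /* size of combination */) {

    list = List.Empty;

    // No characters left: only the empty tuple works, and only if k == 0.
    if (x.IsEmpty()) {
        if (k == 0) list.Add(Tuple.Empty);
        return list;
    }

    // Not enough characters left to reach size k.
    if (Sum(x) < k) return list;

    // Remove first element; the rest is the same for every choice of i.
    remaining = x.Remove(1);

    // Fix y1 = i, never taking more than k characters in total.
    for (int i = 0; i <= Min(x[1], k); i++) {

        // Generate count tuples of size k-i from the remaining characters.
        list_i = Generate(remaining, k-i);

        // Prepend i to every tuple. An empty list_i must not end the loop,
        // because a larger i can still succeed.
        for each (tuple in list_i) {
            list.Add(i + tuple);
        }
    }

    return list;
}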
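
For comparison, here is a runnable Python sketch of the same recursion. It is one possible reading of the idea described above, not the answer's own code: counts are a 0-indexed list, and the helper `to_strings` (a name invented here) turns count tuples back into combinations.

```python
def generate(x, k):
    """Return all tuples (y1..ym) with 0 <= yi <= x[i] and sum(y) == k."""
    if not x:
        # No characters left: only the empty tuple, and only when k == 0.
        return [()] if k == 0 else []
    if sum(x) < k:
        return []
    result = []
    for i in range(min(x[0], k) + 1):        # fix y1 = i
        for rest in generate(x[1:], k - i):  # recurse on the other characters
            result.append((i,) + rest)
    return result


def to_strings(symbols, tuples):
    """Expand each count tuple into the combination it describes."""
    return ["".join(s * y for s, y in zip(symbols, t)) for t in tuples]


print(generate([2, 2], 2))                    # [(0, 2), (1, 1), (2, 0)]
print(to_strings("xy", generate([2, 2], 2)))  # ['yy', 'xy', 'xx']
```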

PRIOR TO EDITS:

If I understood it correctly, you need to look at string a and find the symbols that appear exactly once. Say there are k such symbols. Then you need to generate all possible k-element permutations of b, which map to those symbols at the corresponding positions. The rest you can ignore/fill in as you see fit.

I remember posting C# code for that here: How to find permutation of k in a given length?

I am assuming xxyy will give only 1 unique string and the ones that appear exactly once are the 'distinguishing' points.

Eg in case of a=xyy, b=add

distinguishing point is x

So you select permutations of 'add' of length 1. Those give you a and d.

Thus add and dad (or dda) are the ones you need.

For a=xyyz b=good

distinguishing points are x and z

So you generate permutations of b of length 2 giving

go
og
oo
od
do
gd
dg

giving you 7 unique permutations.
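
The count can be checked with a short Python sketch. It simply de-duplicates the output of `itertools.permutations`, which is an easy way to verify small cases rather than an efficient generator; the name `distinct_k_permutations` is illustrative only.

```python
from itertools import permutations


def distinct_k_permutations(s, k):
    """All distinct length-k arrangements drawn from the multiset of characters in s."""
    return sorted({"".join(p) for p in permutations(s, k)})


print(distinct_k_permutations("good", 2))
# ['dg', 'do', 'gd', 'go', 'od', 'og', 'oo']
print(distinct_k_permutations("add", 1))  # ['a', 'd']
```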

Does that help? Is my understanding correct?

answered 2010-06-18T23:39:33.057

Ok, I'm sorry I never was able to clearly explain the problem, but here is a solution.

We need two functions, combinations(s,k) and runvector(v). combinations(s,k) generates the unique combinations of length k of a multiset s. For s='xxyy' and k=2 these would be ['xx','xy','yy']. runvector(v) transforms a multiset represented as a sorted vector into a simpler structure, the run vector, which lists the length of each run of equal elements: runvector('cddeee')=[1,2,3].
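
The answer does not show these two helpers, so the following is only a guess at their simplest form. Note that this `combinations` takes no banned array (unlike the three-argument version used later) and de-duplicates the output of `itertools.combinations` instead of generating unique results directly.

```python
from itertools import combinations as index_combinations, groupby


def runvector(v):
    """Lengths of the runs of equal elements in a sorted sequence."""
    return [len(list(group)) for _, group in groupby(v)]


def combinations(s, k):
    """Unique k-combinations of the multiset s, each as a sorted string."""
    return sorted({"".join(c) for c in index_combinations(sorted(s), k)})


print(runvector("cddeee"))      # [1, 2, 3]
print(combinations("xxyy", 2))  # ['xx', 'xy', 'yy']
```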

To solve the problem, we will use recursive generators. We run through all the combinations that fit in box 1 and then recurse on the rest of the boxes, banning the values we have already chosen. To accomplish the banning, combinations will maintain a bit array across calls.

In Python the approach looks like this:

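# combinations(b, k, banned) and runvector(a) are the helpers described above;
# their definitions are not shown here.

def fillrest(banned,out,rv,b,i):
    # All boxes are filled: signal that out holds a complete solution.
    if i == len(rv):
        yield None
        return
    # Try every unique combination of unbanned balls that fits in box i.
    for comb in combinations(b,rv[i],banned):
        out[i] = comb
        for _ in fillrest(banned,out,rv,b,i+1):
            yield None

def balls(a,b):
    rv = runvector(a)                # box sizes
    banned = [False for _ in b]      # balls already placed in a box
    out = [None for _ in rv]         # current content of each box
    for _ in fillrest(banned,out,rv,b,0):
        yield out[:]                 # copy, since out is reused

>>> print(list(balls('abbccc','xyyzzz')))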
[['x', 'yy', 'zzz'],
 ['x', 'yz', 'yzz'],
 ['x', 'zz', 'yyz'],
 ['y', 'xy', 'zzz'],
 ['y', 'xz', 'yzz'],
 ['y', 'yz', 'xzz'],
 ['y', 'zz', 'xyz'],
 ['z', 'xy', 'yzz'],
 ['z', 'xz', 'yyz'],
 ['z', 'yy', 'xzz'],
 ['z', 'yz', 'xyz'],
 ['z', 'zz', 'xyy']]

The output is in 'box' format, but can easily be merged back into simple strings: 'xyyzzz', 'xyzyzz'...
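
Since the helpers are not shown, here is a self-contained Python 3 sketch of the same box-filling idea. It makes one deliberate substitution: instead of a banned bit array it keeps a `Counter` of the balls still available, which plays the same role. It assumes a is sorted and that a and b have equal length; the inner helper `fill` and the parameters `counts`, `symbols` and `start` are names chosen for this example.

```python
from collections import Counter
from itertools import groupby


def runvector(v):
    """Box sizes: lengths of the runs of equal elements in sorted v."""
    return [len(list(group)) for _, group in groupby(v)]


def combinations(counts, symbols, k, start=0):
    """Yield each distinct k-combination of the balls still available.

    While a combination is being consumed, its balls stay subtracted
    from counts, so deeper boxes cannot reuse them.
    """
    if k == 0:
        yield ""
        return
    for idx in range(start, len(symbols)):
        s = symbols[idx]
        if counts[s] > 0:
            counts[s] -= 1                      # take one ball of kind s
            for rest in combinations(counts, symbols, k - 1, idx):
                yield s + rest
            counts[s] += 1                      # put it back


def balls(a, b):
    rv = runvector(a)
    counts = Counter(b)
    symbols = sorted(counts)
    out = [None] * len(rv)

    def fill(i):
        if i == len(rv):
            yield out[:]                        # copy, since out is reused
            return
        for comb in combinations(counts, symbols, rv[i]):
            out[i] = comb
            yield from fill(i + 1)

    yield from fill(0)


solutions = list(balls("abbccc", "xyyzzz"))
print(len(solutions))                               # 12
print(["".join(boxes) for boxes in solutions[:2]])  # ['xyyzzz', 'xyzyzz']
```

Joining the boxes in order gives a permutation of b aligned with the sorted string a, which is how the box format maps back to simple strings.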

answered 2010-06-20T20:01:02.743