algorithm - Algorithm to list all unique permutations of numbers containing duplicates

Question

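The problem is: given a collection of numbers that may contain duplicates, return all unique permutations.

The naive approach is to use a set (in C++) to hold the permutations. This takes O(n! × log(n!)) time. Is there a better solution?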
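For reference, the naive approach described above might look like the following sketch, written here in Python rather than C++. The function name `unique_permutations_naive` is made up for illustration. Python's `set` is hash-based, so the log(n!) factor of an ordered C++ set does not apply, but all n! orderings are still generated before duplicates are discarded.

```python
from itertools import permutations

def unique_permutations_naive(nums):
    # Generate all n! orderings and let a set discard the duplicates.
    # The final sort is only there to make the output order predictable.
    return sorted(set(permutations(nums)))

print(unique_permutations_naive([1, 2, 2]))
# [(1, 2, 2), (2, 1, 2), (2, 2, 1)]
```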

score 5 · Accepted Answer

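The simplest approach is as follows:

1. Sort the list: O(n lg n)
2. The sorted list is the first permutation
3. Repeatedly generate the "next" permutation from the previous one: O(n! * <complexity of finding next permutation>)

Step 3 can be accomplished by defining the next permutation as the one that would appear directly after the current permutation if the list of permutations were sorted, e.g.: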

1, 2, 2, 3
1, 2, 3, 2
1, 3, 2, 2
2, 1, 2, 3
2, 1, 3, 2
2, 2, 1, 3
...

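Finding the next lexicographic permutation is O(n), and a simple description is given on the Wikipedia page for permutations under the heading "Generation in lexicographic order". If you are feeling ambitious, you can generate the next permutation in O(1) using plain changes.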
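The three steps can be sketched in Python as below. This is a minimal illustration of the standard lexicographic-successor procedure, not code from the answer; the names `next_permutation` and `unique_permutations` are chosen here for the example. Because the comparisons use `>=` and `<=`, equal elements are never swapped past each other, so each distinct permutation appears exactly once.

```python
def next_permutation(a):
    """Rearrange list a in place into its lexicographic successor.

    Returns False (leaving a in ascending order) if a was already the
    last permutation, True otherwise.
    """
    # 1. Find the rightmost index i with a[i] < a[i + 1].
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        a.reverse()
        return False
    # 2. Find the rightmost index j > i with a[j] > a[i], then swap.
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    # 3. Reverse the (descending) suffix after i to make it ascending.
    a[i + 1:] = reversed(a[i + 1:])
    return True

def unique_permutations(nums):
    a = sorted(nums)            # step 1: sort
    result = [a[:]]             # step 2: the sorted list comes first
    while next_permutation(a):  # step 3: keep taking the successor
        result.append(a[:])
    return result

for p in unique_permutations([1, 2, 2, 3])[:6]:
    print(p)
```

The first six lists printed match the sequence listed above, starting at 1, 2, 2, 3 and ending at 2, 2, 1, 3.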

score 2 · Accepted Answer

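1) Some variation on backtracking/recursive search will usually solve this sort of problem. Given a function that returns a list of all permutations of (n-1) objects, generate a list of all permutations of n objects as follows: for each element in the list, insert the nth object in all possible positions, checking for duplicates. This isn't especially efficient, but it often produces straightforward code for this sort of problem.

2) See Wikipedia: http://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order

3) Academics have spent a lot of time on the details of this. See section 7.2.1.2 of Knuth Vol 4A - this is a large hardback book with the following brief table of contents on Amazon:

Chapter 7: Combinatorial Searching 1

7.1: Zeros and Ones 47

7.2: Generating All Possibilities 281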
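The recursive insertion idea from point 1 could be sketched in Python as follows. This is an illustrative assumption about how that description maps to code, with a made-up function name; it requires the items to be hashable so that a set can perform the duplicate check.

```python
def unique_permutations_by_insertion(items):
    # Base case: the empty list has exactly one permutation.
    if not items:
        return [[]]
    *rest, last = items
    result = []
    seen = set()  # the "checking for duplicates" step
    # Take every permutation of the first n-1 objects and insert the
    # nth object at every possible position.
    for perm in unique_permutations_by_insertion(rest):
        for i in range(len(perm) + 1):
            candidate = perm[:i] + [last] + perm[i:]
            key = tuple(candidate)
            if key not in seen:
                seen.add(key)
                result.append(candidate)
    return result

print(sorted(unique_permutations_by_insertion([1, 2, 2])))
# [[1, 2, 2], [2, 1, 2], [2, 2, 1]]
```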

score 1 · Accepted Answer

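You should read my blog post on this kind of permutation (among other things) for more background, and follow some of the links there.

Here is a version of my lexicographic permutations generator, fashioned after the generation order of Steinhaus-Johnson-Trotter permutation generators, that does as requested: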

def l_perm3(items):
    '''Generator yielding Lexicographic permutations of a list of items'''
    if not items:
        yield []
    else:
        dir = 1
        new_items = []
        this = [items.pop()]
        for item in l_perm3(items):
            lenitem = len(item)
            try:
                # Never insert 'this' above any other 'this' in the item 
                maxinsert = item.index(this[0])
            except ValueError:
                maxinsert = lenitem
            if dir == 1:
                # step down
                for new_item in [item[:i] + this + item[i:] 
                                 for i in range(lenitem, -1, -1)
                                 if i <= maxinsert]:
                    yield new_item                    
            else:    
                # step up
                for new_item in [item[:i] + this + item[i:] 
                                 for i in range(lenitem + 1)
                                 if i <= maxinsert]:
                    yield new_item                    
            dir *= -1

from math import factorial
def l_perm_length(items):
    '''\
    Returns the len of sequence of lexicographic perms of items. 
    Each item of items must itself be hashable'''
    counts = [items.count(item) for item in set(items)]
    ans = factorial(len(items))
    for c in counts:
        ans //= factorial(c)  # integer division keeps the count an int
    return ans

if __name__ == '__main__':
    n = [0, 1, 2, 2, 2]
    print('\nLexicographic Permutations of %i items: %r' % (len(n), n))
    for i, x in enumerate(l_perm3(n[:])):
        print('%3i %r' % (i, x))
    assert i+1 == l_perm_length(n), 'Generated number of permutations is wrong'

The output of the above program is, for example, as follows:

Lexicographic Permutations of 5 items: [0, 1, 2, 2, 2]
  0 [0, 1, 2, 2, 2]
  1 [0, 2, 1, 2, 2]
  2 [2, 0, 1, 2, 2]
  3 [2, 0, 2, 1, 2]
  4 [0, 2, 2, 1, 2]
  5 [2, 2, 0, 1, 2]
  6 [2, 2, 0, 2, 1]
  7 [0, 2, 2, 2, 1]
  8 [2, 0, 2, 2, 1]
  9 [2, 2, 2, 0, 1]
 10 [2, 2, 2, 1, 0]
 11 [2, 1, 2, 2, 0]
 12 [1, 2, 2, 2, 0]
 13 [2, 2, 1, 2, 0]
 14 [2, 2, 1, 0, 2]
 15 [1, 2, 2, 0, 2]
 16 [2, 1, 2, 0, 2]
 17 [2, 1, 0, 2, 2]
 18 [1, 2, 0, 2, 2]
 19 [1, 0, 2, 2, 2]

score 0 · Accepted Answer

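Here is one I invented after thinking about how I write out permutations by hand and putting that method into code; it is shorter and better: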

def incv(prefix, v):
  # Return every distinct permutation of v, each prepended with prefix.
  perms = []
  done = set()  # values already placed at this position
  if v:
    for x in range(len(v)):
      if v[x] not in done:
        done.add(v[x])
        perms = perms + incv(prefix + v[x:x+1], v[:x] + v[x+1:])
  else:
    perms.append(''.join(prefix))
  return perms

def test(test_string, lex_ord=False):
  if lex_ord:
    test_string = sorted(test_string)
  p = incv([], list(test_string))
  if lex_ord:
    try_p = sorted(p)
    print("Sort methods equal ?", try_p == p)
  print('All', ','.join(p), "\n", test_string, "gave", len(p), "permutations")

if __name__ == '__main__':
  import sys
  test(sys.argv[1], bool(sys.argv[2] if len(sys.argv) > 2 else False))

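Notes

incv increments the permutation vector to find all of the permutations. It also correctly handles repeated letters.
test prints out all permutations of the test string and their count. It also makes sure that, if you request lexicographic ordering, the sort-before and sort-after methods give the same result. This should be True, since the original string is ordered and the increment-permutation function transforms the string into the next lexicographic string for the given alphabet.

This script can be run at the command prompt by: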

python script.py [test_string] [optional anything to use lexicographic ordering]

score 0 · Accepted Answer

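Recursive version. This generates the n!/(k_1! * k_2! * ... * k_m!) distinct permutations (m distinct characters, where the i-th character occurs k_i times):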

#include<iostream>
#include<cstring>

using namespace std;

const int MAX_CHARS_STRING=100;
int CURRENT_CHARS=0;
char STR[MAX_CHARS_STRING];

void myswap(int i, int j){
    char c=STR[i];STR[i]=STR[j];STR[j]=c;
}

bool IstobeExecuted(int start,int position){
    if(start==position)
        return true;
    for(int i=position-1;i>=start;i--){
        if(STR[i]==STR[position])
            return false;
    }
    return true;
}

void Permute(int start, int end,int& perm_no){
    if(end-start<=1){
        if(STR[end]==STR[start]){
            cout<<perm_no++<<") "<<STR<<endl;
            return;
        }
        cout<<perm_no++<<") "<<STR<<endl;
        myswap(start, end);
        cout<<perm_no++<<") "<<STR<<endl;
        myswap(start,end);
        return;
    }
    for(int i=start; i<=end;i++){
        if(!IstobeExecuted(start,i)){
            continue;
        }
        myswap(start,i);
        Permute(start+1,end,perm_no);
        myswap(start,i);
    }
}


int main(){
    cin>>STR;int num=1;
    Permute(0,strlen(STR)-1,num);
    return 0;
}

Hope this helps.

score 0 · Accepted Answer

A simple and short C++ implementation of @verdesmarald's solution:

#include <algorithm>
#include <vector>

using std::vector;

vector<vector<int>> permuteUnique(vector<int>& nums) {

    vector<vector<int>> res;
    const auto begin = nums.begin();
    const auto end = nums.end();
    std::sort(begin, end);

    do
    {
        res.push_back(nums);
    } 
    while (std::next_permutation(begin, end));

    return res;
}

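I think the time complexity is n*log(n) + m * ComplexityOf(next_permutation), where n is the total number of elements, m is the number of unique permutations (the loop body runs once per permutation), and the complexity of next_permutation is O(1) amortized. Or so they say: The amortized complexity of std::next_permutation?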

score 0 · Accepted Answer

I slightly improved Paddy3118's solution, so it is now non-recursive, lazily evaluated (completely generator-based), and roughly 30% faster.

def _handle_item(xs, d, t):
    l = len(xs)

    try:
        m = xs.index(t)
    except ValueError:
        m = l

    if d:
        g = range(l, -1, -1)
    else:
        g = range(l + 1)

    q = [t]
    for i in g:
        if i <= m:
            yield xs[:i] + q + xs[i:]

def _chain(xs, t):
    d = True

    for x in xs:
        yield from _handle_item(x, d, t)

        d = not d

def permutate(items):
    xs = [[]]

    for t in items:
        xs = _chain(xs, t)

    yield from xs

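P.S. I noticed that Paddy3118 had also made his implementation use generators, while I had been working against the implementation in the blog post, which is more memory-intensive. I'm posting this anyway, as this version might be considered cleaner.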
