1

I have four values:

age = 23
gender = "M"
city = "Delhi"
religion = "Muslim"

I need these arranged into every combination of the values with nulls, for example:

23 * * *
23 M * *
23 M Delhi *
23 M Delhi Muslim
* M * *
* M Delhi *
* M Delhi Muslim
* * Delhi *
* * Delhi Muslim
* * * Muslim
* * * *

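I need the list ordered by ascending number of dimensions, so combinations with a single value should come at the top. I have around 30+ attributes, so I need an automated way to do this in Python.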

Any ideas?

4

3 Answers

5

How about the following:

In [20]: import itertools

In [21]: attrib = (23, "M", "Delhi", "Muslim")

In [25]: comb = list(itertools.product(*((a, None) for a in attrib)))

In [26]: comb
Out[26]: 
[(23, 'M', 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', None),
 (23, 'M', None, 'Muslim'),
 (23, 'M', None, None),
 (23, None, 'Delhi', 'Muslim'),
 (23, None, 'Delhi', None),
 (23, None, None, 'Muslim'),
 (23, None, None, None),
 (None, 'M', 'Delhi', 'Muslim'),
 (None, 'M', 'Delhi', None),
 (None, 'M', None, 'Muslim'),
 (None, 'M', None, None),
 (None, None, 'Delhi', 'Muslim'),
 (None, None, 'Delhi', None),
 (None, None, None, 'Muslim'),
 (None, None, None, None)]

Now, if I have understood your sorting requirement correctly, the following should do it:

In [27]: sorted(comb, key=lambda x:sum(v is not None for v in x))
Out[27]: 
[(None, None, None, None),
 (23, None, None, None),
 (None, 'M', None, None),
 (None, None, 'Delhi', None),
 (None, None, None, 'Muslim'),
 (23, 'M', None, None),
 (23, None, 'Delhi', None),
 (23, None, None, 'Muslim'),
 (None, 'M', 'Delhi', None),
 (None, 'M', None, 'Muslim'),
 (None, None, 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', None),
 (23, 'M', None, 'Muslim'),
 (23, None, 'Delhi', 'Muslim'),
 (None, 'M', 'Delhi', 'Muslim'),
 (23, 'M', 'Delhi', 'Muslim')]

I used None where you used *, but it is trivial to use the latter instead.
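Swapping in `*` could look like the sketch below. It assumes no real attribute value is itself the string `"*"`; the name `PLACEHOLDER` is introduced here only for illustration.

```python
import itertools

attrib = (23, "M", "Delhi", "Muslim")
PLACEHOLDER = "*"  # assumed marker for "no value", as in the question

# Pair every attribute with the placeholder, then take the Cartesian product.
comb = itertools.product(*((a, PLACEHOLDER) for a in attrib))

# Sort by the number of real (non-placeholder) values in each row.
# Caveat: this misclassifies an attribute whose real value is "*";
# use a unique sentinel object if that can happen in your data.
rows = sorted(comb, key=lambda row: sum(v != PLACEHOLDER for v in row))

# Print the first few rows in the question's space-separated layout.
for row in rows[:6]:
    print(" ".join(str(v) for v in row))
```

Because `sorted` is stable, rows with the same number of values keep the order in which `itertools.product` produced them.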

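Of course, with 30 attributes you are looking at about 1 billion combinations (2^30), so building the full list and then sorting it probably won't work. But then, what useful thing could you do with a billion entries anyway?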
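To put a number on that estimate: each attribute is either kept or replaced, so n attributes give 2^n rows. A quick check, assuming Python 3.8+ for `math.comb`:

```python
from math import comb

n = 30  # number of attributes, as in the question

# Each attribute is either kept or replaced by the placeholder.
total = 2 ** n
print(total)  # 1073741824

# Number of rows that keep exactly k real values.
for k in range(4):
    print(k, comb(n, k))
```

The per-size counts start small (1, 30, 435, 4060) and only blow up toward the middle sizes, so the first few "dimensions" remain cheap to enumerate even when the full set is not.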

answered 2012-12-04T17:15:05.707
4

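NPE's answer solves the problem by building the complete list of subsets in memory and then sorting it. This requires O(2^n) space and O(n^2 2^n) time. If that is unacceptable, here is a way to generate the subsets in O(n) space and O(n 2^n) time.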

from itertools import combinations

def subsets(s, placeholder=None):
    """
    Generate the subsets of `s` in order of size.
    Use `placeholder` for missing elements (default: None).
    """
    s = list(s)
    n = len(s)
    r = range(n)
    for i in range(n + 1):
        for c in combinations(r, i):
            result = [placeholder] * n
            for j in c:
                result[j] = s[j]
            yield result

>>> from pprint import pprint
>>> pprint(list(subsets([23, 'M', 'Delhi', 'Muslim'])))
[[None, None, None, None],
 [23, None, None, None],
 [None, 'M', None, None],
 [None, None, 'Delhi', None],
 [None, None, None, 'Muslim'],
 [23, 'M', None, None],
 [23, None, 'Delhi', None],
 [23, None, None, 'Muslim'],
 [None, 'M', 'Delhi', None],
 [None, 'M', None, 'Muslim'],
 [None, None, 'Delhi', 'Muslim'],
 [23, 'M', 'Delhi', None],
 [23, 'M', None, 'Muslim'],
 [23, None, 'Delhi', 'Muslim'],
 [None, 'M', 'Delhi', 'Muslim'],
 [23, 'M', 'Delhi', 'Muslim']]
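
Because `subsets` is a generator, you can pull only as many rows as you need, even with 30 attributes. A minimal sketch, repeating the function so that it runs on its own; the attribute names `attr0` to `attr29` are made up for illustration:

```python
from itertools import combinations, islice

def subsets(s, placeholder=None):
    """Generate the subsets of `s` in order of size (same as above)."""
    s = list(s)
    n = len(s)
    for i in range(n + 1):
        for c in combinations(range(n), i):
            result = [placeholder] * n
            for j in c:
                result[j] = s[j]
            yield result

# Hypothetical 30-attribute record; the names stand in for real data.
attrs = ["attr%d" % i for i in range(30)]

# Take the first 31 rows: the empty row plus the 30 single-value rows.
first = list(islice(subsets(attrs, placeholder="*"), 31))

print(len(first))                        # rows actually generated
print(sum(v != "*" for v in first[0]))   # real values in the first row
print(sum(v != "*" for v in first[-1]))  # real values in the last row
```

Only the rows consumed by `islice` are ever built, so the remaining combinations cost nothing.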
answered 2012-12-04T17:30:55.223
1

Look at itertools; it has a combinations function.

answered 2012-12-04T17:12:10.750