python - How to combine the logical NOT operator in lst.count((x, not y)) in Python

Question

I am trying to build a contingency table from a list of tuples. The list looks like this:

lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'), ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]

Given a tuple, say ('a', 'bag'), 4 things have to be worked out:

a = lst.count(('a', 'bag')), which is 3.

b is the count of all tuples where tuple[0] == 'a' and tuple[1] != 'bag', which is 2: ('a', 'cat'), ('a', 'pen').

When I try

lst.count(('a', not 'bag')) I get 0, although it should be 2. (attempt 1)

c is the count of all tuples where tuple[0] != 'a' and tuple[1] == 'bag'. In this case, ('that', 'bag'), ('this', 'bag'). But when I try

lst.count((not 'a', 'bag')) I get 0, although it should be 2. (attempt 2)

d is the count of all tuples where tuple[0] != 'a' and tuple[1] != 'bag', and it can easily be obtained as len(lst) - a - b - c.

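My question: is there a way to combine the logical NOT operator into lst.count((x, not y)) or lst.count((not x, y))? If not, could you suggest how to work out b and c without loops, because a complexity of 2(N*N) is quite expensive.

Many thanks for your help!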

score 1 · Accepted Answer

You cannot use not with count that way. If you write lst.count(('a', not 'bag')), then not 'bag' is evaluated first and yields False, so you are actually counting occurrences of ('a', False).
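A small sketch to make this visible, using the list from the question (the variable name key is only for illustration): the tuple is fully built before count ever sees it, and a non-empty string is truthy, so not 'bag' is simply False.

```python
lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'),
       ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]

# `not` is applied to the string itself before the tuple is created.
key = ('a', not 'bag')
print(key)             # ('a', False)

# count() compares each element with == against ('a', False); nothing matches.
print(lst.count(key))  # 0
```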

Instead, you can use sum with a condition that compares the first and second elements of each tuple:

>>> lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'), ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]
>>> lst.count(('a', 'bag'))
3
>>> sum(1 for a,b in lst if a == 'a' and b == 'bag')
3
>>> sum(1 for a,b in lst if a == 'a' and b != 'bag')
2
>>> sum(1 for a,b in lst if a != 'a' and b == 'bag')
2

score 1 · Accepted Answer

from collections import Counter, defaultdict

lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'), ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]
# counting edges in 2 directed graphs
dct_a = defaultdict(Counter)
dct_b = defaultdict(Counter)

for a, b in lst:
    # Key 0 of each inner Counter holds the total number of occurrences of
    # that element (dct_a: in first position, dct_b: in second position).
    dct_a[a][b] += 1
    dct_a[a][0] += 1

    dct_b[b][a] += 1
    dct_b[b][0] += 1

def compute_coocurrence(a, b):
    out = {}
    out['both_occur']  = dct_a[a][b]
    out['a_but_not_b'] = dct_a[a][0] - dct_a[a][b]
    out['b_but_not_a'] = dct_b[b][0] - dct_b[b][a]
    return out

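print(compute_coocurrence('a', 'bag'))

Python's collections module provides 2 nice data structures that help with your problem. This approach builds 2 dictionaries, keyed by the first and the second element of the tuples respectively. So dct_a['a'] holds the counts of every b that co-occurs with 'a'. I believe this amounts to an O(n) algorithm: one pass over the list to build the dictionaries, after which each lookup takes constant time on average.

{'both_occur': 3, 'a_but_not_b': 2, 'b_but_not_a': 2}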
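The same idea can be sketched with three flat Counter objects instead of nested dictionaries with a sentinel key. This is a variation, not the code from the answer: the names pair_counts, first_counts, second_counts and cooccurrence are assumptions chosen for illustration, and the 'neither' cell is an addition obtained by inclusion-exclusion.

```python
from collections import Counter

lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'),
       ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]

# Build the joint counts and the two marginal counts once, in O(n).
pair_counts = Counter(lst)
first_counts = Counter(a for a, _ in lst)
second_counts = Counter(b for _, b in lst)

def cooccurrence(a, b):
    """Look up all four cells for (a, b) in constant time on average."""
    both = pair_counts[(a, b)]
    return {
        'both_occur': both,
        'a_but_not_b': first_counts[a] - both,
        'b_but_not_a': second_counts[b] - both,
        # Inclusion-exclusion: total minus everything containing a or b.
        'neither': len(lst) - first_counts[a] - second_counts[b] + both,
    }

print(cooccurrence('a', 'bag'))
```

Because a Counter returns 0 for missing keys, querying a pair that never occurs does not raise an error.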

score 0 · Accepted Answer

You can define a function that computes all 4 combinations at once, like this:

>>> def my_count(iterable,a,b):
        both    = 0
        a_not_b = 0
        not_a_b = 0
        neither = 0 
        for x,y in iterable:
            if x == a and y == b:
                both += 1
            if x == a and y!= b:
                a_not_b += 1
            if x != a and y == b:
                not_a_b += 1
            if x!= a and y!= b:
                neither += 1
        return both, a_not_b, not_a_b, neither

>>> lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'), ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]
>>> my_count(lst,"a","bag")
(3, 2, 2, 2)
>>>

To make it more descriptive, you can add a namedtuple like this:

>>> from collections import namedtuple
>>> CountTuple = namedtuple("CountTuple","both a_not_b not_a_b neither")
>>> def my_count(iterable,a,b):
        #same as before 
        ...
        return CountTuple(both,a_not_b,not_a_b,neither)

>>> result = my_count(lst,"a","bag")
>>> result
CountTuple(both=3, a_not_b=2, not_a_b=2, neither=2)
>>> result.both
3
>>> result.a_not_b
2
>>> result.not_a_b
2
>>> result.neither
2
>>>
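As a more compact alternative to the four if statements, each pair can be mapped to a tuple of two booleans and tallied with a Counter. This is a sketch under the same inputs; count_cells is a hypothetical name, not part of the answer above.

```python
from collections import Counter

lst = [('a', 'bag'), ('a', 'bag'), ('a', 'bag'), ('a', 'cat'), ('a', 'pen'),
       ('that', 'house'), ('my', 'car'), ('that', 'bag'), ('this', 'bag')]

def count_cells(pairs, a, b):
    """Single pass: tally (x == a, y == b) for every pair."""
    cells = Counter((x == a, y == b) for x, y in pairs)
    # Order matches the function above: both, a_not_b, not_a_b, neither.
    return (cells[(True, True)], cells[(True, False)],
            cells[(False, True)], cells[(False, False)])

print(count_cells(lst, 'a', 'bag'))  # (3, 2, 2, 2)
```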
