python - Unexpected behavior in Python OrderedSet.issuperset()

Question

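I have two OrderedSets and I am trying to check whether one is a subset of the other, where both the elements and their order matter. However, the orderedset package gives me strange results.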

>>> import orderedset
>>> a = orderedset.OrderedSet([433, 316, 259])
>>> b = orderedset.OrderedSet([433, 316, 69])
>>> a.issuperset(b)
True

This makes no sense to me, because b contains 69, which is definitely not in a. Why is a a superset of b?

However, when I try this:

>>> c = orderedset.OrderedSet([1, 2, 3])
>>> d = orderedset.OrderedSet([1, 2, 4])
>>> c.issuperset(d)
False

This behavior seems inconsistent to me: why would the choice of values in the OrderedSet, [433, 316, 259] versus [1, 2, 3], affect issuperset()?

Maybe there is a better way to do this? I need to know whether the elements of b are contained in a in the same order. Meaning, if

a = OrderedSet([433, 316, 259])

I am looking for partial matches of that set that begin with the same starting value as a (433). This is what I want:

OrderedSet([433, 316, 259])
OrderedSet([433, 316])
OrderedSet([433])

And not:

OrderedSet([433, 259])
OrderedSet([316, 259])
OrderedSet([433, 316, 69])
OrderedSet([433, 259, 316])
OrderedSet([259, 433])
...

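Basically, in case this is confusing: I have an ordered set, and I am trying to find partial matches in terms of both the values and their order.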

score 4 · Accepted Answer

Presumably you are using this third-party module because Python has no built-in ordered set.

A quick look at the source code on GitHub shows that the issuperset function is implemented as

def issuperset(self, other):
    return other <= self

Now look at how the less-than-or-equal operator is defined for orderedset:

def __le__(self, other):
    if isinstance(other, _OrderedSet):
        return len(self) <= len(other) and list(self) <= list(other)

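So essentially, when two ordered sets are compared, they are first converted to lists, and the two lists are then compared with Python's built-in <=. Comparing two lists with <= works like a lexicographic string comparison: the elements are compared index by index, and the first pair that differs decides the result.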

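By their implementation, [433, 316, 259] is a superset of [433, 316, 69] (the lists agree at the first two indices, and at the last index 69 <= 259), whereas [433, 316, 259] would not be a superset of [433, 316, 260] (at the last index the second list has the larger value).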
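The effect can be reproduced with plain lists, without installing the package. This is only a sketch: list_style_le and set_style_le are hypothetical helper names, and the first one imitates the __le__ body quoted above rather than calling the library.

```python
def list_style_le(left, right):
    # Hypothetical helper: mirrors the quoted __le__ (length check plus list comparison).
    return len(left) <= len(right) and list(left) <= list(right)


def set_style_le(left, right):
    # Hypothetical helper: a real subset test, every element of left must be in right.
    return set(left) <= set(right)


a = [433, 316, 259]
b = [433, 316, 69]
c = [1, 2, 3]
d = [1, 2, 4]

print(list_style_le(b, a))  # True: first difference is at index 2, and 69 <= 259
print(list_style_le(d, c))  # False: first difference is at index 2, and 4 > 3
print(set_style_le(b, a))   # False: 69 is not an element of a
```

The first two results match what the question observed for a.issuperset(b) and c.issuperset(d).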

What they probably meant to write is

return len(self) <= len(other) and set(self) <= set(other)

which would use the definition of <= for built-in Python sets and correctly test for subsets and supersets.
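A set-based comparison ignores order, so on its own it would not give the prefix-style match described in the question. A minimal sketch of such a check follows, assuming the inputs are any iterables (plain lists stand in for OrderedSet here). The name is_prefix is a hypothetical helper, not part of the orderedset package.

```python
from itertools import islice


def is_prefix(candidate, full):
    # Hypothetical helper: True if candidate equals the first len(candidate)
    # elements of full, in the same order. An empty candidate counts as a prefix.
    candidate = list(candidate)
    return candidate == list(islice(full, len(candidate)))


a = [433, 316, 259]

print(is_prefix([433, 316, 259], a))  # True
print(is_prefix([433, 316], a))       # True
print(is_prefix([433], a))            # True
print(is_prefix([433, 259], a))       # False
print(is_prefix([433, 316, 69], a))   # False
print(is_prefix([259, 433], a))       # False
```

Since an OrderedSet iterates in insertion order, passing OrderedSet instances instead of lists should behave the same way, though that is an assumption about the package rather than something tested here.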
