python - Python - 使用 set.add() 时添加的元素在哪里

Question

我发现当我使用python的set结构的add函数时，元素似乎被添加到我无法弄清楚的位置。

>>> a=set([(0, 2)])
>>> a.add((0,4))
>>> a
set([(0, 2), (0, 4)])
>>> a.add((1,0))
>>> a
set([(1, 0), (0, 2), (0, 4)])
>>> a.add((2,5))
>>> a
set([(2, 5), (1, 0), (0, 2), (0, 4)])
>>> a.add((3,0))
>>> a
set([(3, 0), (2, 5), (1, 0), (0, 2), (0, 4)])
>>> a.add((1,6))
>>> a
set([(3, 0), (0, 2), (1, 6), (0, 4), (2, 5), (1, 0)])

可以看出，有时元素被添加在开头，而其他时候，元素被添加到结尾或中间。在最后一个示例中，现有元素也被重新排序。

知道插入是如何发生的吗？

score 11 · Accepted Answer

11

集合是无序的。元素在集合中的“位置”的概念是未定义的。

于 2012-09-02T20:30:32.147 回答

score 4 · Accepted Answer

4

python中的集合是无序的。顺序是任意的。

于 2012-09-02T20:30:49.373 回答

score 1 · Accepted Answer

集合使用与字典相同的哈希函数来添加元素。事实上，它们只是没有价值元素的 dict。

该视频可以帮助您更好地理解它。

如果您使用整数，则集合是有序的（按照“人类”的排序）：

>>> s=set()
>>> for e in range(10):
...    s.add(e)
...    print s
... 
set([0])
set([0, 1])
set([0, 1, 2])
set([0, 1, 2, 3])
set([0, 1, 2, 3, 4])
set([0, 1, 2, 3, 4, 5])
set([0, 1, 2, 3, 4, 5, 6])
set([0, 1, 2, 3, 4, 5, 6, 7])
set([0, 1, 2, 3, 4, 5, 6, 7, 8])
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

但是如果你使用一个元组，它们对人眼来说不是“有序的”：

>>> s=set()
>>> for t in ((i,i*i) for i in range(10)):
...    s.add(t)
...    print s
... 
set([(0, 0)])
set([(0, 0), (1, 1)])
set([(0, 0), (1, 1), (2, 4)])
set([(3, 9), (0, 0), (1, 1), (2, 4)])
set([(3, 9), (0, 0), (1, 1), (4, 16), (2, 4)])
set([(0, 0), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1), (8, 64)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (9, 81), (2, 4), (1, 1), (8, 64)])

现在在解释器中尝试这两行：

>>> dict.fromkeys(range(10),None)
{0: None, 1: None, 2: None, 3: None, 4: None, 5: None, 6: None, 7: None, 8: None, 9: None}
>>> dict.fromkeys(((i,i*i) for i in range(10)),None)
{(6, 36): None, (0, 0): None, (7, 49): None, (4, 16): None, (5, 25): None, (3, 9): None, (9, 81): None, (2, 4): None, (1, 1): None, (8, 64): None}

您可以看到生成的 dict 与设置示例的“顺序”相同。

虽然只有 int 键的 dicts 和 set可能是“有序的”，但从实际的角度来看，dicts 和 set 没有顺序。

如果您观看链接的视频，您就会明白为什么。

score 1 · Accepted Answer

元素根据其哈希值进入哈希表中的特定位置。对于具有相同最后 3 位的元素，会发生冲突并为其选择其他位置。哈希表在达到 2/3 满时立即展开以降低冲突率。看这个关于哈希表的视频。

>>> def bits(n):
    n+=2**32
    return bin(n)[-32:]

>>> bits(hash('a'))
'11100100000011011011000111100000' #last three bits are picked to determine the spot in hash table
>>> bits(hash('b'))
'11101011101011101101001101100011'

score 0 · Accepted Answer

正如其他答案所说：

集合应该表现得像没有顺序一样
将事物添加到集合的实际机制与字典相同；集合基本上是没有项目的键的字典
字典基于哈希表（更多信息：字典和哈希表之间的真正区别是什么？）

我认为看看幕后发生的事情可能很有用，所以我评论了 setadd()方法的实际源代码（抱歉滚动）。

def add(self, element):
    """Add an element to a set.

    This has no effect if the element is already present.
    """
    try:
        self._data[element] = True                             # self._data is the dictionary where all of the set elements are stored
    except TypeError:                                          # this try...except block catches the cases when element cannot be a dict key (that is, it isn't hashable)
        transform = getattr(element, "__as_immutable__", None) # the getattr() call is the same as trying to get element.__as_immutable__ and returning None if element doesn't have __as_immutable__
        if transform is None:                                     
            raise # re-raise the TypeError exception we caught # if we get to this line, then the element has no defined way to make it immutable (and thus hashable), so a TypeError is raised
        self._data[transform()] = True                         # so if we get here, transform() is the same as calling element.__as_immutable__() (which we know exists), and we can now add it to the self._data dict

如您所见， setadd(element)与 dict 相同add(element)，只是它会更努力地进行 hash element。

python - Python - 使用 set.add() 时添加的元素在哪里

5 回答 5

Related

Reference