
As you may have noticed before, CPython sometimes stores a single copy of identical immutable objects.

For example:

>>> a = "hello"
>>> b = "hello"
>>> a is b
True

>>> a, b = 7734, 7734
>>> a is b
True

It appears that the hashing into what I assume is a heap is performed after type inference:

>>> a, b = 7734, 07734
>>> a is b
False

>>> a, b = 7734, 017066
>>> a is b
True

Is there a way to introspect the interpreter and print out this presumed heap of immutable objects?


2 Answers


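No, interned objects are kept in a range of locations, and there is no single method that lists them all.

  • As you discovered, strings can be interned, and you can intern strings yourself with the intern() function (sys.intern() in Python 3).
  • Small integers between -5 and 256 are interned.
  • Tuples are reused; the empty tuple (()) is a singleton, and for each tuple length from 1 to 20, up to 2000 tuple objects are cached for recycling. (Only the tuple objects, not their contents.)
  • None is a singleton, as are Ellipsis, NotImplemented, True and False.
  • As of Python 3.3, instance __dict__ dictionaries can share keys to save memory.
  • The compiler can mark immutable (and, in some cases, mutable) source code literals as constants, store them with the bytecode, and reuse them each time the bytecode is run. This applies to strings, numbers, tuples, lists (if used with the in operator) and, as of Python 3.2, sets (again, when used with in).

There are probably more that I haven't discovered yet.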
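
A few of the caches listed above can be observed directly. The following is a minimal sketch, assuming Python 3 on CPython (so sys.intern() rather than the Python 2 built-in intern()); the variable names are only illustrative. The values are built at run time so that the compiler cannot share a single constant between them.

```python
import sys

# Small integers: parse them at run time so no compile-time constant is shared.
small_a, small_b = int("256"), int("256")
big_a, big_b = int("1000000"), int("1000000")
print(small_a is small_b)  # True on CPython: 256 is inside the small-int cache
print(big_a is big_b)      # False on CPython: two separate int objects

# Strings: equal strings built at run time are distinct objects
# until both are passed through sys.intern().
s1 = " ".join(["this", "string", "includes", "spaces"])
s2 = " ".join(["this", "string", "includes", "spaces"])
print(s1 is s2)                          # False
print(sys.intern(s1) is sys.intern(s2))  # True

# The empty tuple and the built-in singletons.
print(tuple([]) is tuple([]))  # True on CPython
print(type(None)() is None)    # True
print(bool(1) is True)         # True
```

None of this lists the caches themselves; it only shows their effect on object identity.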

These optimizations all help to avoid too much heap churn. Apart from None, Ellipsis, NotImplemented, True and False being singletons, they are all CPython-specific optimizations and are not part of the Python language definition itself.
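
The compiler constants mentioned in the last list item are one of the few caches that can be inspected per code object. This sketch assumes a recent CPython 3 and uses the co_consts attribute of a compiled code object; the expression itself is a made-up example. The list and set literals used with in should show up as a tuple and a frozenset constant.

```python
# Compile an expression without running it, then look at the constants
# CPython stored alongside its bytecode.
code = compile("x in [1, 2, 3] or x in {4, 5, 6}", "<demo>", "eval")
print(code.co_consts)

# The list literal is expected to be stored as a tuple constant and the
# set literal as a frozenset constant (a CPython compiler detail).
has_tuple = any(isinstance(c, tuple) for c in code.co_consts)
has_frozenset = any(isinstance(c, frozenset) for c in code.co_consts)
print(has_tuple, has_frozenset)
```

This only covers constants belonging to one code object; it is not a global registry of shared objects.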

answered 2013-09-04T19:44:01.983

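It's a little more complicated than you make it out to be. For instance, in your examples with large integers, the same object is not reused when the uses aren't part of the same expression. At the interactive prompt each statement is compiled separately, so the two literals below become two different objects.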

>>> a = 7734
>>> b = 7734
>>> a is b
False

On the other hand, as your first example shows, this does work with strings...but not all strings.

>>> a = "this string includes spaces"
>>> b = "this string includes spaces"
>>> a is b
False

The following objects are actually interned by default: small integers, the empty tuple, and strings that look like Python identifiers. What you're seeing with large integers and other immutable objects is an optimization due to the fact that they're being used in the same expression.
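
The effect of compiling literals together can be reproduced outside the interactive prompt. This is a rough sketch, assuming Python 3 on CPython; the namespace dictionaries and the "<demo>" filename are placeholders. When both assignments are compiled as one unit, the literal is stored once and both names refer to that one object.

```python
# Two assignments compiled together as a single unit.
together = compile("a = 7734\nb = 7734\n", "<demo>", "exec")
ns = {}
exec(together, ns)
print(together.co_consts.count(7734))  # 1: the literal is stored once
print(ns["a"] is ns["b"])              # True on CPython

# The same assignments compiled separately, as the interactive prompt does.
first, second = {}, {}
exec(compile("a = 7734", "<demo>", "exec"), first)
exec(compile("b = 7734", "<demo>", "exec"), second)
# Usually False, but this depends on the interpreter build, so do not rely on it.
print(first["a"] is second["b"])
```

In either case the values compare equal with ==, which is the comparison to use for numbers and strings.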

answered 2013-09-04T19:49:51.110