4

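I'm a bit confused about how Python allocates memory and collects garbage, and about how platform-specific this behavior is. For example, compare the following two snippets: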

Snippet A:

>>> id('x' * 10000000) == id('x' * 10000000)
True

Snippet B:

>>> x = "x"*10000000
>>> y = "x"*10000000
>>> id(x) == id(y)
False

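Snippet A returns True because, when Python allocates the memory, it happens to place both strings at the same location in the first test. In the second test it allocates them at different locations, which is why their ids differ.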

But apparently system performance or the platform affects this, because when I try it at a larger scale:

for i in xrange(1, 1000000000):
    if id('x' * i) != id('x' * i):
        print i
        break

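A friend on a Mac tried this and it ran all the way to the end. When I run it on a bunch of Linux VMs, it always breaks out of the loop, but at a different point on each VM. Is this because of how garbage collection is scheduled in Python? Is it because my Linux VMs are slower than the Mac, or does the Linux Python implementation collect garbage differently?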


3 Answers

6

The garbage collector just uses whatever space is convenient. There are lots of different garbage collection strategies, and things are also affected by parameters, different platforms, memory usage, phase of the moon, etc. Trying to guess how the interpreter will happen to allocate particular objects is just a waste of time.

answered 2012-10-30T19:45:04.380
5

This happens because Python caches small integers and strings:

Large strings: stored in variables, not cached:

In [32]: x = "x"*10000000

In [33]: y = "x"*10000000

In [34]: x is y
Out[34]: False

Large strings: not stored in variables, looks like caching:

In [35]: id('x' * 10000000) == id('x' * 10000000)
Out[35]: True
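
The `True` here is most likely address reuse rather than caching: the first temporary string is freed before the second one is created, so the allocator may hand back the same block. Here is a minimal sketch in Python 3 syntax, assuming CPython; `big_string` is a hypothetical helper used only to build the strings at run time. The result for the temporaries is not guaranteed and can vary between platforms and runs.

```python
def big_string(n):
    # Hypothetical helper: building the string at run time means the two
    # results are never merged into one constant by the compiler.
    return "x" * n

n = 10_000_000

# Both strings are alive at the same time, so they must be distinct objects
# with distinct ids, even though their values are equal.
x = big_string(n)
y = big_string(n)
equal_value = (x == y)
same_object = (x is y)

# Here each temporary is freed as soon as id() returns, so the second
# allocation may (or may not) land at the address the first one just vacated.
temporaries_share_id = id(big_string(n)) == id(big_string(n))

print(equal_value, same_object, temporaries_share_id)
```

An id is only unique among objects that are alive at the same time, so comparing the ids of two objects whose lifetimes do not overlap says nothing about caching.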

Small strings: cached

In [36]: x="abcd"

In [37]: y="abcd"

In [38]: x is y
Out[38]: True

Small integers: cached

In [39]: x=3

In [40]: y=3

In [41]: x is y
Out[41]: True

Large integers:

Stored in variables: not cached

In [49]: x=12345678

In [50]: y=12345678

In [51]: x is y
Out[51]: False

Not stored in variables: looks cached

In [52]: id(12345678)==id(12345678)
Out[52]: True
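
To separate real caching from effects of the compiler and the allocator, the checks can be repeated with values built at run time. This is a rough sketch in Python 3 syntax, assuming CPython; the cached range of small integers (-5 to 256) is an implementation detail, and the variable names are made up for illustration.

```python
import sys

# Values are built at run time with int() and str.join() so the compiler
# cannot merge identical literals into one shared constant.
small_a, small_b = int("100"), int("100")
large_a, large_b = int("12345678"), int("12345678")

small_ints_shared = small_a is small_b   # CPython preallocates -5..256
large_ints_shared = large_a is large_b   # two live objects, so distinct

# Strings interned explicitly are guaranteed to be one shared object.
s1 = sys.intern("".join(["ab", "cd"]))
s2 = sys.intern("".join(["ab", "cd"]))
interned_shared = s1 is s2

# Two identical literals in one compiled expression are stored as a single
# constant, which is why id(12345678) == id(12345678) compares an object
# with itself.
code = compile("id(12345678) == id(12345678)", "<demo>", "eval")
literal_copies = code.co_consts.count(12345678)

print(small_ints_shared, large_ints_shared, interned_shared, literal_copies)
```

Under these assumptions, only the small integers and the explicitly interned strings are genuinely shared; the unstored large integer looks cached because both literals refer to the same compiled constant.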
answered 2012-10-30T19:50:41.000
3

CPython uses two strategies for memory management:

  1. Reference Counting
  2. Generational cycle-detecting garbage collection (often loosely called mark-and-sweep)

Allocation is in general done via the platform's malloc/free functions and inherits the performance characteristics of the underlying runtime. Whether a freed block of memory is reused is decided by that allocator and the operating system, not by Python. (Some objects are pooled by the Python VM.)

Your example, however, does not trigger the 'real' GC algorithm (which is only used to collect cycles). Your long string is deallocated as soon as the last reference to it is dropped.
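
The difference between the two strategies can be observed with the cycle collector switched off. The following is a small sketch in Python 3 syntax, assuming CPython; `Node` is a hypothetical class introduced here because plain `str` objects cannot be weakly referenced.

```python
import gc
import weakref

class Node:
    """Hypothetical helper class that can hold a reference to another Node."""
    def __init__(self):
        self.other = None

gc.disable()  # turn off the cycle collector; reference counting still runs
try:
    # Case 1: no cycle. Dropping the last reference frees the object at once.
    solo = Node()
    solo_ref = weakref.ref(solo)
    del solo
    freed_by_refcount = solo_ref() is None

    # Case 2: a reference cycle. The two nodes keep each other alive, so
    # reference counting alone never frees them.
    a, b = Node(), Node()
    a.other, b.other = b, a
    cycle_ref = weakref.ref(a)
    del a, b
    cycle_survives_refcount = cycle_ref() is not None

    # An explicit collection finds the unreachable cycle and frees it.
    gc.collect()
    cycle_freed_by_gc = cycle_ref() is None
finally:
    gc.enable()

print(freed_by_refcount, cycle_survives_refcount, cycle_freed_by_gc)
```

The long strings in the question behave like case 1: no cycle is involved, so they are released immediately and the collector's schedule plays no part.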

answered 2012-10-30T19:50:16.400