Some discussion in another question has encouraged me to better understand the cases where locking is required in multithreaded Python programs.

Per this article on threading in Python, I have several solid, testable examples of pitfalls that can occur when multiple threads access shared state. The example race condition provided on this page involves races between threads reading and manipulating a shared variable stored in a dictionary. I think the case for a race here is very obvious, and fortunately is eminently testable.

However, I have been unable to provoke a race condition with supposedly atomic operations such as list appends or variable increments. This test tries hard to demonstrate such a race:

from threading import Thread, Lock
import operator

def contains_all_ints(l, n):
    l.sort()
    for i in xrange(0, n):
        if l[i] != i:
            return False
    return True

def test(ntests):
    results = []
    threads = []
    def lockless_append(i):
        results.append(i)
    for i in xrange(0, ntests):
        threads.append(Thread(target=lockless_append, args=(i,)))
        threads[i].start()
    for i in xrange(0, ntests):
        threads[i].join()
    if len(results) != ntests or not contains_all_ints(results, ntests):
        return False
    else:
        return True

for i in range(0,100):
    if test(100000):
        print "OK", i
    else:
        print "appending to a list without locks *is* unsafe"
        exit()

I have run the test above without failure (100 runs of 100k multithreaded appends). Can anyone get it to fail? Is there another class of object that can be made to misbehave via atomic, incremental modification by threads?

Do these implicitly 'atomic' semantics apply to other operations in Python? Is this directly related to the GIL?

2 Answers

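Appending to a list is thread-safe, yes. You can only append to a list while holding the GIL, and the list takes care not to release the GIL during the append operation (which is, after all, a fairly simple operation). The order in which different threads' appends go through is of course up for grabs, but they will all be strictly serialized, because the GIL is never released during an append.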
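As a small sketch of that claim (written for Python 3, unlike the Python 2 test in the question; the function names and counts are made up for illustration), several threads append to one shared list without a lock. Under CPython every item should arrive exactly once, and only the order is left open:

```python
import threading

def append_range(shared, start, count):
    # No lock here: this relies on CPython serializing each append.
    for value in range(start, start + count):
        shared.append(value)

def run_appends(n_threads=8, per_thread=10000):
    shared = []
    threads = [
        threading.Thread(target=append_range,
                         args=(shared, t * per_thread, per_thread))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared

if __name__ == "__main__":
    result = run_appends()
    # Arrival order depends on scheduling, so compare sorted contents.
    print(sorted(result) == list(range(8 * 10000)))
```

Comparing the sorted contents rather than the raw list reflects the point that ordering across threads is not guaranteed.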

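The same is not necessarily true for other operations. Lots of operations in Python can cause arbitrary Python code to be executed, which in turn can cause the GIL to be released. For example, i += 1 is three distinct operations: "get i", "add 1 to it", and "store it in i". "Add 1 to it" would translate (in this case) into it.__iadd__(1), which can go off and do whatever it likes.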
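To make the three steps visible, here is a minimal sketch (Python 3; `opnames`, `run_locked`, and the counter names are hypothetical helpers, not anything from the thread) that lists the bytecode behind an increment and shows the usual fix of wrapping the read-modify-write in a lock:

```python
import dis
import threading

counter = 0
counter_lock = threading.Lock()

def unsafe_increment():
    global counter
    counter += 1  # compiled to a separate load, add, and store

def safe_increment():
    global counter
    with counter_lock:  # the lock turns the read-modify-write into one unit
        counter += 1

def opnames(func):
    # Names of the bytecode instructions CPython generated for func.
    return [ins.opname for ins in dis.get_instructions(func)]

def run_locked(n_threads=4, n_iter=10000):
    global counter
    counter = 0

    def worker():
        for _ in range(n_iter):
            safe_increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(opnames(unsafe_increment))  # exact names vary by CPython version
    print(run_locked())
```

The exact instruction names differ between CPython releases, but the load and the store show up as separate instructions, which is where another thread can slip in.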

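Python objects themselves guard their own internal state: dicts won't get corrupted by two different threads trying to set items in them. But if the data in the dict is supposed to be internally consistent, neither the dict nor the GIL does anything to protect that, except (in the usual thread fashion) by making it less likely, but still possible, that things end up different from what you expected.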
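A sketch of what "internally consistent" can mean in practice (Python 3; the `accounts` dict, its invariant, and the helper names are invented for illustration). Each individual dict write is safe on its own, but the two writes together need a lock if other threads must never see them half done:

```python
import threading

accounts = {"a": 100, "b": 0}  # hypothetical invariant: a + b == 100
accounts_lock = threading.Lock()

def transfer(amount):
    # The dict itself cannot be corrupted, but without the lock another
    # thread could run between these two updates and see a broken total.
    with accounts_lock:
        accounts["a"] -= amount
        accounts["b"] += amount

def total():
    with accounts_lock:
        return accounts["a"] + accounts["b"]

def run_transfers(n_threads=4, n_iter=25):
    with accounts_lock:
        accounts.update(a=100, b=0)  # reset to the starting state

    def worker():
        for _ in range(n_iter):
            transfer(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total()

if __name__ == "__main__":
    print(run_transfers(), accounts)
```

Readers take the same lock as writers here, since a reader that skips it could still observe the state between the two updates.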

answered 2010-04-29T20:17:03.720

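In CPython, a thread switch happens once sys.getcheckinterval() bytecodes have been executed. A context switch can therefore never occur in the middle of a single bytecode, and operations that are encoded as a single bytecode are inherently atomic and thread-safe, unless that bytecode executes other Python code or calls C code that releases the GIL. Most operations on the built-in collection types (dict, list, etc.) fall into the "inherently thread-safe" category.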
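The check interval described here dates from Python 2. As far as I know, Python 3.2 and later switch threads on a time basis instead, exposed through sys.getswitchinterval() and sys.setswitchinterval(). The sketch below (Python 3; `stress_unlocked_counter` and its parameters are hypothetical) shrinks that interval so that an unlocked read-modify-write is more likely to interleave. Whether updates are actually lost depends on the interpreter version and the machine, so it reports the count rather than promising a failure:

```python
import sys
import threading

def stress_unlocked_counter(n_threads=4, n_iter=50000, interval=1e-6):
    """Hypothetical helper: unlocked increments under a tiny switch interval."""
    state = {"count": 0}

    def worker():
        for _ in range(n_iter):
            # Read, add, and store are separate steps; no lock protects them.
            state["count"] = state["count"] + 1

    old_interval = sys.getswitchinterval()
    sys.setswitchinterval(interval)  # ask CPython to switch threads more often
    try:
        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.setswitchinterval(old_interval)  # always restore the old value
    return state["count"], n_threads * n_iter

if __name__ == "__main__":
    got, expected = stress_unlocked_counter()
    print("lost updates:", expected - got)
```

A result of zero lost updates does not show the code is safe; it only shows that no interleaving happened to hit the gap during that run.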

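However, this is an implementation detail specific to the C implementation of Python and should not be relied upon. Other implementations of Python (Jython, IronPython, PyPy, etc.) may behave differently. There is also no guarantee that future versions of CPython will keep this behaviour.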
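One way to avoid leaning on that implementation detail is to collect results through queue.Queue, which the standard library documents as safe for use from multiple threads. A minimal sketch, assuming Python 3 and a made-up helper name:

```python
import queue
import threading

def collect_with_queue(n_threads=8, per_thread=1000):
    results = queue.Queue()  # does its own locking internally

    def worker(start):
        for value in range(start, start + per_thread):
            results.put(value)

    threads = [
        threading.Thread(target=worker, args=(t * per_thread,))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    items = []
    # Draining with empty() is fine here because every producer has finished.
    while not results.empty():
        items.append(results.get())
    return items

if __name__ == "__main__":
    print(len(collect_with_queue()))
```

An explicit threading.Lock around a plain list append would serve the same purpose; the queue just packages the locking.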

answered 2010-04-29T20:34:41.420