5

my code is

import threading

counter = 0

def worker():
    global counter
    counter += 1

if __name__ == "__main__":
    threads = []
    for i in range(1000):
        t = threading.Thread(target = worker)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

    print counter

because I don't use lock to protect the shared resource,i.e. counter variable, I expect that the result is a number less than 1000, but the counter is always 1000, I don't know why. Does counter += 1 is a atomic operation in Python?

what operations in Python are atomic using GIL?

4

1 回答 1

8

Don't count on x += 1 being thread-safe. Here is an example where it does not work (see Josiah Carlson's comment):

import threading
x = 0
def foo():
    global x
    for i in xrange(1000000):
        x += 1
threads = [threading.Thread(target=foo), threading.Thread(target=foo)]
for t in threads:
    t.daemon = True
    t.start()
for t in threads:
    t.join()
print(x)

If you disassemble foo:

In [80]: import dis

In [81]: dis.dis(foo)
  4           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_CONST               1 (1000000)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  5          19 LOAD_GLOBAL              1 (x)
             22 LOAD_CONST               2 (1)
             25 INPLACE_ADD         
             26 STORE_GLOBAL             1 (x)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE        

You see that there is a LOAD_GLOBAL to retrieve the value of x, there is an INPLACE_ADD, and then a STORE_GLOBAL.

If both threads LOAD_GLOBAL in succession, then they might both load the same value of x. Then they both increment to the same number, and store the same number. So the work of one thread overwrites the work of the other. This is not thread-safe.

As you can see, the final value of x would be 2000000 if the program were thread-safe, but instead you almost always get a number less than 2000000.


If you add a lock, you get the "expected" answer:

import threading
lock = threading.Lock()
x = 0
def foo():
    global x
    for i in xrange(1000000):
        with lock:
            x += 1
threads = [threading.Thread(target=foo), threading.Thread(target=foo)]
for t in threads:
    t.daemon = True
    t.start()
for t in threads:
    t.join()
print(x)

yields

2000000

I think the reason why the code you posted does not exhibit a problem:

for i in range(1000):
    t = threading.Thread(target = worker)
    threads.append(t)
    t.start()

is because your workers complete so darn quickly compared to the time it takes to spawn a new thread that in practice there is no competition between threads. In Josiah Carlson's example above, each thread spends a significant amount of time in foo which increases the chance of thread collision.

于 2013-02-25T02:01:12.310 回答