My understanding was that concurrent.futures relies on pickling the arguments in order to run a function in a different process (or thread). Shouldn't pickling create a copy of the argument? On Linux it does not seem to do so, i.e., I have to pass a copy explicitly.
I'm trying to make sense of the following results:
<0> rands before submission: [17, 72, 97, 8, 32, 15, 63, 97, 57, 60]
<1> rands before submission: [97, 15, 97, 32, 60, 17, 57, 72, 8, 63]
<2> rands before submission: [15, 57, 63, 17, 97, 97, 8, 32, 60, 72]
<3> rands before submission: [32, 97, 63, 72, 17, 57, 97, 8, 15, 60]
in function 0 [97, 15, 97, 32, 60, 17, 57, 72, 8, 63]
in function 1 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]
in function 2 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]
in function 3 [97, 32, 17, 15, 57, 97, 63, 72, 60, 8]
Here's the code:
from __future__ import print_function
import time
import random
try:
    from concurrent import futures
except ImportError:
    # Python 2 backport of concurrent.futures
    import futures


def work_with_rands(i, rands):
    print('in function', i, rands)


def main():
    random.seed(1)
    rands = [random.randrange(100) for _ in range(10)]

    # sequence 1 and sequence 2 should give the same results but they don't
    # only difference is that sequence 2 submits a copy of rands (i.e., rands[:])

    # sequence 1: submit the list itself, then shuffle it in place
    with futures.ProcessPoolExecutor() as ex:
        for i in range(4):
            print("<{}> rands before submission: {}".format(i, rands))
            ex.submit(work_with_rands, i, rands)
            random.shuffle(rands)

    print('-' * 30)

    random.seed(1)
    rands = [random.randrange(100) for _ in range(10)]

    # sequence 2: submit a shallow copy, then shuffle the original
    print("initial sequence: ", rands)
    with futures.ProcessPoolExecutor() as ex:
        for i in range(4):
            print("<{}> rands before submission: {}".format(i, rands))
            ex.submit(work_with_rands, i, rands[:])
            random.shuffle(rands)


if __name__ == "__main__":
    main()
Where on earth is [97, 32, 17, 15, 57, 97, 63, 72, 60, 8] coming from? That's not even one of the sequences passed to submit.
The results differ slightly under Python 2.
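To narrow down what I am asking, here is a minimal sketch of one possible explanation. It assumes, and I have not confirmed this, that submit only keeps a reference to the argument and that the pickling happens some time later, after the caller has already moved on. The helper name pickle_eager_vs_deferred is hypothetical and only mimics that timing with plain pickle calls; it does not use concurrent.futures at all.

```python
import pickle


def pickle_eager_vs_deferred(values):
    """Compare pickling at call time with pickling after a later mutation.

    Hypothetical helper: it imitates an executor that stores a reference
    to its argument and only pickles it afterwards (an assumption).
    """
    data = list(values)

    # Eager: the bytes are produced now, so they capture the current order.
    eager_bytes = pickle.dumps(data)

    # Deferred: only a reference is stored; nothing is copied yet.
    pending = data

    # The caller mutates the list in place, as random.shuffle does.
    data.reverse()

    # Pickling now sees the mutated list, not the one that was "submitted".
    deferred_bytes = pickle.dumps(pending)

    return pickle.loads(eager_bytes), pickle.loads(deferred_bytes)


if __name__ == "__main__":
    eager, deferred = pickle_eager_vs_deferred([17, 72, 97, 8])
    print("eager:   ", eager)
    print("deferred:", deferred)
```

If that assumption holds, the list each worker receives would depend on how many shuffles had already run by the time the argument was actually pickled, which would also explain why passing rands[:] behaves as expected.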