I'm learning the Python multiprocessing module and I've found this example (this is a slightly modified version):
#!/bin/env python
import multiprocessing as mp
import random
import string
import time

# Define an output queue
output = mp.Queue()

# Define an example function
def rand_string(length, output):
    """ Generates a random string of numbers, lower- and uppercase chars. """
    time.sleep(1)
    rand_str = ''.join(random.choice(
                        string.ascii_lowercase
                        + string.ascii_uppercase
                        + string.digits)
                   for i in range(length))
    result = (len(rand_str), rand_str)
    print(result)
    time.sleep(1)
    output.put(result)

def queue_size(queue):
    size = int(queue.qsize())
    print(size)

# Set up a list of processes that we want to run
processes = [mp.Process(target=rand_string, args=(x, output)) for x in range(1,10)]

# Run processes
for p in processes:
    p.start()

# Exit the completed processes
for p in processes:
    p.join()

# Get process results from the output queue
results = [output.get() for p in processes]
print(results)
The output is the following:
(3, 'amF')
(1, 'c')
(6, '714CUg')
(4, '10Qg')
(5, 'Yns6h')
(7, 'wsSXj3Z')
(9, 'KRcDTtVZA')
(2, 'Qy')
(8, '50LpMzG9')
[(3, 'amF'), (1, 'c'), (6, '714CUg'), (4, '10Qg'), (5, 'Yns6h'), (9, 'KRcDTtVZA'), (2, 'Qy'), (7, 'wsSXj3Z'), (8, '50LpMzG9')]
I understand that the processes do not run in the order in which they were created (using processes = [mp.Process(target=rand_string, args=(x, output)) for x in range(1,10)]); this is mentioned in the referenced article. What I do not understand (or am not sure I understand correctly) is why the order of results does not correspond to the order in which print writes each result to STDOUT. My understanding is that these three operations are not atomic (I mean that a process switch can happen between them):
print(result)
time.sleep(1)
output.put(result)
Basically, what happens here is that right after one process prints its result to STDOUT, execution switches to another process, which ends up writing to results first. Something like this:
Time
------------------------------------------------------------------------------------>
Process1: print result |              |               | time.sleep(1)      | output.put(result)
Process2:              | print result | time.sleep(1) | output.put(result) |
In this case the output on STDOUT would be:
(1, 'c')
(2, 's5')
But the actual content of results will be:
[(2, 's5'), (1, 'c')]
And for the same reason, the processes are not started in the order in which they were created.
Am I right?
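For what it's worth, here is a minimal sketch of how I think this hypothesis could be checked. It is not part of the original example: the names worker and run_demo and the delay values are made up for illustration, it is written for Python 3, and it assumes a Unix-like system where the "fork" start method is available. Process "A" prints first but waits longer before calling put(), so if my understanding is correct, the print order and the queue order should differ.

```python
import multiprocessing as mp
import time


def worker(name, delay_before_put, output):
    """Print immediately, then wait before putting the result on the queue."""
    printed_at = time.time()  # approximate moment of the print
    print(name)
    time.sleep(delay_before_put)
    output.put((name, printed_at))


def run_demo():
    # Assumption: a Unix-like OS where the "fork" start method exists.
    ctx = mp.get_context("fork")
    output = ctx.Queue()

    # "A" prints first but sleeps longer before put();
    # "B" prints later but reaches put() sooner.
    a = ctx.Process(target=worker, args=("A", 1.0, output))
    b = ctx.Process(target=worker, args=("B", 0.1, output))

    a.start()
    time.sleep(0.3)  # give "A" time to print before "B" starts
    b.start()

    # Read the queue before joining the processes.
    results = [output.get() for _ in range(2)]
    a.join()
    b.join()

    queue_order = [name for name, _ in results]
    print_order = [name for name, _ in sorted(results, key=lambda r: r[1])]
    return print_order, queue_order


if __name__ == "__main__":
    print_order, queue_order = run_demo()
    print("print order:", print_order)
    print("queue order:", queue_order)
```

The delays are deliberately large so that the interleaving is forced rather than left to the scheduler. In the original example all processes sleep for the same time, so which one reaches put() first is up to the operating system.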