Why doesn't Python's math.factorial play well with threads?
Why does math.factorial act so strangely in a thread?
Here's an example. It creates three threads:
- a thread that just sleeps for a while
- a thread that increments an int for a while
- a thread that calls math.factorial on a large number.
It calls start() on each thread, then join() with a timeout.
The sleep and spin threads work as expected: they return from start() immediately and then sit in join() for the timeout.
The factorial thread, on the other hand, does not return from start() until it has run to the end!
    import sys
    from threading import Thread
    from time import sleep, time
    from math import factorial

    # Helper class that stores a start time to compare to
    class timed_thread(Thread):
        def __init__(self, time_start):
            Thread.__init__(self)
            self.time_start = time_start

    # Thread that just executes sleep()
    class sleep_thread(timed_thread):
        def run(self):
            sleep(15)
            print "st DONE:\t%f" % (time() - time_start)

    # Thread that increments a number for a while
    class spin_thread(timed_thread):
        def run(self):
            x = 1
            while x < 120000000:
                x += 1
            print "sp DONE:\t%f" % (time() - time_start)

    # Thread that calls math.factorial with a large number
    class factorial_thread(timed_thread):
        def run(self):
            factorial(50000)
            print "ft DONE:\t%f" % (time() - time_start)

    # the tests
    print
    print "sleep_thread test"
    time_start = time()
    st = sleep_thread(time_start)
    st.start()
    print "st.start:\t%f" % (time() - time_start)
    st.join(2)
    print "st.join:\t%f" % (time() - time_start)
    print "sleep alive:\t%r" % st.isAlive()

    print
    print "spin_thread test"
    time_start = time()
    sp = spin_thread(time_start)
    sp.start()
    print "sp.start:\t%f" % (time() - time_start)
    sp.join(2)
    print "sp.join:\t%f" % (time() - time_start)
    print "sp alive:\t%r" % sp.isAlive()

    print
    print "factorial_thread test"
    time_start = time()
    ft = factorial_thread(time_start)
    ft.start()
    print "ft.start:\t%f" % (time() - time_start)
    ft.join(2)
    print "ft.join:\t%f" % (time() - time_start)
    print "ft alive:\t%r" % ft.isAlive()
And here is the output in Python 2.6.5 on CentOS x64:
    sleep_thread test
    st.start:       0.000675
    st.join:        2.006963
    sleep alive:    True

    spin_thread test
    sp.start:       0.000595
    sp.join:        2.010066
    sp alive:       True

    factorial_thread test
    ft DONE:        4.475453
    ft.start:       4.475589
    ft.join:        4.475615
    ft alive:       False
    st DONE:        10.994519
    sp DONE:        12.054668
I tried this with Python 2.6.5 on CentOS x64 and with 2.7.2 on Windows x86. On both, the factorial thread does not return from start() until the thread has finished executing.
I also tried this with PyPy 1.8.0 on Windows x86, and the result is slightly different. start() returns immediately, but then join() does not time out!
    sleep_thread test
    st.start:       0.001000
    st.join:        2.001000
    sleep alive:    True

    spin_thread test
    sp.start:       0.000000
    sp DONE:        0.197000
    sp.join:        0.236000
    sp alive:       False

    factorial_thread test
    ft.start:       0.032000
    ft DONE:        9.011000
    ft.join:        9.012000
    ft alive:       False
    st DONE:        12.763000
I tried IronPython 2.7.1 too, and it produces the expected output.
    sleep_thread test
    st.start:       0.023003
    st.join:        2.028122
    sleep alive:    True

    spin_thread test
    sp.start:       0.003014
    sp.join:        2.003128
    sp alive:       True

    factorial_thread test
    ft.start:       0.002991
    ft.join:        2.004105
    ft alive:       True
    ft DONE:        5.199295
    sp DONE:        5.734322
    st DONE:        10.998619
In Python, threads often only let you interleave different pieces of work rather than run them at the same time, because of the Global Interpreter Lock (GIL).
If you look at the Python bytecode:
    from math import factorial

    def fac_test(x):
        factorial(x)

    import dis
    dis.dis(fac_test)
      4           0 LOAD_GLOBAL              0 (factorial)
                  3 LOAD_FAST                0 (x)
                  6 CALL_FUNCTION            1
                  9 POP_TOP
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE
As you can see, the call to math.factorial is a single operation at the Python bytecode level (6 CALL_FUNCTION); the function itself is implemented in C. math.factorial does not release the GIL because of the type of work it does (see the comments on my answer), so Python does not switch to other threads while it is running, and you get the result that you observed.
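For contrast, here is a minimal sketch (Python 3 syntax, unlike the Python 2 code in the question) of a factorial written as a plain Python loop. The names py_factorial and join_with_timeout are hypothetical and only for illustration. The assumption is that, because the loop runs as many separate bytecode instructions, the interpreter gets a chance to hand the GIL to another thread between them, so join() with a timeout can return while the worker is still busy. Each individual big-integer multiplication still holds the GIL, so this buys responsiveness, not parallelism.

```python
import math
from threading import Thread


def py_factorial(n):
    """Compute n! with a plain Python loop instead of the C implementation."""
    result = 1
    for i in range(2, n + 1):
        # Each iteration is several bytecode instructions, so the
        # interpreter can switch to another thread in between.
        result *= i
    return result


def join_with_timeout(n, timeout):
    """Run py_factorial(n) in a thread and wait at most `timeout` seconds.

    Returns True if the worker was still running when join() returned.
    """
    worker = Thread(target=py_factorial, args=(n,))
    worker.daemon = True  # do not keep the interpreter alive for the worker
    worker.start()
    worker.join(timeout)
    return worker.is_alive()
```

With a large n and a short timeout, the expectation is that join_with_timeout returns True, meaning the main thread got control back before the worker finished. The price is that the pure-Python loop is slower than math.factorial.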
Python has a Global Interpreter Lock (GIL), which requires CPU-bound threads to run in turn rather than concurrently. Since the factorial function is written in C and does not release the GIL, even changing the interpreter's thread-switching interval (sys.setcheckinterval in Python 2, sys.setswitchinterval in Python 3.2+) is not sufficient to allow the threads to cooperate.
The multiprocessing module provides Process objects, which are similar to threads but operate in different address spaces. For CPU-bound tasks, you should seriously consider using the multiprocessing module.
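As a rough sketch of that suggestion (Python 3 syntax; factorial_bits and run_in_process are hypothetical names, not part of any library), the factorial can be moved into a multiprocessing.Process. The child process has its own interpreter and its own GIL, so the parent's start() and join(timeout) should no longer be held up by the C call.

```python
from math import factorial
from multiprocessing import Process


def factorial_bits(n):
    """CPU-bound work: return the size of n! in bits."""
    return factorial(n).bit_length()


def run_in_process(n, timeout):
    """Run factorial_bits(n) in a child process and wait at most `timeout` seconds.

    Returns True if the child was still running when the timeout expired.
    The child's return value is discarded; this sketch only checks timing.
    """
    worker = Process(target=factorial_bits, args=(n,))
    worker.start()        # returns once the child process has been launched
    worker.join(timeout)  # the parent is not blocked by the child's GIL
    still_running = worker.is_alive()
    if still_running:
        worker.terminate()  # sketch only: stop the child instead of waiting
        worker.join()
    return still_running
```

Call run_in_process from under an if __name__ == "__main__": guard, since some platforms start child processes by re-importing the main module. To get the result back from the child, a multiprocessing.Queue or a Pool would be the usual next step.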