Why don't pre-compiling pyc files give the expected results?
I have a situation that I really cannot explain: I run a Python script on a read-only filesystem with pre-generated files .pyc
for all Python modules, and I run it again after reloading the filesystem. After the second run, the script is much faster.
This didn’t surprise me yet - the pre-generated files .pyc
may not have been generated correctly. In the second run, they will be re-generated and written (located in folders __pycache__
) and reused in the third run.
Note:
script I experiment with him, nothing more than the imports of some native Python modules ( sys
, os
, argparse
etc.) - other code related to performance, no. So the only significant difference between each run is the precompiled bytecode (as I can think of). There are no other relevant processes on the machine.
Strange thing : the contents of all folders __pycache__
does not change at all . I copied the whole folder /usr/lib/python3.5
and folder containing my script before and after re-installing and is no different from .
Where does Python store the information it needs to run faster? Why aren't the files .pyc
(or their timestamp!) Changing?
More details :
I am currently trying to pre-compile Python scripts before deploying them to an embedded target using the RO filesystem and the results are not as expected (cannot observe any time difference).
Here's what I do to watch for unwanted behavior:
- log to target (rootfs installed RO)
- run script:,
time myscript.py
works for 5.7 sec - run script:,
time myscript.py
works for 5.7 sec - ... (can be repeated, the result is always the same)
-
scp
content/usr/lib/python3.5/site-packages
and folder containingmyscript.py
, to a folder namedbefore
- update : grab all permissions and timestamps of all files
.pyc
- remount RW:
mount -o remount,rw /
- run script:,
time myscript.py
takes a little more than 5.7 seconds - run script:,
time myscript.py
works for 1.79s - run script:,
time myscript.py
works for 1.79s - ... (can be repeated, the result is always the same)
-
scp
content/usr/lib/python3.5/site-packages
and folder containingmyscript.py
, to a folder namedafter
- update : grab all permissions and timestamps of all files again
.pyc
-
compare folders that I suspected now contain modified
.pyc
-files:diff -r before after
no difference
- update : compare all timestamps and permissions previously received - no difference
I also tried manually modifying one of the copied files .pyc
to check if it actually diff
compares the binaries, but works as expected.
After a reboot, the system also behaves as expected when the cache files were written (the script is fast).
For those who don't believe me:
root@machine:/data# cat myscript.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import argparse
import logging
import signal
import sys, os, shutil
if __name__ == '__main__':
print('hello')
root@machine:/data# time ./myscript.py
hello
real 0m5.756s, user 0m4.326s, sys 0m0.773s
root@machine:/data# time ./myscript.py
hello
real 0m5.740s, user 0m4.289s, sys 0m0.797s
root@machine:/data# mount -o remount,rw /
root@machine:/data# time ./myscript.py
hello
real 0m6.005s, user 0m4.374s, sys 0m0.845s
root@machine:/data# time ./myscript.py
hello
real 0m1.789s, user 0m1.283s, sys 0m0.288s
root@machine:/data# time ./myscript.py
hello
real 0m1.787s, user 0m1.287s, sys 0m0.284s
(I have truncated part of the output, but the numbers are correct)
(very strange) Update:
To keep track of the timestamps for all files .pyc
, I ran the following command before and after restarting myscript.py
:
find /usr/lib/python3.5/site-packages/ -name *.pyc -exec stat -t {} \; > file-details-before/after
Result: file-details-before
and are file-details-after
absolutely identical ! The content looks like this (identical in both cases):
...
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psosx.cpython-35.pyc 10676 24 81a4 0 0 e 2221 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psbsd.cpython-35.pyc 12047 24 81a4 0 0 e 2223 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_compat.cpython-35.pyc 8393 24 81a4 0 0 e 2222 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psmswindows.cpython-35.pyc 14488 32 81a4 0 0 e 2227 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_pslinux.cpython-35.pyc 28030 56 81a4 0 0 e 2224 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/__init__.cpython-35.pyc 37069 80 81a4 0 0 e 2220 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psposix.cpython-35.pyc 3069 8 81a4 0 0 e 2219 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/error.cpython-35.pyc 2557 8 81a4 0 0 e 2226 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_common.cpython-35.pyc 6188 16 81a4 0 0 e 2225 1 0 0 1499781288 1499555909 1499781288 4096
...
The number of files .pyc
, their permissions and timestamps are equal, that is, diff file-details-before file-details-after
it shows no difference.
Profiling:
Before rerunning with a write filesystem:
time python3 -m cProfile -s cumtime ./myscript.py
hello
11530 function calls (11346 primitive calls) in 2.423 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
19/1 0.020 0.001 2.423 2.423 {built-in method builtins.exec}
1 0.001 0.001 2.423 2.423 myscript.py:4(<module>)
21/4 0.004 0.000 2.422 0.605 <frozen importlib._bootstrap>:966(_find_and_load)
21/4 0.002 0.000 2.418 0.605 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
21/4 0.008 0.000 2.369 0.592 <frozen importlib._bootstrap>:659(_load_unlocked)
16/4 0.003 0.000 2.353 0.588 <frozen importlib._bootstrap_external>:656(exec_module)
42/8 0.010 0.000 2.319 0.290 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
16 0.009 0.001 1.815 0.113 <frozen importlib._bootstrap_external>:726(get_code)
16 0.001 0.000 1.700 0.106 <frozen importlib._bootstrap_external>:718(source_to_code)
16 1.689 0.106 1.689 0.106 {built-in method builtins.compile}
1 0.003 0.003 0.687 0.687 __init__.py:24(<module>)
...
real 0m7.611s, user 0m5.861s, sys 0m0.877s
and after:
time python3 -m cProfile -s cumtime ./myscript.py
hello
11050 function calls (10878 primitive calls) in 0.730 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
19/1 0.024 0.001 0.730 0.730 {built-in method builtins.exec}
1 0.001 0.001 0.730 0.730 myscript.py:4(<module>)
21/4 0.005 0.000 0.729 0.182 <frozen importlib._bootstrap>:966(_find_and_load)
21/4 0.002 0.000 0.725 0.181 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
21/4 0.005 0.000 0.684 0.171 <frozen importlib._bootstrap>:659(_load_unlocked)
16/4 0.004 0.000 0.677 0.169 <frozen importlib._bootstrap_external>:656(exec_module)
26/4 0.001 0.000 0.638 0.160 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
1 0.003 0.003 0.313 0.313 __init__.py:24(<module>)
1 0.002 0.002 0.203 0.203 shutil.py:5(<module>)
21 0.007 0.000 0.201 0.010 <frozen importlib._bootstrap>:879(_find_spec)
19 0.001 0.000 0.187 0.010 <frozen importlib._bootstrap_external>:1130(find_spec)
...
real 0m3.102s, user 0m2.196s, sys 0m0.448s
source to share
No one has answered this question yet
Check out similar questions: