Why don't pre-compiling pyc files give the expected results?

I have a situation that I really cannot explain: I run a Python script on a read-only filesystem with pre-generated files .pyc

for all Python modules, and I run it again after reloading the filesystem. After the second run, the script is much faster.

This didn’t surprise me yet - the pre-generated files .pyc

may not have been generated correctly. In the second run, they will be re-generated and written (located in folders __pycache__

) and reused in the third run.

Note: script I experiment with him, nothing more than the imports of some native Python modules ( sys

, os

, argparse

etc.) - other code related to performance, no. So the only significant difference between each run is the precompiled bytecode (as I can think of). There are no other relevant processes on the machine.

Strange thing : the contents of all folders __pycache__

does not change at all . I copied the whole folder /usr/lib/python3.5

and folder containing my script before and after re-installing and is no different from .

Where does Python store the information it needs to run faster? Why aren't the files .pyc

(or their timestamp!) Changing?

More details :

I am currently trying to pre-compile Python scripts before deploying them to an embedded target using the RO filesystem and the results are not as expected (cannot observe any time difference).

Here's what I do to watch for unwanted behavior:

  • log to target (rootfs installed RO)
  • run script:, time myscript.py

    works for 5.7 sec
  • run script:, time myscript.py

    works for 5.7 sec
  • ... (can be repeated, the result is always the same)
  • scp

    content /usr/lib/python3.5/site-packages

    and folder containing myscript.py

    , to a folder namedbefore

  • update : grab all permissions and timestamps of all files.pyc

  • remount RW: mount -o remount,rw /

  • run script:, time myscript.py

    takes a little more than 5.7 seconds
  • run script:, time myscript.py

    works for 1.79s
  • run script:, time myscript.py

    works for 1.79s
  • ... (can be repeated, the result is always the same)
  • scp

    content /usr/lib/python3.5/site-packages

    and folder containing myscript.py

    , to a folder namedafter

  • update : grab all permissions and timestamps of all files again.pyc

  • compare folders that I suspected now contain modified .pyc

    -files:

    diff -r before after

    no difference

  • update : compare all timestamps and permissions previously received - no difference

I also tried manually modifying one of the copied files .pyc

to check if it actually diff

compares the binaries, but works as expected.

After a reboot, the system also behaves as expected when the cache files were written (the script is fast).

For those who don't believe me:

root@machine:/data# cat myscript.py 
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import argparse
import logging
import signal
import sys, os, shutil

if __name__ == '__main__':
    print('hello')

root@machine:/data# time ./myscript.py 
hello
real    0m5.756s, user  0m4.326s, sys   0m0.773s

root@machine:/data# time ./myscript.py 
hello
real    0m5.740s, user  0m4.289s, sys   0m0.797s

root@machine:/data# mount -o remount,rw /

root@machine:/data# time ./myscript.py 
hello
real    0m6.005s, user  0m4.374s, sys   0m0.845s

root@machine:/data# time ./myscript.py 
hello
real    0m1.789s, user  0m1.283s, sys   0m0.288s

root@machine:/data# time ./myscript.py 
hello
real    0m1.787s, user  0m1.287s, sys   0m0.284s

      

(I have truncated part of the output, but the numbers are correct)

(very strange) Update:

To keep track of the timestamps for all files .pyc

, I ran the following command before and after restarting myscript.py

:

find /usr/lib/python3.5/site-packages/ -name *.pyc -exec stat -t {} \; > file-details-before/after

      

Result: file-details-before

and are file-details-after

absolutely identical ! The content looks like this (identical in both cases):

...
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psosx.cpython-35.pyc 10676 24 81a4 0 0 e 2221 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psbsd.cpython-35.pyc 12047 24 81a4 0 0 e 2223 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_compat.cpython-35.pyc 8393 24 81a4 0 0 e 2222 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psmswindows.cpython-35.pyc 14488 32 81a4 0 0 e 2227 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_pslinux.cpython-35.pyc 28030 56 81a4 0 0 e 2224 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/__init__.cpython-35.pyc 37069 80 81a4 0 0 e 2220 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_psposix.cpython-35.pyc 3069 8 81a4 0 0 e 2219 1 0 0 1499781288 1499555909 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/error.cpython-35.pyc 2557 8 81a4 0 0 e 2226 1 0 0 1499781288 1499555910 1499781288 4096
/usr/lib/python3.5/site-packages/psutil/__pycache__/_common.cpython-35.pyc 6188 16 81a4 0 0 e 2225 1 0 0 1499781288 1499555909 1499781288 4096
...

      

The number of files .pyc

, their permissions and timestamps are equal, that is, diff file-details-before file-details-after

it shows no difference.

Profiling:

Before rerunning with a write filesystem:

time python3 -m cProfile -s cumtime ./myscript.py
hello
         11530 function calls (11346 primitive calls) in 2.423 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     19/1    0.020    0.001    2.423    2.423 {built-in method builtins.exec}
        1    0.001    0.001    2.423    2.423 myscript.py:4(<module>)
     21/4    0.004    0.000    2.422    0.605 <frozen importlib._bootstrap>:966(_find_and_load)
     21/4    0.002    0.000    2.418    0.605 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
     21/4    0.008    0.000    2.369    0.592 <frozen importlib._bootstrap>:659(_load_unlocked)
     16/4    0.003    0.000    2.353    0.588 <frozen importlib._bootstrap_external>:656(exec_module)
     42/8    0.010    0.000    2.319    0.290 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
       16    0.009    0.001    1.815    0.113 <frozen importlib._bootstrap_external>:726(get_code)
       16    0.001    0.000    1.700    0.106 <frozen importlib._bootstrap_external>:718(source_to_code)
       16    1.689    0.106    1.689    0.106 {built-in method builtins.compile}
        1    0.003    0.003    0.687    0.687 __init__.py:24(<module>)
...
real    0m7.611s, user  0m5.861s, sys   0m0.877s

      

and after:

time python3 -m cProfile -s cumtime ./myscript.py
hello
         11050 function calls (10878 primitive calls) in 0.730 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     19/1    0.024    0.001    0.730    0.730 {built-in method builtins.exec}
        1    0.001    0.001    0.730    0.730 myscript.py:4(<module>)
     21/4    0.005    0.000    0.729    0.182 <frozen importlib._bootstrap>:966(_find_and_load)
     21/4    0.002    0.000    0.725    0.181 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
     21/4    0.005    0.000    0.684    0.171 <frozen importlib._bootstrap>:659(_load_unlocked)
     16/4    0.004    0.000    0.677    0.169 <frozen importlib._bootstrap_external>:656(exec_module)
     26/4    0.001    0.000    0.638    0.160 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
        1    0.003    0.003    0.313    0.313 __init__.py:24(<module>)
        1    0.002    0.002    0.203    0.203 shutil.py:5(<module>)
       21    0.007    0.000    0.201    0.010 <frozen importlib._bootstrap>:879(_find_spec)
       19    0.001    0.000    0.187    0.010 <frozen importlib._bootstrap_external>:1130(find_spec)
...
real    0m3.102s, user  0m2.196s, sys   0m0.448s

      

+3


source to share





All Articles