How can I combine multiple Python source files into one file?

(Suppose: application startup time is absolutely critical, my application runs very often, it runs in an environment where imports are slower than usual, many files need to be imported, and compiling to .pyc is not available.)

I would like to combine all Python source files that define a set of modules into one new Python source file.

I would like the result of importing the new file to be as if I were importing one of the original files (which would then import a few more original files, etc.).

Is it possible?

Here is a rough manual simulation of what such a tool could produce when fed the source files for the "bar" and "baz" modules. You would run the tool once, before deploying the code.

__file__ = 'foo.py'

import sys
import types

def _create_module(name):
    mod = types.ModuleType(name)
    mod.__file__ = __file__
    sys.modules[name] = mod
    return mod

def _bar_module():

    def hello():
        print('Hello World! BAR')

    mod = _create_module('foo.bar')
    mod.hello = hello
    return mod

bar = _bar_module()
del _bar_module

def _baz_module():

    def hello():
        print('Hello World! BAZ')

    mod = _create_module('foo.bar.baz')
    mod.hello = hello
    return mod

baz = _baz_module()
del _baz_module

And now you can:

from foo.bar import hello
hello()


This code ignores details like imports and dependencies. Is there any existing tool that combines source files using this or some other technique?

This is very similar in spirit to the tools used to bundle and optimize JavaScript files before they are sent to the browser, where the latency of multiple HTTP requests hurts performance. In Python's case, it is the latency of importing hundreds of source files at startup that hurts.

+2




4 answers


If this is running on Google App Engine, as the tags suggest, make sure you use this idiom:

def main():
    pass  # do stuff

if __name__ == '__main__':
    main()

Since GAE does not restart the application on every request (unless the .py has changed), it simply calls main() again.



This trick lets you write CGI-style applications without paying the startup cost on every request.

App Caching

If the script handler provides a main() routine, the runtime also caches the script. Otherwise, the script handler is loaded for every request.

+3




I suspect that, thanks to the pre-compilation of Python files (.pyc caching) and the operating system's file cache, any speedup you end up with will not be measurable.
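If you want to check that claim empirically, here is a quick sketch for timing a cold import (the module name is just an example; note that already-cached submodules make it only approximately cold):

```python
import importlib
import sys
import time

def time_cold_import(name):
    """Time importing a module that is not currently in sys.modules."""
    sys.modules.pop(name, None)          # force a (roughly) cold import
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start

# Compare, e.g., one bundled module against many small ones.
print("json import took %.6f s" % time_cold_import("json"))
```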



+1




Doing this is unlikely to provide any performance benefit. You are still importing the same amount of Python code, just in fewer modules, and you sacrifice all modularity for it.

A better approach would be to modify your code and/or libraries to import things only when they are needed, so that the minimum required code is loaded for each request.
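A minimal sketch of that lazy-import approach (the function and the imported modules are purely illustrative):

```python
# Deferred ("lazy") import: the module is only loaded the first time
# this handler actually runs, not at application startup.
def export_report(rows):
    import csv   # stdlib used for illustration; imagine a heavy library
    import io

    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerows(rows)
    return buf.getvalue()

# Startup pays nothing; the first call pays the import cost once,
# because Python caches loaded modules in sys.modules.
print(export_report([["a", 1], ["b", 2]]))
```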

0




Setting aside the question of whether this technique would improve performance in your environment, let's say you are right; here is what I would do.

I would make a list of all my modules, e.g. my_files = ['foo', 'bar', 'baz'].

Then I would use the os.path utilities to read all the lines of every file in the source directory and write them to a new file, filtering out every line matching import foo|bar|baz, since all of that code now lives in the single file.

The final trick is appending the main() from __init__.py (if there is one) to the tail of the file.
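A minimal sketch of that concatenation step, assuming a flat source directory and simple top-level imports only (the function name, directory layout, and filter regex are illustrative; a real tool would parse the AST and handle aliased or conditional imports):

```python
import os
import re

def bundle(src_dir, module_names, out_path):
    # Matches top-level imports of the modules being inlined,
    # e.g. "import foo" or "from bar import x".
    pattern = re.compile(
        r'^\s*(?:import|from)\s+(?:%s)\b'
        % '|'.join(map(re.escape, module_names))
    )
    with open(out_path, 'w') as out:
        for name in module_names:
            with open(os.path.join(src_dir, name + '.py')) as src:
                for line in src:
                    if not pattern.match(line):  # drop now-internal imports
                        out.write(line)
            out.write('\n')
```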

0








