How to prevent a race condition where multiple processes try to write and then read from a file at the same time

I have the following code (simplified for clarity):

import os
import errno
import imp


lib_dir = os.path.expanduser('~/.brian/cython_extensions')
module_name = '_cython_magic_5'
module_path = os.path.join(lib_dir, module_name + '.so')
code = 'some code'

have_module = os.path.isfile(module_path)
if not have_module:
    pyx_file = os.path.join(lib_dir, module_name + '.pyx')

    # THIS IS WHERE EACH PROCESS TRIES TO WRITE TO THE FILE.  THE CODE HERE 
    # PREVENTS A RACE CONDITION.
    try:
        fd = os.open(pyx_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            pass
        else:
            raise
    else:
        os.fdopen(fd, 'w').write(code)

# THIS IS WHERE EACH PROCESS TRIES TO READ FROM THE FILE.  CURRENTLY THERE IS A
# RACE CONDITION.
module = imp.load_dynamic(module_name, module_path)

      

(Some of the above code is borrowed from this answer.)

When multiple processes are started at once, this code ensures that only one of them opens and writes to pyx_file (assuming pyx_file doesn't exist yet). The problem is that while that process is still writing to pyx_file, the other processes try to load it, and errors occur in those processes because pyx_file is incomplete at the time they read it. (Specifically, an ImportError is raised, because the processes are trying to import the contents of a partially written file.)

What's the best way to avoid these errors? One idea is for each process to keep trying to import pyx_file in a while loop until the import succeeds. (This solution seems suboptimal.)
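For concreteness, that retry idea could be sketched with a small hypothetical helper (not from the question; the attempt count and delay are arbitrary, and it swallows ImportError only while retrying):

```python
import time

def retry(load, attempts=50, delay=0.1):
    # Call load() repeatedly until it stops raising ImportError.
    # While the writer is still producing the file, ImportError is expected.
    for _ in range(attempts):
        try:
            return load()
        except ImportError:
            time.sleep(delay)
    # One final attempt; let the error propagate if it still fails.
    return load()

# Hypothetical usage with the question's code:
# module = retry(lambda: imp.load_dynamic(module_name, module_path))
```

The drawback, as noted, is that this polls rather than waits, and it cannot distinguish "file not written yet" from "file permanently broken" except by timing out.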

+3




2 answers


The way to do this is to take an exclusive lock on the file each time you open it. The writer holds the lock while it is writing the data, and the reader blocks until the writer releases the lock by closing the file descriptor. This will, of course, fail if the file was only partially written because the writing process crashed, so the file should be deleted if the module cannot be loaded:



import os
import imp
import errno
import fcntl as F

def load_module():
    pyx_file = os.path.join(lib_dir, module_name + '.pyx')

    try:
        # Try to create and open the file, but only if it doesn't exist yet.
        fd = os.open(pyx_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)

        # Lock the file exclusively to notify other processes that we're
        # still writing; closing the file releases the lock.
        F.flock(fd, F.LOCK_EX)
        with os.fdopen(fd, 'w') as f:
            f.write(code)

    except OSError as e:
        # If the error wasn't EEXIST, re-raise it.
        if e.errno != errno.EEXIST:
            raise

    # The file exists, so open it for reading and then try to lock it.
    # This blocks on the LOCK_EX above if the lock is still held by the
    # writing process.
    with open(pyx_file, 'r') as f:
        F.flock(f, F.LOCK_EX)

    return imp.load_dynamic(module_name, module_path)

module = load_module()

      

+5




Use an empty PID-based lock file to guard each access to the file.

Usage example:

from mercurial import error, lock

try:
    l = lock.lock("/tmp/{0}.lock".format(FILENAME), timeout=600)  # wait at most 10 minutes
    # do something
except error.LockHeld:
    # couldn't take the lock
    pass
else:
    l.release()

      



source: Python: module to create PID based lock file?

This should give you the general idea. This method is used by OpenOffice, Vim, and other applications.
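If you'd rather not depend on Mercurial, a minimal sketch of the same PID-lock-file idea is below (hypothetical helper names; note it does no stale-lock cleanup or timeout handling, which the Mercurial lock does for you):

```python
import os
import errno

def acquire_pidlock(path):
    # Atomically create the lock file; O_EXCL guarantees exactly one
    # process succeeds even if several race on the same path.
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # another process holds the lock
        raise
    with os.fdopen(fd, 'w') as f:
        f.write(str(os.getpid()))  # record the owner, as PID-file tools do
    return True

def release_pidlock(path):
    # Remove the lock file so waiting processes can acquire it.
    os.remove(path)
```

A process that fails to acquire the lock would sleep and retry, or give up after a deadline, mirroring the timeout in the Mercurial example.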

-1








