How to use parallel Python modules like multiprocessing on a Sun SGE grid

I have a piece of Python code that uses multiprocessing.Pool on a single machine to run many independent jobs. I wonder whether it is possible to make it even more parallel on an SGE grid, for example by having each node of the grid run multiple processes for these independent jobs.

Initially I have:

import functools
import multiprocessing

# function def:
# some_function(file_list, param1, param2, param3, process_index)
func = functools.partial(some_function, file_list, param1, param2, param3)
pool = multiprocessing.Pool(processes=some_integer)
ret_list = pool.map(func, range(some_integer))  # one task per process index
pool.close()

It works fine on the local machine, but when submitted to the SGE grid as-is, it crashes without printing any error message. The submit command looks like this:

qsub -V -b yes -cwd -l h_vmem=10G -N jobname -o grid_job.log -j yes "python worker.py"

Ideally, I'm looking for minimal changes to the local version of the Python code so that it can run on the SGE grid, because it is difficult to install new tools on the grid or to change any grid configuration without affecting other users.

At the very least, I understand that I could simply rewrite the code so that each job (each file in file_list) is processed by its own qsub command. But I'm wondering what the best practice is.
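
Something like the following, say (just a sketch; the submit loop is hypothetical and assumes a worker.py that takes a single file argument):

import subprocess
import sys

# Hypothetical fallback: submit one qsub job per file,
# with no parallelism inside each job.
for idx, path in enumerate(sys.argv[1:], start=1):
    subprocess.check_call([
        "qsub", "-V", "-b", "yes", "-cwd", "-l", "h_vmem=10G",
        "-N", "jobname%d" % idx, "-o", "grid_job%d.log" % idx, "-j", "yes",
        "python worker.py '%s'" % path,
    ])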

+3




1 answer


What I would do is make the Python script read the list of files and the number of processes from its command-line arguments, so it is easy to invoke. Then I would write a Bash script that takes the list of files as arguments and dispatches the jobs. That way you get two levels of parallelization: across multiple nodes (qsub) and across multiple processes on each node (Python multiprocessing). To do this correctly, you need to tell qsub how many slots you want for each job; this is done by submitting to a parallel environment and specifying the number of slots (-pe ENV_NAME NBSLOTS):

#!/bin/bash
# Dispatch the files given as arguments in batches: one qsub job per
# NB_FILE_PER_JOB files, each running NB_PROCESS_PER_JOB Python processes.
# "multithread" is the parallel environment name; adjust it to your site.
# The leading "echo" only prints each qsub command; remove it to submit.

NB_PROCESS_PER_JOB=2
NB_FILE_PER_JOB=3
CPT=0
BUF=""
NUMJOB=1

for i in "$@"; do
    BUF="$BUF '$i'"
    ((CPT++))
    if ((CPT == NB_FILE_PER_JOB)); then
        echo qsub -pe multithread $NB_PROCESS_PER_JOB -V -b yes -cwd -l h_vmem=10G -N jobname$NUMJOB -o grid_job$NUMJOB.log -j yes "python worker.py $NB_PROCESS_PER_JOB $BUF"
        BUF=""
        CPT=0
        ((NUMJOB++))
    fi
done
# Submit whatever files remain in an incomplete batch.
if [[ -n "$BUF" ]]; then
    echo qsub -pe multithread $NB_PROCESS_PER_JOB -V -b yes -cwd -l h_vmem=10G -N jobname$NUMJOB -o grid_job$NUMJOB.log -j yes "python worker.py $NB_PROCESS_PER_JOB $BUF"
fi
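
If this script is saved as, say, dispatch.sh (the name is arbitrary), you call it with the whole file list, e.g. ./dispatch.sh data/*.txt. As written it only prints the qsub commands; remove the leading echo once the output looks right.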

      

The Python script (worker.py) would look like this:



import sys
import multiprocessing

# worker.py: argv[1] is the number of processes,
# the remaining arguments are the files to process.
# some_function must be defined or imported here.
nb_processes = int(sys.argv[1])
file_list = sys.argv[2:]

pool = multiprocessing.Pool(processes=nb_processes)
ret_list = pool.map(some_function, file_list)
pool.close()
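
A job submitted by the Bash script above then effectively runs python worker.py 2 'file1' 'file2' 'file3'. One caveat: on platforms where multiprocessing starts workers by spawning new interpreters instead of forking, the Pool creation must be guarded by if __name__ == "__main__":.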

      

If your SGE cluster doesn't have any parallel environment, then I suggest you don't parallelize the Python script: drop the -pe ENV_NAME NBSLOTS option, and either remove the pool from the Python script or force it to spawn only one process. A plain SGE job is not supposed to be multithreaded; if it is, it consumes resources that were never reserved for it and can slow down other users' jobs.
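
For example, a sequential variant of worker.py under that constraint (just a sketch; note it drops the process-count argument) could be:

import sys

# Sequential fallback: no pool, process one file after another.
file_list = sys.argv[1:]
ret_list = [some_function(f) for f in file_list]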

+4

