Linux - run multiple commands in parallel, print output sequentially
I'm a bit new to bash and I need to execute a short command several hundred times in parallel, but print the output sequentially. The command prints fairly short output to stdout, which I don't want to lose or have garbled/mixed with the output of another stream. Is there a way in Linux to run multiple commands (e.g. no more than N processes in parallel) so that the output of each command is printed sequentially (in any order, as long as the outputs don't overlap)?
Current bash script (full code here):
declare -a UPDATE_ERRORS
UPDATE_ERRORS=( )

function pull {
  git pull # Assumes current dir is set
  if [[ $? -ne 0 ]]; then
    UPDATE_ERRORS+=("error message")
  fi
}

for f in extensions/*; do
  if [[ -d $f ]]; then
    ########## This code should run in parallel, but output of each thread
    ########## should be cached and printed sequentially one after another
    ########## pull function also updates a global var that will be used later
    pushd $f > /dev/null
    pull
    popd > /dev/null
  fi
done

if [[ ${#UPDATE_ERRORS[@]} -ne 0 ]]; then
  : # print errors again
fi
You can use flock for this. I've mimicked a similar situation to test it. The do_the_things proc produces output that overlaps in time: the text-generating loop is started several times simultaneously, so without locking the output would be garbled. Instead, each job's output is piped into the locked_print routine, which waits until it can acquire the lock and then prints its entire input to stdout. The exports are required so the functions can be called from within a pipeline.
#!/bin/bash

do_the_things()
{
    rand="$((RANDOM % 10))"
    sleep $rand
    for i in `seq 1 10`; do sleep 1; echo "${rand}-$i"; done
}

locked_print()
{
    echo Started
    flock -e testlock cat
}

export -f do_the_things
export -f locked_print

for f in a b c d; do
    (do_the_things | locked_print) &
done
wait
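
The demo above starts all four jobs at once. If you also need the "no more than N in parallel" part of the question, one option is to keep the same flock-protected printing but launch the jobs through xargs -P instead of plain backgrounding. A minimal sketch of mine, not part of the original answer; the job names a..h are just placeholders:

export -f do_the_things
export -f locked_print

# Run at most 4 jobs at a time; each job still funnels its whole output
# through the lock held by locked_print, so blocks never interleave.
printf '%s\n' a b c d e f g h |
    xargs -P 4 -I{} bash -c 'do_the_things | locked_print'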
Try something like this. I don't have/use git, so I've made a dummy pull command to simulate it in my version.
#!/bin/bash

declare -a ERRORS
ERRORS=( )

function pull {
  cd "$1"
  echo "Starting pull in $1"
  for i in {0..9}; do echo "$1 Output line $i"; done
  sleep 5
  echo "GITERROR: Dummy error in directory $1"
}
export -f pull

for f in extensions/*; do
  if [[ -d $f ]]; then
    ########## This code should run in parallel, but output of each thread
    ########## should be cached and printed sequentially one after another
    ########## pull function also updates a global var that will be used later
    echo $f
  fi
done | parallel -k pull | tee errors.tmp

IFS=$'\n' ERRORS=($(grep "^GITERROR:" errors.tmp))
rm errors.tmp

for i in "${ERRORS[@]}"; do
  echo $i
done
You will see that even though there are 4 directories, the whole script takes only about 5 seconds, despite running 4 lots of sleep 5. The -k option makes parallel emit each job's output as one block, in input order, so nothing gets mixed.
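
To adapt this to the real case, the dummy body of pull can be swapped for an actual git pull that emits the same GITERROR marker on failure. A sketch of mine, reusing the answer's GITERROR convention rather than the answer author's exact code:

function pull {
  cd "$1" || { echo "GITERROR: cannot cd into directory $1"; return; }
  echo "Starting pull in $1"
  if ! git pull; then
    echo "GITERROR: git pull failed in directory $1"
  fi
}
export -f pull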
Feed in the list of directories by appending /. In parallel, spawn a shell that cds into each directory; if git pull fails, it prints a magic line. All output is also kept as copies under the out/ results directory. When all the pulls are done, check which output files contain the magic line, and print the STDOUT/STDERR of those commands. Then clean up.
parallel --results out 'cd {} && (git pull || echo e_R_r_O_r)' ::: extensions/*/
grep -l e_R_r_O_r out/*/stdout | parallel 'grep -v e_R_r_O_r {//}/stdout; cat {//}/stderr >&2'
rm -r out
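
If you also want to cap the number of simultaneous pulls and keep a list of the failing directories, as with UPDATE_ERRORS in the question, a variation along these lines should work. This is a sketch of mine building on the snippet above; -j4 and the array name are my choices, and the out/*/stdout layout is taken from the grep line above:

# Cap the number of simultaneous pulls at 4.
parallel -j4 --results out 'cd {} && (git pull || echo e_R_r_O_r)' ::: extensions/*/

# Collect the stdout files containing the failure marker into an array;
# each entry is a path under out/ identifying a directory whose pull failed.
mapfile -t UPDATE_ERRORS < <(grep -l e_R_r_O_r out/*/stdout)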