Linux - run multiple parallel commands, sequential print output

I'm a bit new to bash and I need to execute a short command several hundred times in parallel, but print the output sequentially. The command prints fairly short output to stdout, which I don't want to lose or have garbled/mixed with the output of another stream. Is there a way in Linux to run multiple commands (e.g. no more than N processes in parallel) so that all command outputs are printed sequentially (in any order, as long as they don't overlap)?

Current bash script (full code here)

declare -a UPDATE_ERRORS
UPDATE_ERRORS=( )

function pull {
    git pull  # Assumes current dir is set
    if [[ $? -ne 0 ]]; then
      UPDATE_ERRORS+=("error message")
    fi
}

for f in extensions/*; do
  if [[ -d $f ]]; then
    ########## This code should run in parallel, but output of each thread
    ########## should be cached and printed sequentially one after another
    ########## pull function also updates a global var that will be used later
    pushd $f > /dev/null
    pull
    popd > /dev/null
  fi
done

if [[ ${#UPDATE_ERRORS[@]} -ne 0 ]]; then
  # print errors again
fi

      



3 answers


You can use flock for this. I've mimicked a similar situation to test it. The do_the_things procedure generates output that overlaps in time, and the for loop starts several copies of it simultaneously, so the raw output would be garbled. Instead, each copy's output is piped into the locked_print routine, which waits until it can take the lock and then prints the input it received to standard output. The exports are there so the procedures can be called from within the pipe.



#!/bin/bash

# Emits ten numbered lines over about ten seconds, after a random initial delay,
# so concurrent invocations overlap in time.
do_the_things()
        {
        rand="$((RANDOM % 10))"
        sleep $rand
        for i in $(seq 1 10); do sleep 1; echo "${rand}-$i"; done
        }

# Takes an exclusive lock on the file "testlock" and only then copies its whole
# input to stdout, so each job's generated output is printed as one uninterrupted block.
locked_print()
        {
        echo Started
        flock -e testlock cat
        }

export -f do_the_things
export -f locked_print

for f in a b c d; do
        (do_the_things | locked_print) &
done
wait
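
Adapting this to the original loop mostly means swapping the dummy work for git pull; note that a variable set inside a backgrounded subshell never reaches the parent shell, so failures have to be reported through the output itself. A minimal sketch, assuming GNU flock is available; the lock file name pull.lock and the GITERROR marker are invented for illustration:

#!/bin/bash

pull_one()
        {
        cd "$1" || exit                                  # each job works in its own directory
        git pull || echo "GITERROR: pull failed in $1"   # report failure via output, not a global
        }

locked_print()
        {
        flock -e pull.lock cat                           # hold the lock while copying one job's output
        }

for f in extensions/*/; do
        (pull_one "$f" | locked_print) &                 # pulls run in parallel; printing is serialized
done
wait

If the failed directories are needed afterwards, the combined output of the loop can be sent through tee to a file and the GITERROR lines grepped out of it, much as the next answer does. Note this starts all pulls at once; capping the number of simultaneous jobs at N is easier with the GNU parallel approaches below.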

      



Try something like this. I don't have/use git, so I made a dummy command to simulate it in my version.

#!/bin/bash
declare -a ERRORS
ERRORS=( )

function pull {
    cd "$1"
    echo Starting pull in $1
    for i in {0..9}; do echo "$1 Output line $i";done
    sleep 5
    echo "GITERROR: Dummy error in directory $1"
}

export -f pull

for f in extensions/*; do
  if [[ -d $f ]]; then
    ########## This code should run in parallel, but output of each thread
    ########## should be cached and printed sequentially one after another
    ########## pull function also updates a global var that will be used later
    echo $f
  fi
done | parallel -k pull | tee errors.tmp

IFS=$'\n' ERRORS=($(grep "^GITERROR:" errors.tmp))
rm errors.tmp

for i in "${ERRORS[@]}"; do
   echo $i
done

      



You will see that even with 4 directories, the whole script takes only about 5 seconds, despite executing 4 lots of sleep 5.
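
For the real use case, pull just needs to run git instead of the dummy loop, and parallel can be capped at N simultaneous jobs with -j. A sketch of only the changed pieces, under the assumption that GNU parallel is available; the -j 4 value and the GITERROR message text are arbitrary:

function pull {
    cd "$1" || return
    git pull || echo "GITERROR: pull failed in $1"   # marker line for the later grep
}
export -f pull

# At most 4 pulls run at once; -k still prints each job's output whole and in input order.
parallel -j 4 -k pull ::: extensions/*/ | tee errors.tmp

The rest of the script above (grepping the GITERROR lines out of errors.tmp into the ERRORS array) stays the same.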



Feed parallel the list of directories, with / appended so only directories match. For each one it spawns a shell that cds into the directory; if git pull fails, a magic marker line is printed. All output is also kept as copies under the out/ directory (because of --results out). When all the pulls are complete, check which files contain the magic line and print the STDOUT/STDERR of those commands, then clean up.

parallel --results out 'cd {} && (git pull || echo e_R_r_O_r)' :::  extensions/*/
grep -l e_R_r_O_r out/*/stdout | parallel 'grep -v e_R_r_O_r {//}/stdout; cat {//}/stderr >&2'
rm -r out
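
In the second command, {//} is GNU parallel's replacement string for the matched path with its last component removed, i.e. that job's result directory under out/, so each failed pull's stdout is replayed without the marker and its stderr is sent to stderr. If the original UPDATE_ERRORS array is still wanted, the same marker files can feed it before the cleanup; a small sketch (storing result-file paths rather than messages is an assumption about what the caller needs):

# Run before the rm -r out above: remember which result files contain the failure marker.
mapfile -t UPDATE_ERRORS < <(grep -rl e_R_r_O_r out/)
if [[ ${#UPDATE_ERRORS[@]} -ne 0 ]]; then
    printf 'failed pull recorded in %s\n' "${UPDATE_ERRORS[@]}" >&2
fi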

      
