
Accessing Associative Arrays in GNU Parallel

Suppose the following in Bash:

declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do 
  echo $i ${ari[i]} ${ar[${ari[i]}]}
done
0 one 1
1 two 2

      

Is it possible to do this with GNU Parallel, making sure to use an associative array index and not a sequence? Could the fact that arrays cannot be exported make it difficult, if not impossible?



3 answers


A lot has happened in 4 years. GNU Parallel 20190222 ships with env_parallel. It is a shell function that makes it possible to export most of the environment to the commands run by GNU Parallel.

It is supported in ash, bash, csh, dash, fish, ksh, mksh, pdksh, sh, tcsh, and zsh. Support varies from shell to shell (see https://www.gnu.org/software/parallel/env_parallel.html for details). For bash you would do:

# Load the env_parallel function
. "$(which env_parallel.bash)"
# Ignore variables currently defined
env_parallel --session
[... define your arrays, functions, aliases, and variables here ...]
env_parallel my_command ::: values
# The environment is also exported to remote systems if they use the same shell
(echo value1; echo value2) | env_parallel -Sserver1,server2 my_command
# Optional cleanup
env_parallel --end-session

      



So in your case, something like this:

env_parallel --session
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
foo() {
  for i in ${!ari[@]}; do 
    echo $i ${ari[i]} ${ar[${ari[i]}]}
  done;
}
env_parallel foo ::: dummy
env_parallel --end-session

      

As you would expect, env_parallel is a little slower than plain parallel.



Yes, it makes it more difficult. But not impossible.

You cannot export an array directly. However, you can turn an array into a description of that array with declare -p, and you can store that description in an exported variable. In fact, you can store the description in a function and export the function, although this is a bit of a hack, and you have to deal with the fact that running declare inside a function makes the declared variables local, so you need to inject the -g flag into the generated function's declare commands.
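
The first idea, storing the output of declare -p in an exported variable, can be sketched as follows. This is only an illustration and not part of the answer's own code: the variable name ar_decl and the helper lookup are made up for this example, and it assumes bash 4 or later so that associative arrays are available.

```shell
#!/usr/bin/env bash
# Sketch: carry an associative array into a child bash via an exported string.
declare -A ar=([one]=1 [two]=2)

# declare -p prints a declare command that recreates the array.
# ar_decl is a hypothetical variable name used only in this example.
ar_decl="$(declare -p ar)"
export ar_decl

# lookup (also hypothetical) starts a child bash, rebuilds the array there
# from the exported description, and prints the value stored under key $1.
lookup() {
  bash -c 'eval "$ar_decl"; echo "${ar[$1]}"' _ "$1"
}

lookup one   # prints 1
lookup two   # prints 2
```

Because the eval runs at the top level of the child shell, no -g flag is needed here. It only becomes necessary once the declare commands are wrapped in a function.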

UPDATE: Since ShellShock, the above hack no longer works. A small variation on the theme does work, so if your bash is up to date, skip to the "ShellShock version" section.

So here's one possible way to generate the exported function:

make_importer () {
  local func=$1; shift; 
  export $func='() {
    '"$(for arr in $@; do
          declare -p $arr|sed '1s/declare -./&g/'
        done)"'
  }'
}

      

Now we can create our arrays and create an exported importer for them:

$ declare -A ar='([one]="1" [two]="2" )'
$ declare -a ari='([0]="one" [1]="two")'
$ make_importer ar_importer ar ari

      

And see what we've built:

$ echo "$ar_importer"
() {
    declare -Ag ar='([one]="1" [two]="2" )'
declare -ag ari='([0]="one" [1]="two")'
  }

      

OK, the formatting is a bit ugly, but this isn't about whitespace. Here is the hack, though: all we have is an ordinary (albeit exported) variable, but when it is imported into a child bash, a bit of magic happens [Note 1]:

$ bash -c 'echo "$ar_importer"'

$ bash -c 'type ar_importer'
ar_importer is a function
ar_importer () 
{ 
    declare -Ag ar='([one]="1" [two]="2" )';
    declare -ag ari='([0]="one" [1]="two")'
}

      

And it looks prettier, too. Now we can run it in the command we pass to parallel:

$ printf %s\\n ${!ari[@]} |
    parallel \
      'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2

      

Or to run on a remote computer:

$ printf %s\\n ${!ari[@]} |
    parallel -S localhost --env ar_importer \
      'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2

      




ShellShock version.

Unfortunately, the flurry of fixes for ShellShock makes it a little harder to accomplish the same task. Specifically, a function named foo must now be exported as an environment variable named BASH_FUNC_foo%%, which is not a valid variable name (because of the percent signs). But we can still define the function (using eval) and export it, like this:

make_importer () {
  local func=$1; shift; 
  # An alternative to eval is:
  #    . /dev/stdin <<< ...
  # but that is neither less nor more dangerous than eval.
  eval "$func"'() {
    '"$(for arr in $@; do
          declare -p $arr|sed '1s/declare -./&g/'
        done)"'
  }'
  export -f "$func"
}
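
The sed expression in make_importer appends g to each declare option because a plain declare inside a function creates a local variable that disappears when the function returns. A minimal sketch of the difference, using hypothetical names (import_local, import_global, tbl_local, tbl_global) and assuming bash 4.2 or later, where declare -g is available:

```shell
#!/usr/bin/env bash
# Without -g, declare inside a function creates a function-local variable.
import_local()  { declare -A tbl_local=([k]=v); }

# With -g, the variable is created at global scope and survives the call.
import_global() { declare -gA tbl_global=([k]=v); }

import_local
import_global

# ${name[key]-text} expands to text when the element is unset.
echo "local:  ${tbl_local[k]-<unset>}"    # prints: local:  <unset>
echo "global: ${tbl_global[k]-<unset>}"   # prints: global: v
```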

      

As above, we can build the arrays and make an importer:

$ declare -A ar='([one]="1" [two]="2" )'
$ declare -a ari='([0]="one" [1]="two")'
$ make_importer ar_importer ar ari

      

But now the function actually exists in our environment as a function:

$ type ar_importer
ar_importer is a function
ar_importer () 
{ 
    declare -Ag ar='([one]="1" [two]="2" )';
    declare -ag ari='([0]="one" [1]="two")'
}

      

Since it has been exported, we can run it in the command we pass to parallel:

$ printf %s\\n ${!ari[@]} |
    parallel \
      'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2

      

Unfortunately, it no longer works on a remote computer (at least with the version of parallel I have), because parallel doesn't know how to export functions. If that gets fixed, the following should work:

$ printf %s\\n ${!ari[@]} |
    parallel -S localhost --env ar_importer \
      'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'

      

However, there is one important caveat: you cannot export a function from a bash with the ShellShock patch to a bash without it, or vice versa. So even if parallel gets fixed, the remote machine must be running the same bash version as the local machine. (Or at least either both or neither should have the ShellShock patch.)


Note 1: The magic is that, before the ShellShock patches, bash marked an exported variable as a function by having the exported value start with () {. So if you export a variable that starts with those characters and is a syntactically correct function, a child bash will treat it as a function. (Don't expect non-bash shells to understand it, though.)
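
For comparison, on a bash that has the ShellShock patches the supported route is export -f, and the function travels under a specially named environment entry instead of an ordinary variable. A small sketch for checking this on your own system: greet is a hypothetical function, and the exact name of the environment entry (BASH_FUNC_greet%% in upstream bash) may differ in some distributions' patched builds.

```shell
#!/usr/bin/env bash
# greet is a hypothetical function used only for this demonstration.
greet() { echo "hello from $1"; }
export -f greet

# A child bash imports the exported function automatically.
bash -c 'greet child'

# Show the first line of the environment entry that carries the function.
env | grep '^BASH_FUNC_greet'
```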



GNU Parallel is a perl program. If the perl program cannot access the variables, then I don't see a way in which the variables can be passed by the perl program.

So, if you want to parallelize your loop, I see two options:

declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do 
  sem -j+0 echo $i ${ari[i]} ${ar[${ari[i]}]}
done
# Wait for all background jobs started by sem to finish
sem --wait

      

The sem solution will not protect against mixed output.

declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do 
  echo echo $i ${ari[i]} ${ar[${ari[i]}]}
done | parallel
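
One thing to watch with generated command lines is quoting: the values are re-parsed by the shell that runs each line, so a value containing spaces or shell metacharacters would be split or interpreted. A possible hardening, sketched here with printf %q and a hypothetical helper named gen_commands (the value "2 with spaces" is also made up to show the effect):

```shell
#!/usr/bin/env bash
declare -A ar=([one]="1" [two]="2 with spaces")
declare -a ari=([0]="one" [1]="two")

# gen_commands (hypothetical) emits one command line per index.
# %q escapes each argument so it survives being parsed by a shell again.
gen_commands() {
  local i
  for i in "${!ari[@]}"; do
    printf 'echo %q %q %q\n' "$i" "${ari[i]}" "${ar[${ari[i]}]}"
  done
}

# Print the generated lines for inspection:
#   echo 0 one 1
#   echo 1 two 2\ with\ spaces
gen_commands
```

Piping the output of gen_commands into parallel, instead of just printing it, runs the commands concurrently in the same way as the loop above.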

      







