Accessing Associative Arrays in GNU Parallel
Suppose the following in Bash:
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do
echo $i ${ari[i]} ${ar[${ari[i]}]}
done
0 one 1
1 two 2
Is it possible to do this with GNU Parallel, making sure to use an associative array index and not a sequence? Could the fact that arrays cannot be exported make it difficult, if not impossible?
A lot has happened in 4 years. GNU Parallel 20190222 ships with env_parallel, a shell function that exports most of the environment to the commands run by GNU Parallel.
It is supported in ash, bash, csh, dash, fish, ksh, mksh, pdksh, sh, tcsh, and zsh. Support varies from shell to shell (see https://www.gnu.org/software/parallel/env_parallel.html for details). For bash you would do:
# Load the env_parallel function
. "$(which env_parallel.bash)"
# Ignore variables currently defined
env_parallel --session
[... define your arrays, functions, aliases, and variables here ...]
env_parallel my_command ::: values
# The environment is also exported to remote systems if they use the same shell
(echo value1; echo value2) | env_parallel -Sserver1,server2 my_command
# Optional cleanup
env_parallel --end-session
So in your case, something like this:
env_parallel --session
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
foo() {
for i in ${!ari[@]}; do
echo $i ${ari[i]} ${ar[${ari[i]}]}
done;
}
env_parallel foo ::: dummy
env_parallel --end-session
As you might expect, env_parallel is a little slower than plain parallel.
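The function above runs the whole loop as a single job. A minimal sketch that instead gives each array index its own job might look like the following. It assumes env_parallel.bash is on the PATH, skips --session for brevity (so the whole environment is exported), and falls back to printing a serial reference result when env_parallel is not installed. The variable name expected is only illustrative.

```shell
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)

# Serial reference result, computed in the current shell.
expected=$(for i in "${!ari[@]}"; do
             echo "$i ${ari[i]} ${ar[${ari[i]}]}"
           done)

if command -v env_parallel.bash >/dev/null 2>&1; then
  # Load the env_parallel function (assumption: it is installed on PATH).
  . "$(command -v env_parallel.bash)"
  # One job per index; {} is replaced by the index, -k keeps input order.
  env_parallel -k 'echo {} "${ari[{}]}" "${ar[${ari[{}]}]}"' ::: "${!ari[@]}"
else
  echo "env_parallel.bash not found; serial result:"
  echo "$expected"
fi
```

Because the arrays are copied into each job's shell, the command string can index them directly, just as the loop body does.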
Yes, it makes it more difficult. But not impossible.
You cannot export an array directly. However, you can turn an array into a description of that array with declare -p, and you can store that description in an exported variable. In fact, you can store the description in a function and export the function, although this is a bit of a hack. You also have to deal with the fact that running declare inside a function makes the declared variables local, so you need to inject the -g flag into the generated declare commands.
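To see why the -g flag matters, here is a small sketch (bash 4.2 or later is assumed, since that is where declare -g appeared). The helper names restore_local and restore_global are made up for this illustration.

```shell
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)

# Capture a textual description of the array, then remove the array itself.
desc=$(declare -p ar)
unset ar

# Replaying the description inside a function creates a *local* array...
restore_local()  { eval "$desc"; }
# ...unless -g is injected right after "declare -A".
restore_global() { eval "${desc/#declare -A/declare -Ag}"; }

restore_local
after_local=${ar[one]-unset}
restore_global
after_global=${ar[one]-unset}

echo "after restore_local:  $after_local"
echo "after restore_global: $after_global"
```

Only the second call leaves ar visible outside the function, which is the reason the generated function bodies need -g.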
UPDATE: Since the shellshock patches, the hack described above no longer works. A small variation on the theme does work, so if your bash is up to date, skip to the "ShellShock version" section below.
So here's one possible way to generate the exported function:
make_importer () {
local func=$1; shift;
export $func='() {
'"$(for arr in $@; do
declare -p $arr|sed '1s/declare -./&g/'
done)"'
}'
}
Now we can create our arrays and create an exported importer for them:
$ declare -A ar='([one]="1" [two]="2" )'
$ declare -a ari='([0]="one" [1]="two")'
$ make_importer ar_importer ar ari
And see what we've built:
$ echo "$ar_importer"
() {
declare -Ag ar='([one]="1" [two]="2" )'
declare -ag ari='([0]="one" [1]="two")'
}
OK, the formatting is a bit ugly, but this isn't about whitespace. Here is the hack, though: all we have is an ordinary (albeit exported) variable, but when it is imported into a subshell, a bit of magic happens [Note 1]:
$ bash -c 'echo "$ar_importer"'
$ bash -c 'type ar_importer'
ar_importer is a function
ar_importer ()
{
declare -Ag ar='([one]="1" [two]="2" )';
declare -ag ari='([0]="one" [1]="two")'
}
And it looks prettier, too. Now we can call it in the command we pass to parallel:
$ printf %s\\n ${!ari[@]} |
parallel \
'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2
Or to run on a remote computer:
$ printf %s\\n ${!ari[@]} |
parallel -S localhost --env ar_importer \
'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2
ShellShock version.
Unfortunately, the flurry of fixes that followed shellshock makes it a little harder to accomplish the same task. Specifically, a function named foo must now be exported as an environment variable named BASH_FUNC_foo%%, which is not a valid variable name (because of the percent signs). But we can still define the function (using eval) and export it like this:
make_importer () {
local func=$1; shift;
# An alternative to eval is:
# . /dev/stdin <<< ...
# but that is neither less nor more dangerous than eval.
eval "$func"'() {
'"$(for arr in $@; do
declare -p $arr|sed '1s/declare -./&g/'
done)"'
}'
export -f "$func"
}
As above, we can build the arrays and make an importer:
$ declare -A ar='([one]="1" [two]="2" )'
$ declare -a ari='([0]="one" [1]="two")'
$ make_importer ar_importer ar ari
But now the function actually exists in our environment as a function:
$ type ar_importer
ar_importer is a function
ar_importer ()
{
declare -Ag ar='([one]="1" [two]="2" )';
declare -ag ari='([0]="one" [1]="two")'
}
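As a quick check that does not involve parallel at all, a child bash can be asked to call the exported function. The sketch below repeats make_importer (with quoting added around "$@" and "$arr") so that it is self-contained. The variable name child_out is only illustrative, and the exact name of the environment entry printed by env depends on how your bash was patched.

```shell
#!/usr/bin/env bash
# make_importer as in the ShellShock version above, repeated for completeness.
make_importer () {
  local func=$1; shift
  eval "$func"'() {
  '"$(for arr in "$@"; do
        declare -p "$arr" | sed '1s/declare -./&g/'
      done)"'
  }'
  export -f "$func"
}

declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)
make_importer ar_importer ar ari

# Show how the exported function appears in the environment (format varies).
env | grep 'ar_importer' || true

# A child bash imports the function and can rebuild both arrays from it.
child_out=$(bash -c 'ar_importer
  for i in "${!ari[@]}"; do
    echo "$i ${ari[i]} ${ar[${ari[i]}]}"
  done')
echo "$child_out"
```

If the child shell prints the same three columns as the original loop, the importer has survived the trip through the environment.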
Since it has been exported, we can call it in the command we pass to parallel:
$ printf %s\\n ${!ari[@]} |
parallel \
'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
0 one 1
1 two 2
Unfortunately, it no longer works on a remote computer (at least with the version of parallel I have) because parallel doesn't know how to export functions. If that gets fixed, the following should work:
$ printf %s\\n ${!ari[@]} |
parallel -S localhost --env ar_importer \
'ar_importer; echo "{}" "${ari[{}]}" "${ar[${ari[{}]}]}"'
However, there is one important caveat: you cannot export a function from a bash with the shellshock patch to a bash without the patch, or vice versa. So even if parallel gets fixed, the remote machine must be running the same bash version as the local machine. (Or at least both or neither must have the shellshock patches.)
Note 1: The magic is that bash marks an exported variable as a function by having the exported value start with () {. So if you export a variable whose value starts with those characters and is a syntactically correct function, bash subshells will treat it as a function. (Don't expect non-bash subshells to understand it, though.)
GNU Parallel is a Perl program. If the Perl program cannot access the variables, then I don't see how it could pass them on to the commands it runs.
So, if you want to parallelize your loop, I see two options:
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do
sem -j+0 echo $i ${ari[i]} ${ar[${ari[i]}]}
done
The sem solution will not guard against mixed output.
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in ${!ari[@]}; do
echo echo $i ${ari[i]} ${ar[${ari[i]}]}
done | parallel
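A variation on the second option, sketched here as one possibility rather than a recommendation: instead of generating command lines for the shell to re-parse, expand the arrays in the parent shell into tab-separated rows and let parallel split them with --colsep. The variable name rows is only illustrative, and the script falls back to printing the rows when GNU Parallel is not installed.

```shell
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)

# Expand the arrays in the parent shell: one tab-separated row per job.
rows=$(for i in "${!ari[@]}"; do
         printf '%s\t%s\t%s\n' "$i" "${ari[i]}" "${ar[${ari[i]}]}"
       done)

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # --colsep splits each row into {1} {2} {3}; -k keeps input order.
  printf '%s\n' "$rows" | parallel -k --colsep '\t' echo {1} {2} {3}
else
  echo "GNU Parallel not found; rows that would be sent:"
  printf '%s\n' "$rows"
fi
```

Since the jobs receive plain values rather than array references, nothing has to be exported, at the cost of doing all array lookups serially up front.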