Assume the following in Bash:
declare -A ar='([one]="1" [two]="2" )'
declare -a ari='([0]="one" [1]="two")'
for i in "${!ari[@]}"; do
  echo "$i ${ari[i]} ${ar[${ari[i]}]}"
done
Output:
0 one 1
1 two 2
Can the same be done with GNU Parallel, making sure to use the index of the associative array, not the sequence? Does the fact that arrays can't be exported make this difficult, if not impossible?
GNU Parallel is a Perl program. If the Perl program cannot access the variables, then I do not see how it could pass them on to the jobs it starts.
So if you want to parallelize the loop I see two options:
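One way these two options could look is sketched below; it is not necessarily the answerer's original code. It assumes GNU Parallel (which also provides `sem`) is installed, and the semaphore name `ar_demo_$$` is made up for the example. In both variants the parent shell expands the arrays, so the jobs never need to see them.

```bash
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)

if command -v sem >/dev/null 2>&1; then
  # Option 1: the parent shell expands the arrays and prints one
  # ready-made command per line; parallel runs each line as a job.
  for i in "${!ari[@]}"; do
    printf 'echo %q %q %q\n' "$i" "${ari[i]}" "${ar[${ari[i]}]}"
  done | parallel

  # Option 2: start each iteration in the background through sem.
  # (ar_demo_$$ is an arbitrary semaphore name chosen for this sketch.)
  for i in "${!ari[@]}"; do
    sem -j+0 --id "ar_demo_$$" echo "$i" "${ari[i]}" "${ar[${ari[i]}]}"
  done
  sem --wait --id "ar_demo_$$"   # block until all sem jobs have finished
else
  echo "GNU Parallel (parallel, sem) not found" >&2
fi
```

Neither variant guarantees that the lines come out in index order.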
The `sem` solution will not guard against mixed output.

Yes, it makes it trickier. But not impossible.
You can't export an array directly. However, you can turn an array into a description of that same array using `declare -p`, and you can store that description in an exportable variable. In fact, you can store that description in a function and export the function, although it's a bit of a hack, and you have to deal with the fact that executing a `declare` command inside a function makes the declared variables local, so you need to introduce a `-g` flag into the generated `declare` commands.

UPDATE: Since shellshock, the above hack doesn't work. A small variation on the theme does work. So if your bash has been updated, please skip down to the subtitle "ShellShock Version".
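The point about `declare -p` and the `-g` flag can be seen in isolation with a small sketch. The helper names `restore_local` and `restore_global` are invented for this illustration, and it assumes bash 4.2 or later (for `declare -g`).

```bash
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)
desc=$(declare -p ar)   # textual description of the array, a plain string
unset ar

# Replaying the description inside a function creates a *local* array...
restore_local()  { eval "$desc"; }
# ...unless -g is spliced in right after the word "declare".
restore_global() { eval "declare -g ${desc#declare }"; }

restore_local
echo "after restore_local:  ${ar[one]-unset}"   # prints: unset
restore_global
echo "after restore_global: ${ar[one]-unset}"   # prints: 1
```

Because `desc` is an ordinary string, it can be exported or embedded in a function body, which is what the rest of the answer builds on.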
So, here's one possible way of generating the exportable function:
Now we can create our arrays and build an exported importer for them:
And see what we've built
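Putting the three steps above together, a rough sketch of the pre-shellshock approach might look like this. The function name `make_importer_legacy` is made up here, and the body is one possible implementation, not necessarily the original. On a patched bash it only produces an ordinary exported variable; child shells will not turn it into a function.

```bash
#!/usr/bin/env bash
# make_importer_legacy NAME ARRAY...
# Exports a variable NAME whose value looks like a function definition
# that re-declares the given arrays globally.
make_importer_legacy() {
  local func=$1 arr decl body=
  shift
  for arr in "$@"; do
    decl=$(declare -p "$arr") || return
    # Splice -g in after "declare" so the arrays would not be function-local.
    body+="  declare -g ${decl#declare }"$'\n'
  done
  # Pre-shellshock trick: the value starts exactly with "() {".
  export "$func=() {
$body}"
}

declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)
make_importer_legacy ar_importer ar ari

echo "$ar_importer"   # inspect what was built
```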
OK, the formatting is a bit ugly, but this isn't about whitespace. Here's the hack, though. All we've got there is an ordinary (albeit exported) variable, but when it gets imported into a subshell, a bit of magic happens [Note 1]:
And it looks prettier, too. Now we can run it in the command we give to `parallel`:

Or, for execution on a remote machine:
ShellShock version.
Unfortunately the flurry of fixes to shellshock makes it a little harder to accomplish the same task. In particular, it is now necessary to export a function named `foo` as the environment variable named `BASH_FUNC_foo%%`, which is an invalid name (because of the percent signs). But we can still define the function (using `eval`) and export it, as follows:

As above, we can then build the arrays and make an exporter:
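A minimal sketch of such a definition and its use follows, assuming bash 4.2 or later. The body of `make_importer` is one possible implementation, not necessarily the original, and the function name passed to it must be a trusted identifier because it goes through `eval`.

```bash
#!/usr/bin/env bash
# make_importer NAME ARRAY...
# Defines and exports a function NAME that re-declares the given arrays.
make_importer() {
  local func=$1 arr decl body=
  shift
  for arr in "$@"; do
    decl=$(declare -p "$arr") || return
    # Add -g so the arrays end up global when the importer is called.
    body+="declare -g ${decl#declare }"$'\n'
  done
  # Define a real function, then export it with export -f.
  eval "$func() {
$body}"
  export -f "$func"
}

declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)
make_importer ar_importer ar ari

# A child bash inherits the function, not the arrays; calling it rebuilds them.
bash -c 'ar_importer
for i in "${!ari[@]}"; do
  echo "$i ${ari[i]} ${ar[${ari[i]}]}"
done'
```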
But now, the function actually exists in our environment as a function:
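To illustrate what that looks like, here is a small sketch that uses a hand-written stand-in for the generated importer (its contents are assumed for the example). The exact name of the environment entry depends on the bash build, so the `grep` pattern is deliberately loose.

```bash
#!/usr/bin/env bash
# Hand-written stand-in for a generated importer (illustrative contents).
ar_importer() {
  declare -g -A ar=([one]=1 [two]=2)
  declare -g -a ari=([0]=one [1]=two)
}
export -f ar_importer

declare -f ar_importer               # the function as this shell sees it
env | grep 'BASH_FUNC_ar_importer'   # the environment entry that carries it

# A child bash can call the function and then use the arrays.
bash -c 'ar_importer; echo "${ari[1]} -> ${ar[${ari[1]}]}"'
```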
Since it has been exported, we can run it in the command we give to `parallel`:

Unfortunately, it no longer works on a remote machine (at least with the version of `parallel` I have available) because `parallel` doesn't know how to export functions. If this gets fixed, the following should work:

However, there is one important caveat: you cannot export a function from a bash with the shellshock patch to a bash without the patch, or vice versa. So even if `parallel` gets fixed, the remote machine(s) must be running the same bash version as the local machine. (Or at least, either both or neither must have the shellshock patches.)

Note 1: The magic is that the way
`bash` marks an exported variable as a function is that the exported value starts exactly with `() {`. So if you export a variable which starts with those characters and is a syntactically correct function, then `bash` subshells will treat it as a function. (Don't expect non-bash subshells to understand, though.)

A lot has happened in 4 years. GNU Parallel 20190222 comes with
`env_parallel`. This is a shell function that makes it possible to export most of the environment to the commands run by GNU Parallel.

It is supported in `ash`, `bash`, `csh`, `dash`, `fish`, `ksh`, `mksh`, `pdksh`, `sh`, `tcsh`, and `zsh`. The support varies from shell to shell (see details on https://www.gnu.org/software/parallel/env_parallel.html). For `bash` you would do:

So in your case something like this:
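A hedged sketch of what that might look like for the arrays in the question is shown below. It assumes `env_parallel.bash` from GNU Parallel 20190222 or later is on `PATH`; the `-k` option is added here only to keep the output in input order.

```bash
#!/usr/bin/env bash
declare -A ar=([one]=1 [two]=2)
declare -a ari=([0]=one [1]=two)

if command -v env_parallel.bash >/dev/null 2>&1; then
  # Load the env_parallel shell function into this shell.
  . "$(command -v env_parallel.bash)"
  # The single quotes delay expansion until the job runs, where
  # env_parallel has made ar and ari available again.
  env_parallel -k echo {} '${ari[{}]}' '${ar[${ari[{}]}]}' ::: "${!ari[@]}"
else
  echo "env_parallel.bash not found" >&2
fi
```

The indices come from `"${!ari[@]}"`, so each job looks up its key in `ari` and then the value in `ar`, as the original loop does.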
As you might expect, `env_parallel` is a bit slower than pure `parallel`.