While trying to process a list of file-/foldernames correctly (see my other questions) through the use of a NULL-character as a delimiter I stumbled over a strange behaviour of Bash that I don't understand:
When assigning a string containing one or more NULL characters to a variable, the NULL characters are lost / ignored / not stored.
For example,
echo -ne "n\0m\0k" | od -c # -> 0000000 n \0 m \0 k
But:
VAR1=`echo -ne "n\0m\0k"`
echo -ne "$VAR1" | od -c # -> 0000000 n m k
This means that I would need to write that string to a file (for example, in /tmp) and read it back from there if piping directly is not desired or feasible.
When executing these scripts in Z shell (zsh) the strings containing \0 are preserved in both cases, but sadly I can't assume that zsh is present in the systems running my script while Bash should be.
How can strings containing \0 chars be stored or handled efficiently without losing any (meta-) characters?
I love jeff's answer. I would use Base64 encoding instead of xxd. It saves a little space and would be (I think) more recognizable as to what is intended.
As for -e, it is not needed because the shell already interprets the escape before it even gets to echo. I also seem to recall something about "echo -e" being unsafe if you're echoing any user input as they could inject escape sequences that echo will interpret and end up with bad things.
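A minimal sketch of the Base64 idea, assuming the base64 utility from GNU coreutils (or a compatible one that accepts -d) is installed. The variable name var_b64 is chosen for illustration, and printf is used instead of echo -ne so the escapes behave the same in any POSIX shell:

```shell
# Encode the binary data so the variable only ever holds printable text.
var_b64=$(printf 'n\0m\0k' | base64)

# The variable now contains plain ASCII with no NUL bytes.
printf '%s\n' "$var_b64"

# Decode whenever the original bytes are needed again.
printf '%s' "$var_b64" | base64 -d | od -c
```

The encoded form is about a third larger than the raw data, which is the price of being able to keep it in a variable.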
In Bash, you can't store the NULL-character in a variable.
You may, however, store a plain hex dump of the data (and later reverse this operation) by using the xxd command.
As others have already stated, you can't store or use a NUL char in a variable or in a command-line argument.
However, you can handle any binary data (including NUL chars) in pipes and in files.
So to answer your last question:
You can use files or pipes to store and handle efficiently any string with any meta-characters.
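As a rough illustration of the file-and-pipe route, the sketch below keeps a NUL-delimited list in a temporary file and hands it to a consumer without ever placing the raw bytes in a variable. The item names and the tmp, nul_count and items variables are made up for the example. Both mktemp and xargs -0 are assumptions: they are widely available (GNU, BSD) but were not part of POSIX 7.

```shell
# Store the NUL-delimited data in a temporary file instead of a variable.
tmp=$(mktemp)
printf 'first item\0second\nline\0third' > "$tmp"

# Count the NUL bytes that are still present in the file.
nul_count=$(tr -cd '\0' < "$tmp" | wc -c)

# Feed each NUL-delimited item to a command as exactly one argument,
# even when the item contains spaces or newlines.
items=$(xargs -0 -n 1 printf '[%s]\n' < "$tmp")

rm -f "$tmp"
printf 'NUL bytes: %s\n%s\n' "$nul_count" "$items"
```

Only the bracketed, already-processed output ends up in a variable here; the raw data stays in the file and in the pipe.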
If you plan to handle data, you should additionally note that command substitution ($(command..) or `command..`) has an additional twist on top of being a variable: it will eat your trailing newlines.
Bypassing limitations
If you want to use variables, then you must get rid of the NUL char by encoding it, and various other solutions here give clever ways to do that (an obvious way is to use for example base64 encoding/decoding).
If you are concerned about memory or speed, you'll probably want to use a minimal parser and only quote the NUL character (and the quoting character itself). In this case this would help you:
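A minimal sketch of such a quoting filter, assuming GNU sed (which passes NUL bytes through and accepts \x00 in a regex). Writing NUL as \0000 is a choice made for this sketch, because both Bash's echo -e and printf '%b' decode that form. It may differ from the filter the answer originally showed.

```shell
# quote: escape backslashes first, then rewrite every NUL byte as the
# octal escape \0000 so the stream is safe to keep in a variable.
quote() {
    sed 's/\\/\\\\/g; s/\x00/\\0000/g'
}

# Quick check: two NUL bytes and one backslash go in, only text comes out.
quoted=$(printf 'n\0m\\k\0' | quote)
printf '%s\n' "$quoted"
```

Backslashes have to be doubled before NUL is rewritten. Otherwise the backslashes introduced for NUL would themselves be doubled and the decoder would no longer see an escape.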
Then you can secure your data before storing it in variables and command-line arguments by piping your sensitive data into quote, which will output a safe data stream without NUL chars. You can get back the original string (with NUL chars) by using echo -en "$var_quoted", which will send the correct string to standard output.
Example:
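A sketch of what such a round trip could look like. The one-line quote filter here is a hypothetical stand-in (GNU sed assumed, NUL written as \0000), sample is an invented data generator, and printf '%b' plays the role of echo -en so that the snippet also runs in shells whose echo lacks -e:

```shell
# Hypothetical quoting filter: doubles backslashes, writes NUL as \0000.
quote() { sed 's/\\/\\\\/g; s/\x00/\\0000/g'; }

# Invented sample data with embedded NUL bytes and a literal backslash.
sample() { printf 'a\0b\\c\0d'; }

# Store: the quoted stream contains no NUL bytes, so a variable can hold it.
var_quoted=$(sample | quote)

# Use: decode the escapes and compare the result with the original stream.
original_size=$(sample | wc -c)
restored_size=$(printf '%b' "$var_quoted" | wc -c)
printf 'original: %s bytes, restored: %s bytes\n' "$original_size" "$restored_size"
printf '%b' "$var_quoted" | od -c
```

In Bash, echo -en "$var_quoted" decodes the same escapes, so it can replace printf '%b' in the last two steps.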
Note: use | hd to get a clean view of your data in hexadecimal and check that you didn't lose any NUL chars.
Changing tools
Remember you can go pretty far with pipes without using variables or command-line arguments. Don't forget, for instance, the <(command ...) construct, which creates a named pipe (a sort of temporary file).
EDIT: the first implementation of quote was incorrect and would not deal correctly with the \ special characters interpreted by echo -en. Thanks @xhienne for spotting that.
Use uuencode and uudecode for POSIX portability
xxd and base64 are not POSIX 7, but uuencode is.
Output:
Unfortunately I don't see a POSIX 7 alternative for the Bash process substitution extension <() except writing to a file, and uuencode and uudecode are not installed in Ubuntu 12.04 by default (sharutils package).
So I guess that the real answer is: don't use Bash for this; use Python or some other saner interpreted language.