In various bash scripts I have come across the following: $'\0'
An example with some context:
while read -r -d $'\0' line; do
    echo "${line}"
done <<< "${some_variable}"
What does $'\0' return as its value? Or, stated slightly differently, what does $'\0' evaluate to and why?
It is possible that this has been answered elsewhere. I did search prior to posting, but the limited number of characters or meaningful words in dollar-quote-slash-zero-quote makes it very hard to get results from Stack Overflow search or Google. So, if there are duplicate questions, please allow some grace and link them from this question.
In bash, $'\0' is precisely the same as '': an empty string. There is absolutely no point in using the special Bash syntax in this case.

Bash strings are always NUL-terminated, so if you manage to insert a NUL into the middle of a string, it will terminate the string. In this case, the C-escape \0 is converted to a NUL character, which then acts as a string terminator.

The -d option of the read builtin (which defines the line-end character for the input) expects a single character in its argument. It does not check if that character is the NUL character, so it will be equally happy using the NUL terminator of '' or the explicit NUL in $'\0' (which is also a NUL terminator, so it is probably no different). The effect, in either case, will be to read NUL-terminated data, as produced (for example) by find's -print0 option.

In the specific case of read -d '' line <<< "$var", it is impossible for $var to have an internal NUL character (for the reasons described above), so line will be set to the entire value of $var with leading and trailing whitespace removed. (As @mklement notes, this will not be apparent in the suggested code snippet, because read will have a non-zero exit status, even though the variable will have been set; read only returns success if the delimiter is actually found, and NUL cannot be part of a here-string.)

Note that there is a big difference between
read -d '' line
and
read -d'' line
The first one is correct. In the second one, the argument word passed to read is just -d, which means that the option's argument will be the next argument (in this case, line). read -d$'\0' line will have identical behaviour; in either case, the space is necessary. (So, again, no need for the C-escape syntax.)

$'\0'
expands the contained escape sequence \0 to the actual character it represents, which is the NUL character (\0), effectively an empty string in the shell.

This is Bash syntax. As per man bash:

Similarly, $'\n' expands to a newline and $'\r' expands to a carriage return.

It is technically true that the expansion
$'\0' will always become the empty string '' (a.k.a. the null string) in the shell (though not in zsh). Or, worded the other way around, $'\0' will never expand to an ASCII NUL (a byte with zero value), again except in zsh. Note that the two names are confusingly similar: NUL and null.

However, there is an additional (quite confusing) twist when we talk about read -d ''.

What read sees is the value '' (the null string) as the delimiter.

What read does is split the input from stdin on the character $'\0' (yes, an actual 0x00).

Expanded answer.
The question in the title is:
That means that we need to explain what $'\0' is expanded to.

What $'\0' is expanded to is very easy: it expands to the null string '' (in most shells, not in zsh).

But the example of use is:
That transforms the question to: what delimiter character does $'\0' expand to?
This holds a very confusing twist. To address it correctly, we need to take a full tour of when and how a NUL (a byte with zero value, 0x00) is used in shells.
Stream.
We need some NUL to work with. It is possible to generate NUL bytes from shell:
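One way to do this, as a minimal sketch: printf interprets the \0 escape in its format string and writes a real zero byte to stdout, and od (a standard POSIX utility) makes it visible. The sample string a\0b is only an illustration, not the original listing.

```bash
#!/usr/bin/env bash
# printf itself turns the \0 escape into a zero byte on stdout;
# the shell never has to hold the NUL inside a string.
# od -An -tx1 prints each byte as two hex digits, without offsets.
hex=$(printf 'a\0b' | od -An -tx1)
echo "$hex"    # the three bytes 61 00 62 (spacing varies between od implementations)
```

The zero byte survives here because it only ever travels through the pipe, never through a shell string.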
Variable.
A variable in shell will not store a NUL.
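A minimal sketch of such a test, assuming bash (the sample string and the variable names a and b are illustrative, not the original listing):

```bash
#!/usr/bin/env bash
# Try to store 'test', a NUL, and 'nul' in a variable via printf -v.
printf -v a 'test\0nul'
# Same attempt with ANSI C quoting.
b=$'test\0nul'
# In bash, both values stop at the zero byte: only 'test' survives.
printf '%s' "$a" | od -An -tx1    # 74 65 73 74
echo "${#a} ${#b}"                # 4 4
```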
The example is meant to be executed in bash, as only bash's printf has the -v option. But the example clearly shows that a string that contains a NUL will be cut at the NUL. Simple variables cut the string at the zero byte, as is reasonable to expect if the string is a C string, which must end with a NUL (\0). As soon as a NUL is found, the string must end.

Command substitution.
A NUL will work differently when used in a command substitution. This code should assign a value to the variable $a and then print it:

And it does, but with different results in different shells:
It is worth noting that bash (version 4.4) warns about this:
In command substitution the zero byte is silently ignored by the shell.
It is very important to understand that that does not happen in zsh.
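A minimal sketch of the bash behaviour, assuming bash (the sample bytes are illustrative). Bash 4.4 and later also print a warning on stderr, which is silenced here:

```bash
#!/usr/bin/env bash
# The inner printf writes the four bytes 'a', NUL, 'b', 'c'.
# Bash drops the NUL while collecting the output; the braces let the
# redirection hide the "ignored null byte" warning of bash 4.4+.
{ a=$(printf 'a\0bc'); } 2>/dev/null
echo "${#a}"                      # 3 in bash
printf '%s' "$a" | od -An -tx1    # 61 62 63: no zero byte left
```

Unlike the variable assignment case, nothing after the NUL is lost here: the NUL is removed and the remaining bytes are joined.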
Now that we have all the pieces about NUL, we can look at what read does.
What read does on a NUL delimiter.

That brings us back to the command read -d $'\0':

The $'\0' should have been expanded to a byte of value 0x00, but the shell cuts it and it actually becomes ''. That means that both $'\0' and '' are received by read as the same value.

Having said that, it may seem reasonable to write the equivalent construct:
And it is technically correct.
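A quick check of that equivalence, as a sketch assuming bash; first_token is a hypothetical helper, not part of the original answer:

```bash
#!/usr/bin/env bash
# first_token DELIM: read one token from a NUL-separated stream,
# passing DELIM as the option-argument of read -d.
first_token() {
    printf 'ab\0cd\0' | {
        IFS= read -r -d "$1" token
        printf '%s' "$token"
    }
}
with_escape=$(first_token $'\0')   # bash passes an empty string here
with_empty=$(first_token '')
[[ $with_escape == "$with_empty" ]] && echo "same token: $with_escape"
```

Both calls hand read the same empty option-argument, so both stop at the first zero byte and return ab.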
What a delimiter of '' actually does.
There are two sides to this point: one is the character after the -d option of read; the other, which is addressed here, is: what character will read use if given a delimiter as -d $'\0'?

The first side has been answered in detail above.

The second side is a very confusing twist, as the command read will actually read up to the next byte of value 0x00 (which is what $'\0' represents).

To actually show that that is the case:
when executed, the output will be:
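The original listing and its output are not reproduced here. A sketch consistent with the description that follows, assuming bash and the hypothetical input ab\0cd\0ef\ngh, with the expected output shown as comments:

```bash
#!/usr/bin/env bash
# demo_read_nul is an illustrative name. It feeds two NUL-terminated
# tokens plus a tail without a closing NUL to read -d ''.
demo_read_nul() {
    printf 'ab\0cd\0ef\ngh' | {
        while :; do
            IFS= read -r -d '' token
            status=$?
            # %q quotes the value so the embedded newline stays visible.
            printf 'exit %d: %q\n' "$status" "$token"
            (( status == 0 )) || break
        done
    }
}
demo_read_nul
# exit 0: ab
# exit 0: cd
# exit 1: $'ef\ngh'
```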
The first two exit 0 lines are successful reads done up to the next "zero byte", and both contain the correct values of ab and cd. The next read is the last one (as there are no more zero bytes) and contains the value $'ef\ngh' (yes, it also contains a newline).

All this goes to show (and prove) that read -d '' actually reads up to the next "zero byte", which is also known by the ASCII name NUL and should have been the result of a $'\0' expansion.

In short: we can safely state that read -d '' reads up to the next 0x00 (NUL).

Conclusion:
We must state that a read -d $'\0' will expand to a delimiter of 0x00. Using $'\0' is a better way to convey this meaning to the reader. As a code style thing: I write $'\0' to make my intentions clear.

One, and only one, character is used as a delimiter: the byte value 0x00 (even if in bash it happens to be cut).

Note: Either of these commands will print the hex values of the stream.
To complement rici's helpful answer:
Note that this answer is about bash. ksh and zsh also support $'...' strings, but their behavior differs:
* zsh does create and preserve NUL (null bytes) with $'\0'.
* ksh, by contrast, has the same limitations as bash, and additionally interprets the first NUL in a command substitution's output as the string terminator (cuts off at the first NUL, whereas bash strips such NULs).

$'\0' is an ANSI C-quoted string that technically creates a NUL (0x0 byte), but effectively results in the empty (null) string (same as ''), because any NUL is interpreted as the (C-style) string terminator by Bash in the context of arguments and here-docs/here-strings.

As such, it is somewhat misleading to use $'\0' because it suggests that you can create a NUL this way, when you actually cannot:

You cannot create NULs as part of a command argument or here-doc / here-string, and you cannot store NULs in a variable:
echo $'a\0b' | cat -v # -> 'a' - string terminated after 'a'
cat -v <<<$'a\0b' # -> 'a' - ditto

In the context of command substitutions, by contrast, NULs are stripped:

echo "$(printf 'a\0b')" | cat -v # -> 'ab' - NUL is stripped

However, you can pass NUL bytes via files and pipes.

printf 'a\0b' | cat -v # -> 'a^@b' - NUL is preserved, via stdout and pipe

Note that it is printf that is generating the NUL via its single-quoted argument, whose escape sequences printf then interprets and writes to stdout. By contrast, if you used printf $'a\0b', bash would again interpret the NUL as the string terminator up front and pass only 'a' to printf.

If we examine the sample code, whose intent is to read the entire input at once, across lines (I've therefore changed
line to content):

This will never enter the while loop body, because stdin input is provided by a here-string, which, as explained, cannot contain NULs.

Note that read actually does look for NULs with -d $'\0', even though $'\0' is effectively ''. In other words: read by convention interprets the empty (null) string to mean NUL as -d's option-argument, because NUL itself cannot be specified for technical reasons.

In the absence of an actual NUL in the input, read's exit code indicates failure, so the loop is never entered.

However, even in the absence of the delimiter, the value is read, so to make this code work with a here-string or here-doc, it must be modified as follows:
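A sketch of that modification, assuming bash; the sample value of some_variable and the chunks array are illustrative:

```bash
#!/usr/bin/env bash
some_variable=$'line one\nline two'   # illustrative multi-line input

chunks=()
# read fails because no NUL is found, but it still fills content;
# the [[ -n ... ]] test lets the body run for that final chunk.
while read -r -d $'\0' content || [[ -n $content ]]; do
    chunks+=("$content")
done <<< "$some_variable"

printf 'chunks: %d\n' "${#chunks[@]}"   # chunks: 1
printf '[%s]\n' "${chunks[0]}"          # both lines, trailing newline trimmed
```

On the second pass read hits end of input with nothing left, content is empty, and the loop ends.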
However, as @rici notes in a comment, with a single (multi-line) input string, there is no need to use while at all:

This reads the entire content of $some_variable, while trimming leading and trailing whitespace (which is what read does with $IFS at its default value, $' \t\n').

@rici also points out that if such trimming weren't desired, a simple content=$some_variable would do.

Contrast this with input that actually contains NULs, in which case while is needed to process each NUL-separated token (but without the || [[ -n $<var> ]] clause); find -print0 outputs filenames separated by a NUL each:

Note the use of IFS= read ... to suppress trimming of leading and trailing whitespace, which is undesired in this case, because input filenames must be preserved as-is.
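A sketch of such a loop, assuming bash and a find that supports -print0 (GNU and BSD find do); the temporary directory and the file names are made up for illustration:

```bash
#!/usr/bin/env bash
dir=$(mktemp -d)
touch "$dir/plain" "$dir/ padded name "   # second name has leading and trailing spaces

files=()
# Process substitution keeps the loop in the current shell, so files survives it.
while IFS= read -r -d '' file; do
    files+=("$file")
done < <(find "$dir" -type f -print0)

printf '%d files\n' "${#files[@]}"   # 2 files
printf '[%s]\n' "${files[@]##*/}"    # brackets make the preserved spaces visible
rm -rf "$dir"
```

Without IFS=, the spaces around the second name would be trimmed and the stored path would no longer match the file on disk.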