I have a text file containing unwanted null characters (ASCII NUL, \0). When I try to view it in vi I see ^@ symbols interleaved in the normal text. How can I:

1. Identify which lines in the file contain null characters? I have tried grepping for \0 and \x0, but this did not work.
2. Remove the null characters? Running strings on the file cleaned it up, but I'm just wondering if this is the best way?
Here is an example of how to remove NULL characters using ex (in-place), first for a single file and then for multiple files:
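A minimal sketch of such ex invocations, assuming ex is provided by Vim (the \%x00 pattern matches a NUL byte in Vim regex, and the e substitute flag keeps ex from aborting when no NULs are found; the file names are placeholders):

    # remove NUL bytes from one file, in place
    ex -s +'%s/\%x00//ge' -c 'wq' file.txt

    # the same substitution applied to every buffer, for several files at once
    ex -s +'bufdo! %s/\%x00//ge' -c 'xa' *.txt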
For recursion, you may use the globbing option **/*.txt (if it is supported by your shell). This is useful for scripting, since sed's -i parameter is a non-standard extension. See also: How to check if the file is a binary file and read all the files which are not?
I discovered the following, which prints out which lines, if any, have null characters:
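A sketch of such a check, assuming GNU grep with PCRE support; -a forces grep to treat the file as text and -n prints the line numbers (file.txt is a placeholder):

    # list the lines that contain a NUL byte, with line numbers
    grep -Pan '\x00' file.txt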
Also, an octal dump can tell you if there are nulls:
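For example, od -c prints each NUL byte as \0, so grepping the dump shows whether any are present:

    # dump the file byte by byte and look for \0 entries
    od -c file.txt | grep '\\0'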
I used the following to get rid of the zeroes in the file:
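A minimal sketch of such a cleanup, assuming GNU sed (both the \x00 escape and the -i flag are GNU extensions); tr is a portable alternative that writes to a new file:

    # strip NUL bytes in place (GNU sed)
    sed -i 's/\x00//g' file.txt

    # portable alternative: delete NUL bytes with tr
    tr -d '\0' < file.txt > clean.txt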
A large number of unwanted NUL characters, say one every other byte, indicates that the file is encoded in UTF-16 and that you should use iconv to convert it to UTF-8.
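For example, a conversion along these lines (the input and output file names are placeholders):

    # reinterpret the file as UTF-16 and write a UTF-8 copy
    iconv -f UTF-16 -t UTF-8 file.txt > file.utf8.txt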
I faced the same error and solved the problem by changing the encoding to utf-16.
If the lines in the file end with \r\n\000, then what works is to delete the \n\000 and then replace the \r with \n.
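A sketch of that two-step cleanup using tr, which operates on raw bytes (file.txt and clean.txt are placeholder names):

    # delete every newline and NUL byte, then turn the carriage returns into newlines
    tr -d '\n\000' < file.txt | tr '\r' '\n' > clean.txt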