Identifying and removing null characters in UNIX

I have a text file containing unwanted null characters (ASCII NUL, \0). When I view it in vi, I see ^@ symbols interleaved with the normal text. How can I:

Identify which lines in the file contain null characters? I have tried grepping for \0 and \x0, but this did not work.
Remove the null characters? Running strings on the file cleaned it up, but I'm just wondering if this is the best way?

Tags: unix shell null special-characters

8 Answers

兄弟一词,经得起流年.

#2 · 2019-01-06 10:21

Here is an example of how to remove NUL characters in place using ex:

ex -s +"%s/\%x00//g" -cwq nulls.txt

and for multiple files:

ex -s +'bufdo!%s/\%x00//g' -cxa *.txt

For recursion, you can use the glob pattern **/*.txt (if your shell supports it).

This is useful for scripting, since sed's -i option is a non-standard extension whose syntax differs between BSD and GNU sed.
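Where ex is not installed, another route for scripts (not part of the answer above) is tr writing to a temporary file that is then copied back over the original. This is a minimal sketch, assuming mktemp is available; the function name strip_nuls is made up for illustration.

```shell
# strip_nuls FILE: hypothetical helper that removes every NUL byte from FILE in place.
strip_nuls() {
    file=$1
    # Assumption: mktemp exists. It is widespread but not required by POSIX.
    tmp=$(mktemp) || return 1
    # tr -d '\000' copies stdin to stdout and drops NUL bytes along the way.
    if tr -d '\000' < "$file" > "$tmp"; then
        # Copy back with cat rather than mv so the original keeps its permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage would look like strip_nuls nulls.txt. Unlike the ex version, this never loads the whole file into an editor buffer, which may matter for very large files.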


forever°为你锁心

#3 · 2019-01-06 10:24

I discovered the following, which prints out which lines, if any, have null characters:

perl -ne '/\000/ and print;' file-with-nulls

Also, an octal dump can tell you if there are nulls. Use -b so that od prints one byte per field; the default two-byte words can hide a null:

od -b file-with-nulls | grep ' 000'


在下西门庆

#4 · 2019-01-06 10:25

I used:

recode UTF-16..UTF-8 <filename>

to get rid of the zero bytes in the file.


小情绪 Triste *

#5 · 2019-01-06 10:32

A large number of unwanted NUL characters, say one every other byte, usually indicates that the file is encoded in UTF-16, in which case you should use iconv to convert it to UTF-8.
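A minimal sketch of that conversion, assuming the file starts with a byte order mark so that iconv can work out the byte order itself; the function name utf16_to_utf8 is made up for illustration.

```shell
# utf16_to_utf8 FILE: hypothetical helper that writes FILE, converted to UTF-8, on stdout.
utf16_to_utf8() {
    # With -f UTF-16, iconv reads the byte order mark to choose the byte order.
    iconv -f UTF-16 -t UTF-8 "$1"
}
```

Usage would look like utf16_to_utf8 input.txt > output.txt. If the file has no byte order mark, naming the byte order explicitly (UTF-16LE or UTF-16BE) avoids leaving the choice to iconv.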


干净又极端

#6 · 2019-01-06 10:33

I faced the same error with:

import codecs as cd
f=cd.open(filePath,'r','ISO-8859-1')

I solved the problem by changing the encoding to UTF-16:

f=cd.open(filePath,'r','utf-16')


Fickle 薄情

#7 · 2019-01-06 10:34

If the lines in the file end with \r\n\000 then what works is to delete the \n\000 then replace the \r with \n.

tr -d '\n\000' <infile | tr '\r' '\n' >outfile
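To see what the pipeline does, here it is wrapped in a function that reads stdin and writes stdout (the name fix_crlf_nul is made up for illustration). Given the bytes a \r \n \0 b \r \n \0, the first tr leaves a \r b \r and the second turns that into two lines, a and b.

```shell
# fix_crlf_nul: hypothetical wrapper around the pipeline above (stdin to stdout).
fix_crlf_nul() {
    # Step 1 deletes every newline and NUL; step 2 turns each carriage return into a newline.
    tr -d '\n\000' | tr '\r' '\n'
}

# Example call on a two-line sample whose lines end with \r\n\000.
printf 'a\r\n\000b\r\n\000' | fix_crlf_nul
```

Note that this only suits files where every line really ends with \r\n\000, since a bare \n elsewhere would be deleted and its lines joined.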

