How to convert a ASCII NULL (NUL) into single spac

2019-05-30 09:05发布

When I BCP the data in sql serverOutput file after BCP

In the output file I am getting a NUL like character in the output file, and i want to replace this with the single blank space.

When I used the below sed command it removes the NUL character but between those 2 delimiter we don't have single space.

sed 's/\x0/ /g' output file name

Example: After sed command i am getting output file like below

PHMO||P00000005233
PHMO||P00000005752

But i need a single spacing in between those delimiter as

PHMO| |P00000005233
PHMO| |P00000005752

3条回答
混吃等死
2楼-- · 2019-05-30 09:40

This is an easy job for sed. Let's start creating a test file as you didn't provide one:

$ echo -e "one,\x00,two,\x00,three" > a
$ echo -e "four,\x00,five,\x00,six" >> a

As you can see it contains ASCII 0:

$ od -c a
0000000   o   n   e   ,  \0   ,   t   w   o   ,  \0   ,   t   h   r   e
0000020   e  \n   f   o   u   r   ,  \0   ,   f   i   v   e   ,  \0   ,
0000040   s   i   x  \n
0000044

Now let's run sed:

$ sed 's/\x00/ /g' a > b

And check the output:

$ cat b
one, ,two, ,three
four, ,five, ,six
$ od -c b
0000000   o   n   e   ,       ,   t   w   o   ,       ,   t   h   r   e
0000020   e  \n   f   o   u   r   ,       ,   f   i   v   e   ,       ,
0000040   s   i   x  \n
0000044
查看更多
看我几分像从前
3楼-- · 2019-05-30 09:59

The usual approach to this would be using tr. However, solutions with tr and sed are not portable. (The question is tagged "unix", so only portable solutions are interesting).

Here is a simple demo script

#!/bin/sh
date
tr '\000' ' ' <$0.in
date
sed -e 's/\x00/ /g' <$0.in

which I named foo, and its input (with the ASCII NUL shown here as ^@):

this is a null: "^@"

Running with GNU tr and sed:

Fri Apr  1 04:41:15 EDT 2016
this is a null: " "
Fri Apr  1 04:41:15 EDT 2016
this is a null: " "

With OSX:

Fri Apr  1 04:41:53 EDT 2016
this is a null: " "
Fri Apr  1 04:41:53 EDT 2016
this is a null: "^@"

With Solaris 10 (and 11, though there may be a recent change):

Fri Apr  1 04:38:08 EDT 2016
this is a null: ""
Fri Apr  1 04:38:08 EDT 2016
this is a null: ""

Bear in mind that sed is line-oriented, and that ASCII NUL is considered a binary (non-line) character. If you want a portable solution, then other tools such as Perl (which do not have that limitation) are useful. For that case one could add this to the script:

perl -np -e 's/\0/ /g' <$0.in

The intermediate tool awk is no better in this instance. Going to Solaris again, with these lines:

for awk in awk nawk mawk gawk
do
echo "** $awk:"
$awk '{ gsub("\0"," "); print; }' <$0.in
done

I see this output:

** awk:
awk: syntax error near line 1
awk: illegal statement near line 1
** nawk:
nawk: empty regular expression
 source line number 1
 context is
        { gsub("\0"," >>>  ") <<<
** mawk:
this is a null: " "
** gawk:
this is a null: " "

Further reading:

Unlike some historical implementations, this definition of the tr utility correctly processes NUL characters in its input stream. NUL characters can be stripped by using:

tr -d '\000'

查看更多
Fickle 薄情
4楼-- · 2019-05-30 10:07

it can be done quite easily with perl

cat -v inputfile.txt
abc^@def^@ghij^@klmnop^@qrstuv^@wxyz

perl -np -e 's/\0/ /g' <inputfile.txt >outputfile.txt

cat -v outputfile.txt
abc def ghij klmnop qrstuv wxyz 
查看更多
登录 后发表回答