When I BCP the data in sql server
In the output file I am getting a NUL like character in the output file, and i want to replace this with the single blank space.
When I used the below sed
command it removes the NUL character but between those 2 delimiter we don't have single space.
sed 's/\x0/ /g' output file name
Example: After sed command i am getting output file like below
PHMO||P00000005233
PHMO||P00000005752
But i need a single spacing in between those delimiter as
PHMO| |P00000005233
PHMO| |P00000005752
The usual approach to this would be using tr
. However, solutions with tr
and sed
are not portable. (The question is tagged "unix", so only portable solutions are interesting).
Here is a simple demo script
#!/bin/sh
date
tr '\000' ' ' <$0.in
date
sed -e 's/\x00/ /g' <$0.in
which I named foo
, and its input (with the ASCII NUL shown here as ^@
):
this is a null: "^@"
Running with GNU tr
and sed
:
Fri Apr 1 04:41:15 EDT 2016
this is a null: " "
Fri Apr 1 04:41:15 EDT 2016
this is a null: " "
With OSX:
Fri Apr 1 04:41:53 EDT 2016
this is a null: " "
Fri Apr 1 04:41:53 EDT 2016
this is a null: "^@"
With Solaris 10 (and 11, though there may be a recent change):
Fri Apr 1 04:38:08 EDT 2016
this is a null: ""
Fri Apr 1 04:38:08 EDT 2016
this is a null: ""
Bear in mind that sed
is line-oriented, and that ASCII NUL is considered a binary (non-line) character. If you want a portable solution, then other tools such as Perl (which do not have that limitation) are useful. For that case one could add this to the script:
perl -np -e 's/\0/ /g' <$0.in
The intermediate tool awk
is no better in this instance. Going to Solaris again, with these lines:
for awk in awk nawk mawk gawk
do
echo "** $awk:"
$awk '{ gsub("\0"," "); print; }' <$0.in
done
I see this output:
** awk:
awk: syntax error near line 1
awk: illegal statement near line 1
** nawk:
nawk: empty regular expression
source line number 1
context is
{ gsub("\0"," >>> ") <<<
** mawk:
this is a null: " "
** gawk:
this is a null: " "
Further reading:
- sed - stream editor (POSIX)
- tr - translate characters (POSIX), which notes
Unlike some historical implementations, this definition of the tr utility correctly processes NUL characters in its input stream. NUL characters can be stripped by using:
tr -d '\000'
- perlrun - how to execute the Perl interpreter
This is an easy job for sed. Let's start creating a test file as you didn't provide one:
$ echo -e "one,\x00,two,\x00,three" > a
$ echo -e "four,\x00,five,\x00,six" >> a
As you can see it contains ASCII 0:
$ od -c a
0000000 o n e , \0 , t w o , \0 , t h r e
0000020 e \n f o u r , \0 , f i v e , \0 ,
0000040 s i x \n
0000044
Now let's run sed:
$ sed 's/\x00/ /g' a > b
And check the output:
$ cat b
one, ,two, ,three
four, ,five, ,six
$ od -c b
0000000 o n e , , t w o , , t h r e
0000020 e \n f o u r , , f i v e , ,
0000040 s i x \n
0000044
it can be done quite easily with perl
cat -v inputfile.txt
abc^@def^@ghij^@klmnop^@qrstuv^@wxyz
perl -np -e 's/\0/ /g' <inputfile.txt >outputfile.txt
cat -v outputfile.txt
abc def ghij klmnop qrstuv wxyz