Perl (or something else) - ^M problem

Posted 2019-09-07 14:00

Question:

I'm trying to add " at the beginning and ", at the end of each non-empty line of a text file in Perl.

perl -pi -e 's/^(.+)$/\"$1\",/g' something.txt

It adds " at the beginning of each non-empty line, but I have a problem with ",.

Example input:

bla
bla bla
blah

This is the output I'm getting:

"bla
",
"bla bla
",
"blah
",

And this is the output I actually want:

"bla",
"bla bla",
"blah",

How do I fix this?

Edit: I have now opened my output file in vim (I had opened it in kwrite before, so it wasn't visible) and noticed that vim shows ^M before each ",. I don't know what in the code adds this.

Answer 1:

Looks like a line-ending problem. Did you edit the file on Windows? Try dos2unix.

If you don't want to use dos2unix, you can match the \r explicitly:

perl -pi -e 's/^(.+)\r$/\"$1\",/g' something.txt

The underlying problem is that when the file contains carriage returns, the .+ in your original pattern matches them as well, so you get:

"bla^M",
"bla bla^M",
"blah^M",


Answer 2:

Your data file must have originated on Windows, which uses CRLF as a line delimiter instead of just LF. This means your text file looks like this:

bla[CR][LF]bla bla[CR][LF]blah[CR][LF]

You can verify this by using od -c something.txt.

$ od -c something.txt
0000000    b   l   a  \r  \n   b   l   a       b   l   a  \r  \n   b   l
0000020    a   h  \r  \n                                                
0000024

Under Unix or Linux, it will appear like this:

bla\r
bla bla\r
blah\r

When perl makes its substitution, it results in this:

"bla\r",
"bla bla\r",
"blah\r",

And when you cat the result, you get what you see:

"bla
",
"bla bla
",
"blah
",

The easy thing to do is to use dos2unix to convert the line endings to Unix format; your scripts will then behave as expected.



Answer 3:

On systems that use CRLF text files, Perl uses an IO layer to filter the CRLF so that we only see an LF in our scripts. However, if you open a CRLF file on a system that does not normally use CRLF, you can enable the CRLF translation in a number of ways.

You can use binmode. I use the OO interface here because I think it is cleaner, YMMV:

use IO::File;

open( my $fh, '<', 'winfile.txt' ) 
    or die "Oh poo - $!\n";

$fh->binmode(':crlf');

You can also use a tweaked open:

open( my $fh, '<:crlf', 'winfile.txt' ) 
    or die "Oh poo - $!\n";

Or for your one-liner you can set the PERLIO environment variable (see PerlIO):

PERLIO=crlf perl -pi -e 's/^(.+)$/\"$1\",/g' something.txt

Of course, this approach will preserve the CRLF line endings in the processed file--which may or may not be what you want.



Answer 4:

sed 's/.\{1,\}/"&",/'

This was asked before; see "python or bash - adding " at beginning of line and ", at end of line".



Answer 5:

Since you only want to add text at the beginning and end, you don't need a regex substitution for that simple task.

perl -ne 'chomp;print "\"".$_."\",\n"' file


Tags: linux perl