Perl (or something else) - ^M problem

Posted 2019-09-07 13:47

I'm trying to add " at the beginning and ", at the end of each non-empty line of a text file in Perl.

perl -pi -e 's/^(.+)$/\"$1\",/g' something.txt

It adds " at the beginning of each non-empty line, but I have a problem with the ", at the end.

Example input:

bla
bla bla
blah

This is the output I'm getting:

"bla
",
"bla bla
",
"blah
",

And this is the output I actually want:

"bla",
"bla bla",
"blah",

How do I fix this?

Edit: I have now opened the output file in vim (I had opened it in kwrite before, so it wasn't visible) and noticed that vim shows ^M before each ",. I don't know what in the code adds this.

Tags: linux perl
5 answers
Fickle 薄情
#3 · 2019-09-07 14:24

Since you only want to add text at the beginning and end of each line, you don't need a regex substitution for such a simple task.

perl -ne 'chomp; s/\r$//; print "\"$_\",\n" if length' file
男人必须洒脱
#4 · 2019-09-07 14:28

Looks like a line-ending problem. Did you edit the file on Windows? Try dos2unix.

If you don't want to use dos2unix, you can match the \r explicitly:

perl -pi -e 's/^(.+)\r$/\"$1\",/g' something.txt

The problem with the original command is that if the file contains carriage returns, .+ matches them as well, so you get:

"bla^M",
"bla bla^M",
"blah^M",
ら.Afraid
#5 · 2019-09-07 14:31

On systems that use CRLF text files, Perl uses an I/O layer to filter the CRLF so that we only see an LF in our scripts. If you open a CRLF file on a system that does not normally use CRLF, that translation is not applied by default, but you can enable it in a number of ways.

You can use binmode. I use the OO interface here because I think it is cleaner, YMMV:

use IO::File;

open( my $fh, '<', 'winfile.txt' ) 
    or die "Oh poo - $!\n";

$fh->binmode(':crlf');

You can also use a tweaked open:

open( my $fh, '<:crlf', 'winfile.txt' ) 
    or die "Oh poo - $!\n";

Or for your one-liner you can set the PERLIO environment variable (see PerlIO):

PERLIO=crlf perl -pi -e 's/^(.+)$/\"$1\",/g' something.txt

Of course, this approach will preserve the CRLF line endings in the processed file--which may or may not be what you want.
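
For comparison, here is a rough sketch that puts the :crlf layer on the input handle only, so the CRs are dropped on read and the output is written with plain LF endings. The function name quote_crlf_file and the temporary demo file are assumptions for illustration; they are not part of the original answer.

```shell
# quote_crlf_file is a hypothetical helper name; it prints the quoted lines
# of the file given as its first argument to stdout.
quote_crlf_file() {
  perl -e '
    # Open the input through the :crlf layer so CRLF is read as a plain newline.
    open(my $fh, "<:crlf", $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
    while (my $line = <$fh>) {
      chomp $line;
      # Skip empty lines; quote everything else.
      print qq("$line",\n) if length $line;
    }
  ' "$1"
}

# Demo on a throwaway file with Windows line endings.
demo=$(mktemp)
printf 'bla\r\nbla bla\r\n\r\nblah\r\n' > "$demo"
quote_crlf_file "$demo"
rm -f "$demo"
```

STDOUT keeps its default layers here, which is why the result comes out with Unix line endings, unlike the PERLIO variant above.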

三岁会撩人
#6 · 2019-09-07 14:33

Your data file must have originated on Windows, which uses CRLF as a line delimiter instead of just LF. This means your text file looks like this:

bla[CR][LF]bla bla[CR][LF]blah[CR][LF]

You can verify this by using od -c something.txt.

$ od -c something.txt
0000000    b   l   a  \r  \n   b   l   a       b   l   a  \r  \n   b   l
0000020    a   h  \r  \n                                                
0000024

Under Unix or Linux, it will appear like this:

bla\r
bla bla\r
blah\r

When perl makes its substitution, it results in this:

"bla\r",
"bla bla\r",
"blah\r",

An editor that treats a bare carriage return as a line break, as kwrite apparently does, then displays what you see:

"bla
",
"bla bla
",
"blah
",

The easy thing to do is to use dos2unix to convert the line endings to Unix format; after that, your scripts will behave as expected.
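
If dos2unix is not installed, a rough equivalent for this case is to strip the trailing carriage returns with perl itself and then run the original command. This is only a sketch: the function name strip_cr and the temporary demo file are assumptions for illustration, and it handles CRLF line endings only, not the other conversions dos2unix supports.

```shell
# strip_cr is a hypothetical helper name; it edits the given file in place.
strip_cr() {
  # Remove a CR that sits directly before the end of each line.
  perl -pi -e 's/\r$//' "$1"
}

# Demo on a throwaway file with Windows line endings.
demo=$(mktemp)
printf 'bla\r\nbla bla\r\nblah\r\n' > "$demo"
strip_cr "$demo"

# The substitution from the question now behaves as intended.
perl -pi -e 's/^(.+)$/"$1",/' "$demo"
cat "$demo"
rm -f "$demo"
```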
