How to find out line-endings in a text file?

Posted 2019-01-05 20:09

狗以群分

Question:

I'm trying to use something in bash to show me the line endings in a file printed rather than interpreted. The file is a dump from SSIS/SQL Server being read in by a Linux machine for processing.

Are there any switches within vi, less, more, etc?
In addition to seeing the line-endings, I need to know what type of line end it is (CRLF or LF). How do I find that out?

Answer 1:

You can use the file utility to give you an indication of the type of line endings.

Unix:

$ file testfile1.txt
testfile1.txt: ASCII text

"DOS":

$ file testfile2.txt
testfile2.txt: ASCII text, with CRLF line terminators

To convert from "DOS" to Unix:

$ dos2unix testfile2.txt

To convert from Unix to "DOS":

$ unix2dos testfile1.txt

Converting an already converted file has no effect so it's safe to run blindly (i.e. without testing the format first) although the usual disclaimers apply, as always.
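For scripting, a one-word answer is sometimes easier to work with than the prose output of file. The following is a minimal sketch that uses only grep; eol_type is a hypothetical helper name, not part of file or dos2unix, and it assumes the input is a text file.

```shell
#!/bin/sh
# eol_type FILE: print CRLF, LF, or mixed.
# Hypothetical helper for illustration; not a standard tool.
eol_type() {
    cr=$(printf '\r')                        # a literal carriage return
    total=$(grep -c '' "$1" || true)         # total number of lines
    crlf=$(grep -c "${cr}\$" "$1" || true)   # lines whose last character is CR
    if [ "$crlf" -eq 0 ]; then
        echo LF
    elif [ "$crlf" -eq "$total" ]; then
        echo CRLF
    else
        echo mixed
    fi
}

# Demo on throwaway files.
tmp=$(mktemp -d)
printf 'a\nb\n'     > "$tmp/unix.txt"
printf 'a\r\nb\r\n' > "$tmp/dos.txt"
printf 'a\r\nb\n'   > "$tmp/mixed.txt"
for f in "$tmp"/*.txt; do
    printf '%s: %s\n' "${f##*/}" "$(eol_type "$f")"
done
rm -rf "$tmp"
```

An empty file, or one with no CR at any line end, is reported as LF, which mirrors the way file treats LF as the unremarkable default.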

Answer 2:

In vi...

:set list to see line-endings.

:set nolist to go back to normal.

While I don't think you can see \n or \r\n in vi, you can see which type of file it is (UNIX, DOS, etc.) to infer which line endings it has...

:set ff

Alternatively, from bash you can use od -t c <filename> or just od -c <filename> to display the returns.
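To make the od suggestion concrete, here is a small sketch. In od -c output a carriage return appears as \r and a line feed as \n, so a CRLF file shows \r \n pairs. The helper name has_cr is made up for this example.

```shell
#!/bin/sh
# Dump a two-line sample; the first line ends in CRLF, the second in LF.
printf 'one\r\ntwo\n' | od -c

# has_cr FILE: succeed if FILE contains at least one carriage return.
# Hypothetical helper; it just searches the od -c dump for the \r escape.
has_cr() {
    od -An -c "$1" | grep -q '\\r'
}

tmp=$(mktemp)
printf 'one\r\ntwo\n' > "$tmp"
if has_cr "$tmp"; then
    echo "carriage returns found"
fi
rm -f "$tmp"
```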

Answer 3:

In the bash shell, try cat -v <filename>. This should display carriage returns for Windows files.

(This worked for me in rxvt via Cygwin on Windows XP).

Editor's note: cat -v visualizes \r (CR) characters as ^M. Thus, line-ending \r\n sequences will display as ^M at the end of each output line. cat -e will additionally visualize \n, namely as $. (cat -et will additionally visualize tab characters as ^I.)

Answer 4:

Ubuntu 14.04:

A simple cat -e <filename> works just fine.

This displays Unix line endings (\n or LF) as $ and Windows line endings (\r\n or CRLF) as ^M$.
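A quick way to see the ^M$ and $ markers without a real Windows file is to generate a sample with printf. This sketch also counts the CRLF lines; count_crlf_lines is a hypothetical helper, and it would miscount a file whose text literally ends a line with the two characters ^ and M.

```shell
#!/bin/sh
# First line ends in CRLF, second in LF.
printf 'one\r\ntwo\n' | cat -e
# one^M$
# two$

# count_crlf_lines FILE: number of lines that cat -e renders with ^M$ at the end.
# Hypothetical helper for illustration.
count_crlf_lines() {
    cat -e "$1" | grep -c '\^M\$$' || true
}

tmp=$(mktemp)
printf 'one\r\ntwo\n' > "$tmp"
count_crlf_lines "$tmp"
rm -f "$tmp"
```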

Answer 5:

To show CR as ^M in less use less -u or type -u once less is open.

man less says:

-u or --underline-special

      Causes backspaces and carriage returns to be treated as printable
      characters; that is, they are sent to the terminal when they
      appear in the input.

Answer 6:

You can use xxd to show a hex dump of the file, and hunt through it for "0d0a" (CRLF) or "0a" (LF) bytes.

You can use cat -v <filename> as @warriorpostman suggests.
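One caveat with eyeballing the xxd dump: the default layout groups bytes in pairs, so a CRLF can straddle two groups and appear as "0d 0a" rather than "0d0a". A rough sketch that avoids this prints one byte per line (xxd -p -c 1) and lets awk look for 0d directly followed by 0a. The function name count_crlf_hex is invented for this example, and xxd is assumed to be installed (it usually ships with vim).

```shell
#!/bin/sh
# count_crlf_hex FILE: count CR LF byte pairs using a one-byte-per-line hex dump.
# Hypothetical helper for illustration.
count_crlf_hex() {
    xxd -p -c 1 "$1" |
        awk 'prev == "0d" && $0 == "0a" { n++ } { prev = $0 } END { print n + 0 }'
}

tmp=$(mktemp)
printf 'a\r\nb\r\nc\n' > "$tmp"
xxd "$tmp"               # full dump, for reading by eye
count_crlf_hex "$tmp"    # count of CRLF pairs
rm -f "$tmp"
```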

Answer 7:

Try "file -k"

I sometimes have to check this for PEM certificate files.

The trouble with regular file is this: Sometimes it's trying to be too smart/too specific.

Let's try a little quiz: I've got some files. And one of these files has different line endings. Which one?

(By the way: this is what one of my typical "certificate work" directories looks like.)

Let's try regular file:

$ file -- *
0.example.end.cer:         PEM certificate
0.example.end.key:         PEM RSA private key
1.example.int.cer:         PEM certificate
2.example.root.cer:        PEM certificate
example.opensslconfig.ini: ASCII text
example.req:               PEM certificate request

Huh. It's not telling me the line endings. And I already knew that those were cert files. I didn't need "file" to tell me that.

What else can you try?

You might try dos2unix with the --info switch like this:

$ dos2unix --info -- *
  37       0       0  no_bom    text    0.example.end.cer
   0      27       0  no_bom    text    0.example.end.key
   0      28       0  no_bom    text    1.example.int.cer
   0      25       0  no_bom    text    2.example.root.cer
   0      35       0  no_bom    text    example.opensslconfig.ini
   0      19       0  no_bom    text    example.req

So that tells you that: yup, "0.example.end.cer" must be the odd man out. But what kind of line endings are there? Do you know the dos2unix output format by heart? (I don't.)

But fortunately there's the --keep-going (or -k for short) option in file:

$ file --keep-going -- *
0.example.end.cer:         PEM certificate\012- , ASCII text, with CRLF line terminators\012- data
0.example.end.key:         PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer:         PEM certificate\012- , ASCII text\012- data
2.example.root.cer:        PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req:               PEM certificate request\012- , ASCII text\012- data

Excellent! Now we know that our odd file has DOS (CRLF) line endings. (And the other files have Unix (LF) line endings. This is not explicit in this output. It's implicit. It's just the way file expects a "regular" text file to be.)

(If you wanna share my mnemonic: "L" is for "Linux" and for "LF".)

Now let's convert the culprit and try again:

$ dos2unix -- 0.example.end.cer

$ file --keep-going -- *
0.example.end.cer:         PEM certificate\012- , ASCII text\012- data
0.example.end.key:         PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer:         PEM certificate\012- , ASCII text\012- data
2.example.root.cer:        PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req:               PEM certificate request\012- , ASCII text\012- data

Good. Now all certs have Unix line endings.

Answer 8:

You may use the command todos filename to convert to DOS endings, and fromdos filename to convert to UNIX line endings. To install the package on Ubuntu, type sudo apt-get install tofrodos.
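If no conversion package can be installed, plain sed can stand in for the basic cases. This is only a sketch under that assumption: crlf_to_lf and lf_to_crlf are hypothetical names, both write to standard output rather than editing in place, and they ignore the extras a dedicated tool handles (BOMs, binary detection, preserving timestamps).

```shell
#!/bin/sh
# crlf_to_lf FILE: strip one trailing CR from every line.
crlf_to_lf() {
    cr=$(printf '\r')
    sed "s/${cr}\$//" "$1"
}

# lf_to_crlf FILE: end every line with CR LF.
# Any existing trailing CR is removed first, so running it twice is harmless.
lf_to_crlf() {
    cr=$(printf '\r')
    sed -e "s/${cr}\$//" -e "s/\$/${cr}/" "$1"
}

tmp=$(mktemp -d)
printf 'a\r\nb\r\n' > "$tmp/dos.txt"
crlf_to_lf "$tmp/dos.txt"  > "$tmp/unix.txt"
lf_to_crlf "$tmp/unix.txt" > "$tmp/roundtrip.txt"
cmp "$tmp/dos.txt" "$tmp/roundtrip.txt" && echo "round trip preserved the file"
rm -rf "$tmp"
```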

Answer 9:

You can use vim -b filename to edit a file in binary mode. In this mode each carriage return is shown as ^M, while each LF simply appears as a line break, so a ^M at the end of every line indicates Windows CRLF line endings. (By LF I mean \n and by CR I mean \r.) Note that with the -b option the file is always edited in UNIX mode by default, as indicated by [unix] in the status line, so any lines you add will end with LF, not CRLF. If you open a file with CRLF line endings in normal vim without -b, you should see [dos] in the status line, and inserted lines will end with CRLF. The vim documentation for the fileformats setting explains the complexities.

Also, I don't have enough points to comment on the Notepad++ answer, but if you use Notepad++ on Windows, use the View / Show Symbol / Show End of Line menu to display CR and LF. In this case LF is shown whereas for vim the LF is indicated by a new line.