Why should text files end with a newline?

2018-12-31 03:49发布

I assume everyone here is familiar with the adage that all text files should end with a newline. I've known of this "rule" for years but I've always wondered — why?

18条回答
皆成旧梦
2楼-- · 2018-12-31 04:35

Each line should be terminated in a newline character, including the last one. Some programs have problems processing the last line of a file if it isn't newline terminated.

GCC warns about it not because it can't process the file, but because it has to as part of the standard.

The C language standard says A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character.

Since this is a "shall" clause, we must emit a diagnostic message for a violation of this rule.

This is in section 2.1.1.2 of the ANSI C 1989 standard. Section 5.1.1.2 of the ISO C 1999 standard (and probably also the ISO C 1990 standard).

Reference: The GCC/GNU mail archive.

查看更多
墨雨无痕
3楼-- · 2018-12-31 04:36

It may be related to the difference between:

  • text file (each line is supposed to end in an end-of-line)
  • binary file (there are no true "lines" to speak of, and the length of the file must be preserved)

If each line does end in an end-of-line, this avoids, for instance, that concatenating two text files would make the last line of the first run into the first line of the second.

Plus, an editor can check at load whether the file ends in an end-of-line, saves it in its local option 'eol', and uses that when writing the file.

A few years back (2005), many editors (ZDE, Eclipse, Scite, ...) did "forget" that final EOL, which was not very appreciated.
Not only that, but they interpreted that final EOL incorrectly, as 'start a new line', and actually start to display another line as if it already existed.
This was very visible with a 'proper' text file with a well-behaved text editor like vim, compared to opening it in one of the above editors. It displayed an extra line below the real last line of the file. You see something like this:

1 first line
2 middle line
3 last line
4
查看更多
春风洒进眼中
4楼-- · 2018-12-31 04:37

In addition to the above practical reasons, it wouldn't surprise me if the originators of Unix (Thompson, Ritchie, et al.) or their Multics predecessors realized that there is a theoretical reason to use line terminators rather than line separators: With line terminators, you can encode all possible files of lines. With line separators, there's no difference between a file of zero lines and a file containing a single empty line; both of them are encoded as a file containing zero characters.

So, the reasons are:

  1. Because that's the way POSIX defines it.
  2. Because some tools expect it or "misbehave" without it. For example, wc -l will not count a final "line" if it doesn't end with a newline.
  3. Because it's simple and convenient. On Unix, cat just works and it works without complication. It just copies the bytes of each file, without any need for interpretation. I don't think there's a DOS equivalent to cat. Using copy a+b c will end up merging the last line of file a with the first line of file b.
  4. Because a file (or stream) of zero lines can be distinguished from a file of one empty line.
查看更多
妖精总统
5楼-- · 2018-12-31 04:38

Basically there are many programs which will not process files correctly if they don't get the final EOL EOF.

GCC warns you about this because it's expected as part of the C standard. (section 5.1.1.2 apparently)

"No newline at end of file" compiler warning

查看更多
梦该遗忘
6楼-- · 2018-12-31 04:38

I've wondered this myself for years. But i came across a good reason today.

Imagine a file with a record on every line (ex: a CSV file). And that the computer was writing records at the end of the file. But it suddenly crashed. Gee was the last line complete? (not a nice situation)

But if we always terminate the last line, then we would know (simply check if last line is terminated). Otherwise we would probably have to discard the last line every time, just to be safe.

查看更多
深知你不懂我心
7楼-- · 2018-12-31 04:39

I personally like new lines at the end of source code files.

It may have its origin with Linux or all UNIX systems for that matter. I remember there compilation errors (gcc if I'm not mistaken) because source code files did not end with an empty new line. Why was it made this way one is left to wonder.

查看更多
登录 后发表回答