Historical reason behind different line ending at

Posted 2019-01-09 02:35

Why did DOS/Windows and Mac decide to use \r\n and \r for line endings instead of \n? Was it just a result of trying to be "different" from Unix?

And now that Mac OS X is Unix (-like), did Apple switch to \n from \r?

4 answers
贪生不怕死
Reply #2 · 2019-01-09 03:10

Really adding to @Mark Harrison...

The people who tell you that Unix is "just outputting the text the programmer specified" whereas DOS is broken are plain wrong. There are also claims that it's stupid for DOS to flag EOF when it sees an EOF character, raising the question of what exactly that EOF character is for.

There is no one true convention for text file line endings - only platform-specific conventions. After all, even CR-LF, CR and LF aren't the only line end conventions to ever be used, and ASCII was never even the one and only character set. The problem is the C standard library and runtime, which didn't abstract away this platform-dependent detail. Other third generation languages (such as Pascal and even Basic) managed it, at least to some degree. Because of this, when C compilers were written for other platforms, runtime library hacks were needed to achieve compatibility with existing source code and books.

In fact, it's Unix and Multics that originally needed string translation for console I/O, since users usually sat at an ASCII terminal that required CR LF line ends. This translation was done in a device driver, though - the goal was to abstract away the device-specifics, assuming that it was better to adopt one convention and stick to it for stored text files.

The C text I/O hack is similar in principle to what Cygwin does now, hacking Linux runtimes to work as well as can be expected on Windows. There's a real history of hacking things about to turn them into Unix-alikes - but then there's also Wine, turning Linux into Windows. Oddly enough, you can read some misplaced line-end criticism of Windows in the Cygwin FAQ (Internet Archive link added 2013 - the page no longer exists). Maybe it's just their sense of humour, since they are basically doing what they are criticising, but on a much grander scale ;-)

The C++ standard library (whatever platform it's implemented on) avoids this issue using iostreams, which abstract away line ends. For output, that suits me fine. For input, I need more control, so I either interpret character-by-character or else use a scanner generator.

[EDIT: It turns out that the claim above (that iostreams abstract away line ends) isn't true, and never was. std::endl literally translates to a \n and a flush. The \n is exactly the same \n you get in C - it tends to get called "new line", but it's actually an ASCII line feed character, which then gets translated by the runtime if necessary. Funny how false assumptions can get so ingrained you never question them - basically, C++ had no choice but to do what C did (other than adding more layers on top) for compatibility reasons, and that should always have been obvious.]
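
To make the runtime translation concrete, here is a minimal sketch in Python rather than C (an assumption made for brevity; `encode_text` is a hypothetical helper, not part of any C or C++ library). Python's text layer does a job similar to a C runtime's text mode: the program writes \n, and the `newline` setting decides which bytes actually come out.

```python
import io


def encode_text(text: str, newline: str) -> bytes:
    """Write text through a text-mode wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    # newline="\r\n" mimics a DOS-style text mode; newline="\n" mimics Unix,
    # where the line feed is written through unchanged.
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()


print(encode_text("one\ntwo\n", "\r\n"))  # b'one\r\ntwo\r\n'
print(encode_text("one\ntwo\n", "\n"))    # b'one\ntwo\n'
```

The program text is identical in both calls; only the translation layer underneath differs, which is roughly the situation C and C++ programs are in.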

The biggest slice of blame from my POV is with C, but C isn't the only project to fail to anticipate its move to other platforms. Blaming Bill Gates is just nuts - all he did was buy and polish a variant of the then-popular CP/M. Really, it's just history - the same reason why we don't know what character codes 128 to 255 refer to in most text files. Given the ease of coping with all three line end conventions, it's odd that some developers still insist on that "my platform's convention is the one true way, and I shall force it on you like it or not" attitude.
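
As a rough illustration of how little it takes to cope with all three conventions, here is a minimal Python sketch. The name `split_lines` is hypothetical, and it assumes the whole text fits in memory and that only CR-LF, CR and LF count as line ends.

```python
def split_lines(text: str) -> list[str]:
    """Split text into lines, accepting CR-LF, lone CR and lone LF terminators."""
    # Replace CR-LF first so it is not later counted as two separate line ends.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    # A terminator after the last line leaves an empty trailing item; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


print(split_lines("dos\r\nold mac\runix\n"))  # ['dos', 'old mac', 'unix']
```

The order of the two replacements is the only subtle part: handling lone CR before CR-LF would turn every DOS line end into two.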

Also - will the Unicode line separator codepoint U+2028 replace all these conventions in future text files? ;-)

够拽才男人
Reply #3 · 2019-01-09 03:10

There's a rather lengthy article about line endings on Wikipedia. The "History" section answers at least part of your question: http://en.wikipedia.org/wiki/Newline#History

淡お忘
Reply #4 · 2019-01-09 03:11

It's interesting to note that CRLF is pretty much the internet standard. That is, pretty much every standard internet protocol that is line oriented uses CRLF: SMTP, POP, IMAP, NNTP, etc. The body of an email consists of lines terminated by CRLF.
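
In practice this means a client has to put CRLF on the wire whatever its own platform uses. A minimal Python sketch, assuming ASCII-only lines (`to_wire` is a hypothetical helper and the commands are just sample input):

```python
CRLF = "\r\n"


def to_wire(lines: list[str]) -> bytes:
    """Terminate every line with CRLF, as line-oriented protocols expect."""
    return "".join(line + CRLF for line in lines).encode("ascii")


print(to_wire(["HELO example.org", "QUIT"]))  # b'HELO example.org\r\nQUIT\r\n'
```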

时光不老,我们不散
Reply #5 · 2019-01-09 03:22

DOS inherited CR-LF line endings (what you're calling \r\n, just making the ASCII characters explicit) from CP/M. CP/M inherited it from the various DEC operating systems which influenced CP/M designer Gary Kildall.

CR-LF was used so that the teletype machines would return the print head to the left margin (CR = carriage return), and then move to the next line (LF = line feed).

The Unix guys handled that in the device driver, and when necessary translated LF to CR-LF on output to devices that needed it.
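
That driver-level translation is still there today and can be observed through a pseudo-terminal. The sketch below is Python, works only on Unix-like systems, and assumes a pty is available; `write_through_tty` is a hypothetical helper. The ONLCR output flag is what tells the terminal driver to turn LF into CR-LF.

```python
import os
import pty
import termios


def write_through_tty(data: bytes, onlcr: bool) -> bytes:
    """Write data to a pseudo-terminal and return what the 'device' side receives."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        # attrs[1] holds the output flags; OPOST enables output processing.
        attrs[1] |= termios.OPOST
        if onlcr:
            attrs[1] |= termios.ONLCR
        else:
            attrs[1] &= ~termios.ONLCR
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        os.write(slave, data)
        return os.read(master, 1024)
    finally:
        os.close(master)
        os.close(slave)


print(write_through_tty(b"hi\n", onlcr=True))
print(write_through_tty(b"hi\n", onlcr=False))
```

With ONLCR set, the program's single LF reaches the device as CR-LF; with it cleared, the LF passes through untouched. The stored file never sees any of this.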

And as you guessed, Mac OS X now uses LF.
