Question:
As far as I know, at the end of every file, especially text files, there is a hex code for an EOF or NULL character. When we write a program that reads the contents of a text file, we keep calling the read function until we receive that EOF hex code.
My question: I downloaded some tools to get a hex view of a text file, but I can't see any hex code for EOF (End Of File/NULL) or EOT (End Of Text).
ASCII/hex code tables: (image not preserved)
Output of the hex viewer tools: (image not preserved)
Note: My input file is a text file whose content is "Where is hex code of "EOF"?"
Appreciate your time and consideration.
Answer 1:
There is no such thing as an EOF character. The operating system knows exactly how many bytes a file contains (this is stored alongside other metadata like permissions, creation date, and the name), and hence can tell programs that try to read the eleventh byte of a ten-byte file: you've reached the end of the file, there are no more bytes to read.
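To illustrate that the length comes from the file system rather than from a marker byte, here is a minimal sketch in C. The helper names `write_text` and `file_size` and the file name used with them are made up for this example, and it assumes the usual behaviour that seeking to the end of a binary stream and calling `ftell` yields the byte count.

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to `path` exactly as given.
 * Returns 0 on success, -1 on failure. */
int write_text(const char *path, const char *text)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    int rc = (fputs(text, f) == EOF) ? -1 : 0;
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}

/* Hypothetical helper: asks the C library (and through it the OS)
 * how long the file is. Returns the size in bytes, or -1 on error. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    long size = -1;
    if (fseek(f, 0L, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}
```

Writing the 27-character sentence from the question with `write_text` and then calling `file_size` on the same path should report 27: every byte is a character of the text, and nothing is appended to mark the end.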
In fact, the "EOF" value returned for example by C functions like getchar is explicitly an int value outside the range of a byte, so it cannot possibly be stored in a file!
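A small C sketch of that point, under the assumption that the file is opened in binary mode so the C library does no translation: it writes every possible byte value (including 0x00, 0x03, 0x04, 0x1A and 0xFF) to a file and reads them back with `fgetc`. The function name `count_bytes_before_eof` and the path passed to it are invented for the example.

```c
#include <stdio.h>

/* Writes the 256 possible byte values to `path`, reads the file back
 * with fgetc, and returns how many bytes arrived before EOF was
 * reported. Returns -1 on any I/O error. */
long count_bytes_before_eof(const char *path)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    for (int b = 0; b < 256; b++) {
        if (fputc(b, f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    if (fclose(f) == EOF)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long count = 0;
    int c; /* int, not char: it must hold 0..255 and also EOF */
    while ((c = fgetc(f)) != EOF)
        count++;

    int failed = ferror(f); /* EOF is also returned on read errors */
    fclose(f);
    return failed ? -1 : count;
}
```

The loop should see all 256 bytes before `fgetc` reports `EOF`, because `EOF` is a negative `int` while every byte from the file comes back as a value from 0 to 255. Storing the result in a `char` instead of an `int` would lose that distinction.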
Sometimes, certain file formats insist on adding NUL terminators (probably because that's how strings are usually stored in C), though usually these delimit multiple records in a single file, not the file as a whole. And such decoration usually disqualifies a file from being considered a "text file".
ASCII codes like ETX and NUL date back to the days of teletypewriters and friends. NUL is used in C for in-memory strings, but this has no bearing on file systems.
Answer 2:
There was - a long long time ago - an End Of File marker but it hasn't been used in files for many years.
You can demonstrate a distant echo of it on Windows using:
C:\>copy con junk.txt
Hello
Hello again
- Press <Ctrl> and <z>
C:\>dump junk.txt
junk.txt:
00000000 4865 6c6c 6f0d 0a48 656c 6c6f 2061 6761 Hello..Hello aga
00000010 696e 0d0a in..
C:\>
Note the use of Ctrl-Z as an end-of-file marker.
However, notice also that the Ctrl-Z does not appear in the file any more. It used to appear as a 0x1a byte, but only on some operating systems and even then not consistently.
Use of ETX (0x03) stopped even before those dim and distant times.
Answer 3:
There is no such thing as EOF. EOF is just a value returned by file reading functions to tell you the file pointer reached the end of the file.
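As a rough sketch of EOF being a condition that the reading function reports rather than data it delivers, the C helper below (the name `read_up_to` is made up for illustration) asks for more bytes than a stream holds and then checks `feof`.

```c
#include <stdio.h>

/* Tries to read `want` bytes from `f` into `buf`.
 * Returns the number of bytes actually read and sets *hit_eof to 1
 * if the stream's end-of-file indicator is set afterwards, else 0. */
size_t read_up_to(FILE *f, unsigned char *buf, size_t want, int *hit_eof)
{
    size_t got = fread(buf, 1, want, f);
    *hit_eof = (feof(f) != 0);
    return got;
}
```

Asking for 11 bytes from a stream that holds 10 should return 10 with `*hit_eof` set: the short count and the indicator are how the end of the file is signalled, and no extra byte is placed in `buf`.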
Answer 4:
There once were even different EOF characters for different operating systems. I have not seen one in a long time. (Typically files were stored in blocks of 128 bytes.) They were a pain to code around, much like BOMs are nowadays.
Instead there is still an int read() that normally delivers a byte value, but for EOF delivers -1.
The NUL character is a string terminator in C. In Java you can have a NUL character in the middle of a string. To cooperate with C, the modified UTF-8 bytes that Java generates (for example in JNI) use a multi-byte encoding both for Unicode characters > 127 and for NUL.
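For the curious, here is a hedged sketch in C of how that encoding treats a single 16-bit Java char, based on the modified UTF-8 rules as I understand them. The function name `encode_modified_utf8` is invented for this example, and it deliberately handles one code unit at a time, so surrogate pairs simply come out as two three-byte sequences.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes one 16-bit code unit into `out` using modified UTF-8 rules.
 * Returns the number of bytes written (1, 2 or 3). */
size_t encode_modified_utf8(uint16_t unit, unsigned char out[3])
{
    if (unit >= 0x0001 && unit <= 0x007F) {
        /* Plain ASCII except NUL: a single byte. */
        out[0] = (unsigned char)unit;
        return 1;
    }
    if (unit <= 0x07FF) {
        /* Two bytes. This branch also catches 0x0000, which becomes
         * 0xC0 0x80, so the output never contains a zero byte. */
        out[0] = (unsigned char)(0xC0 | (unit >> 6));
        out[1] = (unsigned char)(0x80 | (unit & 0x3F));
        return 2;
    }
    /* Everything else in the 16-bit range: three bytes. */
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
    return 3;
}
```

With this scheme NUL encodes to the two bytes 0xC0 0x80, so a C function that stops at the first zero byte still sees the whole string.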
(Some of this is probably known already.)
Answer 5:
The EOT byte (0x04) is used to this day by Unix tty terminals to indicate end of input. You type it with Ctrl + D (i.e. ^D) to end input to shells or any other program reading from stdin.
However, as others have pointed out, this is distinct from EOF, which is a condition rather than a piece of data per se.
Answer 6:
You need the end-of-file character in certain instances, for example when sending a file to a printer from a Unix computer. Most Windows/DOS-enabled printers expect the end-of-file marker before they print the file stored in their memory. If no end-of-file marker is sent, the printer just sits until it times out (normally 2 minutes) and then prints the file. If you use lpr to print from Unix, you should make sure to include the end-of-file marker. Windows/DOS attach it automatically, and the printers are designed to wait for it.