What's the main difference between the two?
After all, both of them are for writing Strings.
public void writeUTF(String str)
throws IOException
Primitive data write of this String in modified UTF-8 format.
vs
public void writeBytes(String str)
throws IOException
Writes a String as a sequence of bytes.
When should I use one rather than the other?
It's in the documentation... from DataOutput.writeBytes(String):
Writes a string to the output stream. For every character in the string s, taken in order, one byte is written to the output stream. If s is null, a NullPointerException is thrown.
If s.length is zero, then no bytes are written. Otherwise, the character s[0] is written first, then s[1], and so on; the last character written is s[s.length-1]. For each character, one byte is written, the low-order byte, in exactly the manner of the writeByte method. The high-order eight bits of each character in the string are ignored.
In other words, "Sod Unicode, we don't care about any characters not in ISO-8859-1. Oh, and we assume you don't care about the length of the string either."
Note that writeBytes doesn't even try to detect data corruption - if you write out a character which isn't in ISO-8859-1, it will just drop the high byte silently.
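To make the silent truncation concrete, here is a minimal sketch. The class and helper names (WriteBytesTruncation, viaWriteBytes) are made up for illustration; only DataOutputStream.writeBytes comes from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteBytesTruncation {
    // Hypothetical helper: serializes s with writeBytes and returns the raw bytes.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // U+20AC (euro sign): high byte 0x20, low byte 0xAC.
        byte[] bytes = viaWriteBytes("\u20AC");
        System.out.println(bytes.length);     // 1
        System.out.println(bytes[0] & 0xFF);  // 172, i.e. 0xAC: the 0x20 is gone
    }
}
```

No exception is thrown, and a reader of that stream has no way to tell the byte was ever a euro sign.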
Just say no - writeUTF is your friend... assuming the encoded form of your string is less than 64K bytes in length.
Of course, if you have a protocol you're trying to implement which itself requires a single-byte encoding (ISO-8859-1 or ASCII) and doesn't use a length, then writeBytes might be appropriate - but I'd personally probably perform the text-to-bytes conversion myself and then use write(byte[]) instead... it's clearer.
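One way the "convert it yourself" approach might look is sketched below. The class and method names are invented for this example, and rejecting unencodable text with an IllegalArgumentException is my own assumption about what a caller would want; String.getBytes(Charset) on its own would substitute '?' instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ExplicitEncoding {
    // Hypothetical helper: encodes s as ISO-8859-1 and writes the bytes with write(byte[]).
    static byte[] writeLatin1(String s) {
        // Assumption: fail loudly rather than lose data on characters outside ISO-8859-1.
        if (!StandardCharsets.ISO_8859_1.newEncoder().canEncode(s)) {
            throw new IllegalArgumentException("Not representable in ISO-8859-1: " + s);
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            byte[] encoded = s.getBytes(StandardCharsets.ISO_8859_1);
            out.write(encoded); // the charset is explicit, so nothing is dropped by surprise
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeLatin1("caf\u00E9");
        System.out.println(bytes.length);           // 4
        System.out.println(bytes[3] & 0xFF);        // 233, i.e. 0xE9
    }
}
```

The bytes on the wire are the same as writeBytes would produce for this input, but the encoding is now stated in the code and bad input is caught before anything is written.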
If there's a possibility that your String is holding something that uses wide characters (basically anything beyond standard ASCII), use writeUTF. If your output is going to something that requires a one-byte-per-character encoding, such as header labels in many network protocols, use writeBytes.
writeUTF stores the data in a Unicode encoding (modified UTF-8), so when your string contains anything other than ASCII characters, use writeUTF; otherwise writeBytes is OK.
In addition, writeUTF has a maximum length of 65535 bytes (the length of the encoded byte array depends on the characters in the String).
If the UTF representation of your String is larger than that, you must do the conversion yourself and use write(byte[]), as Jon said.
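A rough sketch of that workaround follows. The framing (a 4-byte length from writeInt followed by standard UTF-8 bytes) and the names writeLongString/readLongString are assumptions made for this example, not an existing JDK format; note it is not compatible with readUTF, which expects a 2-byte length and modified UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LongStringIO {
    // Hypothetical format: 4-byte length prefix, then standard UTF-8 bytes.
    static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(utf8.length);
        out.write(utf8);
    }

    // Reads back what writeLongString produced (no sanity check on the length here).
    static String readLongString(DataInputStream in) throws IOException {
        byte[] utf8 = new byte[in.readInt()];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Returns true if writeUTF refuses the string because its encoding exceeds 65535 bytes.
    static boolean writeUtfRejects(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes s to an in-memory buffer and reads it back.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                writeLongString(out, s);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return readLongString(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        char[] chars = new char[70_000];
        Arrays.fill(chars, 'a');
        String big = new String(chars);
        System.out.println(writeUtfRejects(big));        // true
        System.out.println(roundTrip(big).equals(big));  // true
    }
}
```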
You should use readUTF() if and only if:
- You are using writeUTF() at the other end, and
- You can live with the 64k restriction.
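As an illustration of that pairing, here is a small round trip through in-memory streams. The class and helper names are invented for the example; writeUTF and readUTF are the real JDK methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteUtfRoundTrip {
    // Hypothetical helper: returns the bytes writeUTF produces for s.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Hypothetical helper: reads one string back with readUTF.
    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = encode("h\u00E9");
        // 2-byte length prefix + 1 byte for 'h' + 2 bytes for U+00E9
        System.out.println(data.length);                      // 5
        System.out.println(decode(data).equals("h\u00E9"));   // true
    }
}
```

The 2-byte length prefix is where the 64k restriction comes from, and it is also why readUTF cannot make sense of data written by writeBytes or a plain write(byte[]).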