How do I convert a string to a byte[] in .NET (C#) without manually specifying a specific encoding?
I'm going to encrypt the string. I can encrypt it without converting, but I'd still like to know why encoding comes into play here.
Also, why should encoding be taken into consideration? Can't I simply get what bytes the string has been stored in? Why is there a dependency on character encodings?
You need to take the encoding into account, because a single character can be represented by one or more bytes (up to 4 in today's UTF-8, and up to 6 in the original UTF-8 design), and different encodings map characters to bytes differently.
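To make the point concrete, here is a minimal sketch that counts how many bytes the same text occupies under different encodings. The class and method names (EncodingSizes, ByteCount) are hypothetical and chosen only for illustration; Encoding.GetByteCount is the standard .NET call.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Returns the number of bytes that `text` occupies in the given encoding.
    // Hypothetical helper; it only forwards to Encoding.GetByteCount.
    public static int ByteCount(string text, Encoding encoding)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (encoding == null) throw new ArgumentNullException(nameof(encoding));
        return encoding.GetByteCount(text);
    }
}
```

For example, ByteCount("\u20AC", Encoding.UTF8) (the euro sign) gives 3, while ByteCount("\u20AC", Encoding.Unicode) gives 2, and a plain "A" takes 1 byte in UTF-8 but 2 in UTF-16.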
Joel has a posting on this:
Simply use this:
Try this, a lot less code:
If you really want a copy of the underlying bytes of a string, you can use a function like the one that follows. However, you shouldn't; please read on to find out why.
This function will get you a copy of the bytes underlying your string, pretty quickly. You'll get those bytes in whatever way they are encoded on your system. This encoding is almost certainly UTF-16LE, but that is an implementation detail you shouldn't have to care about.
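A minimal sketch of such a function, assuming a safe Buffer.BlockCopy over the string's char array is acceptable (this is a stand-in technique, not necessarily the one the answer originally showed; the names StringBytes and GetRawBytes are hypothetical):

```csharp
using System;

public static class StringBytes
{
    // Hypothetical helper: copies the in-memory UTF-16 code units of `text`
    // into a new byte array, in the machine's native byte order.
    public static byte[] GetRawBytes(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        char[] chars = text.ToCharArray();
        // Each char is one 16-bit code unit, so the result is twice as long.
        byte[] bytes = new byte[chars.Length * sizeof(char)];
        // BlockCopy works on primitive arrays and counts in bytes, not elements.
        Buffer.BlockCopy(chars, 0, bytes, 0, bytes.Length);
        return bytes;
    }
}
```

No Encoding object is involved, which is exactly why the byte order of the result depends on the platform.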
It would be safer, simpler and more reliable to just call,
In all likelihood this will give the same result, is easier to type, and the bytes will always round-trip with a call to
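As a sketch, assuming the calls meant here are Encoding.Unicode.GetBytes and Encoding.Unicode.GetString (the wrapper names RoundTrip, ToBytes and FromBytes are hypothetical):

```csharp
using System.Text;

public static class RoundTrip
{
    // Encodes the string as UTF-16LE, regardless of the platform's byte order.
    public static byte[] ToBytes(string text) => Encoding.Unicode.GetBytes(text);

    // Decodes UTF-16LE bytes back into a string.
    public static string FromBytes(byte[] bytes) => Encoding.Unicode.GetString(bytes);
}
```

For well-formed strings, FromBytes(ToBytes(s)) returns a string equal to s. A string containing an unpaired surrogate is the exception: with the default settings, the encoder substitutes a replacement character for it.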
Two ways:
And,
I tend to use the bottom one more often than the top; I haven't benchmarked them for speed.
It depends on the encoding of your string (ASCII, UTF-8, ...).
For example:
A small sample why encoding matters:
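One way to see this, as a sketch that assumes the Greek capital letter pi (U+03A0) as the test character and uses hypothetical helper names (AsciiLoss, ThroughAscii, ThroughUtf8):

```csharp
using System.Text;

public static class AsciiLoss
{
    // Encodes `text` as ASCII and decodes it again.
    // Characters outside the ASCII range are replaced with '?'.
    public static string ThroughAscii(string text) =>
        Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text));

    // Encodes `text` as UTF-8 and decodes it again; nothing is lost.
    public static string ThroughUtf8(string text) =>
        Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(text));
}
```

With "\u03A0" as input, the ASCII bytes have length 1 and decode to "?", while the UTF-8 bytes have length 2 and decode back to the original character.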
ASCII simply isn't equipped to deal with special characters.
Internally, the .NET Framework uses UTF-16 to represent strings, so if you simply want to get the exact bytes that .NET uses, use System.Text.Encoding.Unicode.GetBytes(...). See Character Encoding in the .NET Framework (MSDN) for more information.