I have been given the task of generating all the characters in the UTF-8 character set to test how a system handles each of them. I do not have much experience with character encoding. The approach I was going to try was to increment a counter and then translate that base-ten number into its equivalent UTF-8 character, but so far I have not been able to find an effective way to do this in C# 3.5.
Any suggestions would be greatly appreciated.
The above code should iterate over the currently assigned Unicode characters. You'll probably want to parse the UnicodeData file locally and fix any C# blunders I've made.
The set of currently assigned Unicode characters is smaller than the set that could be defined. Of course, whether you see a character when you print one of them out depends on a great many other factors, like fonts and the other applications it'll pass through before it is emitted to your eyeball.
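If parsing the UnicodeData file is more than you need, here is a rough sketch that asks the runtime which code points it considers assigned. The class and method names are made up for illustration, and the result reflects whatever Unicode version your .NET runtime ships with, not the latest UnicodeData file. Private-use code points are reported as assigned by this check.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not from the original answer.
public static class AssignedCodePoints
{
    // Yields every code point that the runtime's Unicode tables treat as assigned.
    public static IEnumerable<int> Enumerate()
    {
        for (int cp = 0; cp <= 0x10FFFF; cp++)
        {
            // Surrogates are halves of UTF-16 pairs, not characters in their own right,
            // and char.ConvertFromUtf32 throws for them.
            if (cp >= 0xD800 && cp <= 0xDFFF) continue;

            // One or two UTF-16 code units, depending on the code point.
            string s = char.ConvertFromUtf32(cp);

            if (CharUnicodeInfo.GetUnicodeCategory(s, 0) != UnicodeCategory.OtherNotAssigned)
                yield return cp;
        }
    }

    public static void Main()
    {
        int count = 0;
        foreach (int cp in Enumerate()) count++;
        Console.WriteLine("Assigned code points known to this runtime: " + count);
    }
}
```

The count will differ between runtime versions, since each one carries its own snapshot of the Unicode tables.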
This code will produce the output in a file. All characters, printable or not, will be in there.
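A minimal sketch of code along those lines, assuming you want every Unicode code point (assigned or not) written out as UTF-8. The class name, method name and default file name are placeholders I picked for the example.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not from the original answer.
public static class CodePointDump
{
    // Writes every code point except the surrogate range to 'path' as UTF-8.
    public static void WriteAll(string path)
    {
        // UTF8Encoding(false) leaves out the byte order mark.
        using (StreamWriter writer = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            for (int cp = 0; cp <= 0x10FFFF; cp++)
            {
                // Surrogates cannot be encoded on their own, so skip them.
                if (cp >= 0xD800 && cp <= 0xDFFF) continue;

                writer.Write(char.ConvertFromUtf32(cp));
            }
        }
    }

    public static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : "all-code-points.txt";
        WriteAll(path);
        Console.WriteLine("Wrote " + new FileInfo(path).Length + " bytes to " + path);
    }
}
```

The file contains control characters and noncharacters too, so open it with a hex viewer rather than a text editor if you want to inspect it.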
This will give you all the characters in a charset - just make sure you specify a charset when specifying the Encoding:
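For a single-byte charset, a sketch of the idea might look like the following. It assumes ISO-8859-1 as the example charset and uses names I made up (`CharsetDump`, `DecodeAllBytes`). It does not carry over to UTF-8, where most byte values are not valid characters on their own.

```csharp
using System;
using System.Text;

// Hypothetical helper, not from the original answer.
public static class CharsetDump
{
    // Decodes each of the 256 possible byte values with the given encoding.
    // Only meaningful for single-byte encodings.
    public static string DecodeAllBytes(Encoding encoding)
    {
        byte[] bytes = new byte[256];
        for (int i = 0; i < bytes.Length; i++) bytes[i] = (byte)i;
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // The charset name is an assumption for this example; substitute your own.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string chars = DecodeAllBytes(latin1);
        Console.WriteLine("Decoded " + chars.Length + " characters");
    }
}
```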