I have an XSD file that is encoded in UTF-8. No text editor I open it in shows any character at the beginning of the file, but when I inspect it in Visual Studio's debugger, I clearly see an empty box in front of the file's contents.
I also get the error:
Data at the root level is invalid. Line 1, position 1.
Anyone know what this is?
Update: Edited post to qualify type of file. It's an XSD file created by Microsoft's XSD creator.
It turns out that what I'm seeing is a Byte Order Mark (BOM), a marker at the very start of the file that tells whatever is loading the document how it is encoded. In my case the file is encoded in UTF-8, so the corresponding BOM was the byte sequence EF BB BF. To remove it, I opened the file in Notepad++ and clicked on "Encode in UTF-8 without BOM".
To actually see the BOM, I had to open the file in TextPad in Binary mode and then run a Google search for "EF BB BF". It took me about 8 hours to find out this was what was causing the error, so I thought I'd share this with everyone.
Update: If I had read Joel Spolsky's blog post: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), then I might not have had this problem.
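If you would rather script the fix than click through an editor, the check and removal described above can be sketched in a few lines of Python. This is only a minimal sketch, not what I actually used: the function names `strip_utf8_bom` and `remove_bom_from_file` are made up for illustration, and it assumes the file is small enough to read into memory and really is UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf".
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(path: str) -> bool:
    """Rewrite the file in place without its BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched. (Hypothetical helper, for illustration only.)
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(strip_utf8_bom(data))
    return True
```

If you only need to read the file and not rewrite it, opening it with `open(path, encoding="utf-8-sig")` makes Python skip the BOM when one is present.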
Here's how you do it with vim: