
Question:

I'm processing Unicode strings and want to report errors with the standard exceptions, but the messages those exceptions carry are plain char strings (what() returns a const char*), not Unicode.

Usually that isn't a problem, because I can write the error message in plain ASCII and still convey enough information. In this case, though, I want to include data from the original strings, and that data can be Unicode.

How do you handle Unicode messages in your exceptions? Do you create your own custom exception class, do you derive from the standard exceptions and extend them to Unicode, or do you have some other solution (such as a rule like "don't use Unicode in exceptions")?

Answer 1:

I think Peter Dimov's rationale as pointed out in the Boost error handling guidelines covers this well:

Don't worry too much about the what() message. It's nice to have a message that a programmer stands a chance of figuring out, but you're very unlikely to be able to compose a relevant and user-comprehensible error message at the point an exception is thrown. Certainly, internationalization is beyond the scope of the exception class author. Peter Dimov makes an excellent argument that the proper use of a what() string is to serve as a key into a table of error message formatters. Now if only we could get standardized what() strings for exceptions thrown by the standard library...
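
As a rough illustration of the "what() string as a key into a table of formatters" idea, here is a minimal sketch. The key names, the formatter_table() function and the German messages are all made up for the example; neither Boost nor the standard library defines them.

```cpp
#include <exception>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical formatter type: produces the user-facing (possibly localized)
// message for one error key.
using Formatter = std::function<std::string()>;

// Hypothetical table mapping stable what() keys to formatters.
inline const std::map<std::string, Formatter>& formatter_table() {
    static const std::map<std::string, Formatter> table = {
        {"parse.unexpected_token",
         [] { return std::string("Unerwartetes Zeichen in der Eingabe"); }},
        {"io.file_not_found",
         [] { return std::string("Datei nicht gefunden"); }},
    };
    return table;
}

// Look up the what() string; fall back to the raw key when no formatter
// is registered for it.
inline std::string user_message(const std::exception& e) {
    const auto& table = formatter_table();
    const auto it = table.find(e.what());
    return it != table.end() ? it->second() : std::string(e.what());
}
```

The throw site only needs an ASCII key, e.g. throw std::runtime_error("io.file_not_found"), and the code that reports the error decides how to render it.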

Answer 2:

(I'm answering my own question after an insight prompted by Flodin's answer.)

In my particular case I have a string that may contain Unicode characters, which I am parsing and therefore expect to be in a certain format. The parsing may fail and throw an exception to indicate that a problem occurred.
Originally I intended to build a programmer-readable message inside the exception, detailing the contents of the string where parsing failed. That's where I ran into trouble, because the message of a standard exception has no standard way to carry Unicode characters.

However, the new design I am considering is to return the location of the parsing error in the string through the exception mechanism within a std::exception-derived class. The process of creating a programmer-readable message that contains the parts of the string causing the error can be delegated to a handler outside the class. This feels like a much cleaner design to me.
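
A minimal sketch of that design might look like the following. ParseError, parse_digits and describe are hypothetical names invented for the illustration, and std::wstring stands in for whatever Unicode string type is actually in use.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical exception type: it carries only the offset of the failure
// and keeps its what() string ASCII-only.
class ParseError : public std::exception {
public:
    explicit ParseError(std::size_t position) noexcept : position_(position) {}
    const char* what() const noexcept override { return "parse error"; }
    std::size_t position() const noexcept { return position_; }

private:
    std::size_t position_;
};

// Toy parser: accepts only the digits 0-9 and throws at the first other
// character, reporting where it was found.
inline void parse_digits(const std::wstring& input) {
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] < L'0' || input[i] > L'9') {
            throw ParseError(i);
        }
    }
}

// The handler still owns the original string, so it can build the
// programmer-readable message outside the exception class.
inline std::wstring describe(const std::wstring& input, const ParseError& e) {
    return L"Parse error at offset " + std::to_wstring(e.position()) +
           L": '" + input.substr(e.position(), 1) + L"'";
}
```

A caller would wrap parse_digits(input) in a try block and pass the caught ParseError together with input to describe.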

Thank you for the input, everyone!

Answer 3:

If you really want Unicode, you could encode the exception message as UTF-8 and put a BOM at the beginning, so that when you prepare the message for output you can tell whether it is UTF-8, raw char, or some other encoding.
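
One possible way to do this is sketched below, assuming the message is already UTF-8 encoded when it is thrown. The helper names make_utf8_error, is_utf8_tagged and utf8_payload are hypothetical.

```cpp
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// The UTF-8 byte order mark is the three bytes EF BB BF.
constexpr char kUtf8Bom[] = "\xEF\xBB\xBF";

// Build a standard exception whose what() string starts with the BOM.
inline std::runtime_error make_utf8_error(const std::string& utf8_message) {
    return std::runtime_error(kUtf8Bom + utf8_message);
}

// True if the what() string begins with the UTF-8 BOM.
inline bool is_utf8_tagged(const std::exception& e) {
    return std::strncmp(e.what(), kUtf8Bom, 3) == 0;
}

// Return the message without the BOM; untagged messages are returned as-is.
inline std::string utf8_payload(const std::exception& e) {
    return is_utf8_tagged(e) ? std::string(e.what() + 3)
                             : std::string(e.what());
}
```

The output code can then branch on is_utf8_tagged(e) and treat untagged messages as plain char text.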

Answer 4:

We use our own exception class. If that's not possible, you can always convert from Unicode to MBCS in the current character set; you usually need this text only for a short while, so no further conversion is needed.

Answer 5:

I would suggest deriving from std::exception and extending it to use your Unicode string class. Deriving from std::exception lets you write:

catch (std::exception&)...

as your last catch and have it catch any exception you might have thrown (as well as those thrown by the standard library). Whereas if you create your own base exception (and have your other exceptions derive from that), you would need to add another catch.

Either way, I don't think it matters much, but I prefer this style. (On implementations where std::exception itself stores a message, that storage goes unused, but I doubt it makes a measurable difference.)
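
A minimal sketch of such a class, assuming std::wstring as the Unicode string class; UnicodeError and wide_what are hypothetical names:

```cpp
#include <exception>
#include <string>
#include <utility>

// Hypothetical exception type that derives from std::exception but keeps
// its real message in a wide string.
class UnicodeError : public std::exception {
public:
    explicit UnicodeError(std::wstring message)
        : message_(std::move(message)) {}

    // ASCII-only fallback for handlers that only know std::exception.
    const char* what() const noexcept override {
        return "UnicodeError (see wide_what())";
    }

    // Full message for handlers that know about this class.
    const std::wstring& wide_what() const noexcept { return message_; }

private:
    std::wstring message_;
};
```

A final catch (const std::exception&) still catches it. A handler that wants the full text can catch UnicodeError directly, or use dynamic_cast on the caught base reference.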