Small open source Unicode library for C/C++

Posted 2020-05-23 09:48

Question:

Does anyone know of a great small open source Unicode handling library for C or C++? I've looked at ICU, but it seems way too big.

I need the library to support:

  • all the normal encodings
  • normalization
  • finding character types - finding if a character should be allowed in identifiers and comments
  • validation - recognizing nonsense

Answer 1:

Well, iconv is a good starting point at least.

Also, a Google search turns up another Stack Overflow question! The horror! SO: Light C unicode library



Answer 2:

UTF8-CPP was recommended in the accepted answer to a similar question: Portable and simple unicode string library for C/C++?



Answer 3:

I looked at UTF8-CPP and libiconv, and neither seemed to have all the features I needed. So I guess I'll just use ICU, even though it is really big. I think there are some ways to strip out the unneeded functions and data, so I'll try that. This page (under "Customizing ICU's Data Library") describes how to cut out some of the data.



Answer 4:

How many features do you really need? In many cases I find that converting to one encoding internally (e.g. UTF-8) and handling the various other encodings only when loading or saving is more than sufficient. If you are willing to spend a little time writing a class to handle that, I'm sure you will reuse it again and again.
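To make the "one encoding internally" idea concrete, here is a minimal sketch of a load-time conversion from UTF-16LE bytes to a UTF-8 std::string. The function names append_utf8 and utf16le_to_utf8 are made up for illustration and are not from any library mentioned in this thread. It assumes the input has no byte order mark, and it covers only conversion, not normalization or character classification.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode scalar value to a UTF-8 string.
void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("not a Unicode scalar value");
    if (cp < 0x80) {                      // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {            // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                              // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical loader step: convert raw UTF-16LE bytes (no BOM assumed)
// to UTF-8. Throws std::invalid_argument on malformed input.
std::string utf16le_to_utf8(const std::string& bytes) {
    if (bytes.size() % 2 != 0)
        throw std::invalid_argument("odd number of bytes");

    // Read the 16-bit little-endian code unit starting at byte k.
    auto unit = [&bytes](std::size_t k) {
        return static_cast<char32_t>(
            static_cast<unsigned char>(bytes[k]) |
            (static_cast<unsigned char>(bytes[k + 1]) << 8));
    };

    std::string out;
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        char32_t cp = unit(i);
        if (cp >= 0xD800 && cp <= 0xDBFF) {          // high surrogate
            if (i + 3 >= bytes.size())
                throw std::invalid_argument("truncated surrogate pair");
            char32_t lo = unit(i + 2);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            assert(cp <= 0x10FFFF);
            i += 2;                                   // consumed second unit
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {    // stray low surrogate
            throw std::invalid_argument("unpaired low surrogate");
        }
        append_utf8(out, cp);
    }
    return out;
}
```

A caller would run something like utf16le_to_utf8 once when reading a file and keep plain UTF-8 strings everywhere else. Other input encodings would each get their own small decoder that feeds code points into the same append_utf8 step.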

I have one lying around somewhere, but IIRC the UTF-32LE/BE support is untested: http://aaq.cc/d

If your project really does need to handle various encodings for more than loading and saving files, then you are probably best off with a library, though.



Tags: c++ c unicode