Small open source Unicode library for C/C++

Posted 2020-05-23 09:52

Does anyone know of a great small open source Unicode handling library for C or C++? I've looked at ICU, but it seems way too big.

I need the library to support:

  • all the normal encodings
  • normalization
  • character classification, e.g. finding out whether a character should be allowed in identifiers and comments
  • validation, i.e. recognizing malformed or nonsensical input
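To make the validation requirement concrete, here is a minimal sketch of the kind of check meant, assuming the input is supposed to be UTF-8. The function name `is_valid_utf8` is hypothetical and not taken from any library; it rejects bad lead bytes, truncated sequences, overlong forms, surrogates, and values above U+10FFFF, and does nothing about normalization or character classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from any library): returns true if `s` is
// well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const std::uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        std::uint32_t cp = 0;

        // Classify the lead byte and extract its payload bits.
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // sequence is cut off

        // Every following byte must look like 10xxxxxx.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }

        if (len > 1 && cp < min_cp[len]) return false;   // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate

        i += len;
    }
    return true;
}
```

A library that covers this point would be expected to reject at least the same inputs, for example `"\xC0\x80"` (an overlong encoding of U+0000).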

Tags: c++ c unicode
4 answers
我欲成王,谁敢阻挡
#2 · 2020-05-23 10:42

UTF8-CPP was recommended in the accepted answer to a similar question: Portable and simple unicode string library for C/C++?

走好不送
#3 · 2020-05-23 10:43

Well, iconv is a good starting point at least.

Also, a Google search turns up another Stack Overflow question (the horror!): Light C unicode library

Explosion°爆炸
#4 · 2020-05-23 10:43

How many features do you really need? In many cases I find that converting to one encoding internally (e.g. UTF-8) and handling the various other encodings only when loading or saving is more than sufficient. If you are willing to spend a little time writing a class to handle that, I'm sure you will reuse it again and again.

I have one lying around somewhere, but IIRC the UTF-32LE/BE handling is untested: http://aaq.cc/d

If your project really does need to handle various encodings other than to load/save files then you are probably best off with a library though...
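As a rough illustration of the "convert once at load time" idea, the sketch below turns a buffer of UTF-16LE bytes into a UTF-8 `std::string`. It is not the class linked above; the names `append_utf8` and `utf16le_to_utf8` are made up for this example, and it assumes the caller already knows the input is UTF-16LE with no byte order mark.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode scalar value to `out` as UTF-8.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical load-time converter: raw UTF-16LE bytes -> UTF-8 string.
// Throws std::invalid_argument on an odd byte count or unpaired surrogates.
std::string utf16le_to_utf8(const std::string& bytes) {
    if (bytes.size() % 2 != 0) {
        throw std::invalid_argument("odd number of bytes in UTF-16 input");
    }
    // Read the little-endian 16-bit code unit starting at byte offset i.
    auto unit = [&bytes](std::size_t i) -> std::uint32_t {
        const std::uint32_t lo = static_cast<unsigned char>(bytes[i]);
        const std::uint32_t hi = static_cast<unsigned char>(bytes[i + 1]);
        return lo | (hi << 8);
    };

    std::string out;
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        std::uint32_t cp = unit(i);
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 4 > bytes.size()) {
                throw std::invalid_argument("truncated surrogate pair");
            }
            const std::uint32_t low = unit(i + 2);
            if (low < 0xDC00 || low > 0xDFFF) {
                throw std::invalid_argument("unpaired high surrogate");
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            i += 2;  // the pair consumed one extra code unit
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With a handful of such functions at the file boundary (one per input encoding, plus the reverse direction for saving), the rest of the program only ever sees UTF-8.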

冷血范
#5 · 2020-05-23 10:50

I looked at UTF8-CPP and libiconv, and neither seemed to have all the features I need. So I guess I'll just use ICU, even though it is really big. I think there are ways to strip out the unneeded functions and data, so I'll try that. This page (under "Customizing ICU's Data Library") describes how to cut out some of the data.
