Converting wide char string to lowercase in C++

How do I convert a wchar_t string from upper case to lower case in C++?

The string contains a mixture of Japanese, Chinese, German and Greek characters.

I thought about using towlower...

http://msdn.microsoft.com/en-us/library/8h19t214%28VS.80%29.aspx

.. but the documentation says that:

The case conversion of towlower is locale-specific. Only the characters relevant to the current locale are changed in case.

Edit: Maybe I should describe what I'm doing. I receive a Unicode search query from a user. It's originally UTF-8 encoded, but I'm converting it to a wide-character string (I may be wrong on the wording). My debugger (VS2008) correctly shows the Japanese, German, etc. characters in the "variable quick watch". I need to go through another set of Unicode data and find matches of the search string. This is no problem when the search is case sensitive, but it is more problematic to do case insensitively. My (maybe naive) approach would be to convert both the search string and the data being searched to lower case and then compare them.

Tags: c++ lowercase widestring

4 Answers

Viruses.

Post #2 · 2019-02-17 08:50

You have a nasty problem on your hands. A Japanese locale will not help with converting German, and vice versa. Some languages also have no concept of capitalization (toupper and friends would be a no-op there, I suppose). So, can you break your string up into chunks of words from the same language? If you can, you could convert each piece under its own locale and then join the pieces back together.


倾城　Initia

Post #3 · 2019-02-17 08:57

This SO answer shows how to use facets to work with several locales. If this is on Windows, you can consider using Win32 API functions. If you can work with C++.NET (managed C++), you can use the char.ToLower and string.ToLower functions, which are Unicode compliant.
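As a rough illustration of the facet approach mentioned above, here is a minimal sketch that lowercases a wide string through the `std::ctype<wchar_t>` facet of whichever `std::locale` is passed in. The function name `to_lower_with_locale` is made up for this example and is not taken from the linked answer. Which characters actually change depends on the locale and on the standard library implementation.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper: lowercase `text` using the ctype facet of `loc`.
// Only characters that `loc` knows a lowercase mapping for are changed.
std::wstring to_lower_with_locale(std::wstring text, const std::locale& loc) {
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    if (!text.empty()) {
        // The range overload of ctype::tolower converts the buffer in place.
        facet.tolower(&text[0], &text[0] + text.size());
    }
    return text;
}

// Small self-check using the classic "C" locale, which every platform provides.
inline void to_lower_with_locale_example() {
    assert(to_lower_with_locale(L"ABC xyz", std::locale::classic()) == L"abc xyz");
}
```

To try a specific language, a named locale such as `std::locale("de_DE.UTF-8")` could be passed instead. Locale names are platform dependent, and the constructor throws `std::runtime_error` if the name is not available on the system.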


男人必须洒脱

Post #4 · 2019-02-17 08:59

Have a look at _wcslwr_l in <wchar.h> (MSDN).

You should be able to run the function on the input for each of the locales.


Juvenile、少年°

Post #5 · 2019-02-17 09:10

If your string contains all those characters, the codeset must be Unicode-based. A proper implementation follows the Unicode Standard (Chapter 4, 'Character Properties'), which defines character properties including whether a character is upper case, what its lower case mapping is, and so on.

Given that preamble, the towlower() function from <wctype.h> is the correct tool to use. If it doesn't do the job, you have a QoI (Quality of Implementation) problem to discuss with your vendor. If you find the vendor unresponsive, then look at alternative libraries. In this case, you might consider ICU (International Components for Unicode).
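For illustration, a minimal sketch of the lowercase-then-compare search from the question, built on `std::towlower`. The helper names `to_lower_copy` and `contains_ignore_case` are hypothetical. The sketch assumes a per-character mapping is good enough. In the default "C" locale only ASCII letters are guaranteed to be converted, so a real program would likely call `std::setlocale` first. Where `wchar_t` is 16 bits wide (as on Windows), characters outside the Basic Multilingual Plane are stored as surrogate pairs, which a per-`wchar_t` function cannot map.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: return a lowercased copy of `text`, one wchar_t at a time.
std::wstring to_lower_copy(std::wstring text) {
    std::transform(text.begin(), text.end(), text.begin(), [](wchar_t c) {
        // towlower returns its argument unchanged when no mapping exists.
        return static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));
    });
    return text;
}

// Hypothetical helper: case-insensitive substring search that lowercases
// both sides and then does an ordinary find.
bool contains_ignore_case(const std::wstring& haystack, const std::wstring& needle) {
    return to_lower_copy(haystack).find(to_lower_copy(needle)) != std::wstring::npos;
}
```

For example, `contains_ignore_case(L"Find The NEEDLE here", L"needle")` returns `true`. If many queries run against the same data, lowercasing the data once up front avoids repeating the conversion on every search.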

