Upper vs Lower Case

When doing case-insensitive comparisons, is it more efficient to convert the string to upper case or lower case? Does it even matter?

It is suggested in this SO post that C# is more efficient with ToUpper because "Microsoft optimized it that way." But I've also read this argument that converting ToLower vs. ToUpper depends on what your strings contain more of, and that typically strings contain more lower case characters which makes ToLower more efficient.

In particular, I would like to know:

Is there a way to optimize ToUpper or ToLower such that one is faster than the other?
Is it faster to do a case-insensitive comparison between upper or lower case strings, and why?
Are there any programming environments (e.g. C, C#, Python, whatever) where one case is clearly better than the other, and why?

Tags: string language-agnostic uppercase

10 answers

无色无味的生活

#2 · 2019-01-01 05:39

From Microsoft on MSDN:

Best Practices for Using Strings in the .NET Framework

Recommendations for String Usage

Use the String.ToUpperInvariant method instead of the String.ToLowerInvariant method when you normalize strings for comparison.

Why? From Microsoft:

Normalize strings to uppercase

There is a small group of characters that when converted to lowercase cannot make a round trip.

What is an example of such a character that cannot make a round trip?

Start: Greek Rho Symbol (U+03f1) ϱ
Uppercase: Capital Greek Rho (U+03a1) Ρ
Lowercase: Small Greek Rho (U+03c1) ρ

ϱ , Ρ , ρ

That is why, if you want to do case-insensitive comparisons, you convert the strings to uppercase, and not lowercase.
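
The rho example can be checked with a short sketch. It uses Python's `str.upper()` and `str.lower()`, which follow the Unicode case mappings, as a stand-in for the .NET invariant methods; the helper names `equal_via_upper` and `equal_via_lower` are made up for illustration.

```python
# Python's str.upper()/str.lower() are used here as a stand-in for
# String.ToUpperInvariant/ToLowerInvariant (an assumption of this sketch).
RHO_SYMBOL = "\u03f1"   # ϱ GREEK RHO SYMBOL
CAPITAL_RHO = "\u03a1"  # Ρ GREEK CAPITAL LETTER RHO
SMALL_RHO = "\u03c1"    # ρ GREEK SMALL LETTER RHO


def equal_via_upper(a: str, b: str) -> bool:
    """Compare after normalizing both strings to uppercase."""
    return a.upper() == b.upper()


def equal_via_lower(a: str, b: str) -> bool:
    """Compare after normalizing both strings to lowercase."""
    return a.lower() == b.lower()


# ϱ uppercases to Ρ, but Ρ lowercases to ρ, so the round trip loses ϱ.
round_trip = RHO_SYMBOL.upper().lower()
print(round_trip == RHO_SYMBOL)

# Uppercasing maps both rho forms to Ρ; lowercasing leaves them distinct.
print(equal_via_upper(RHO_SYMBOL, SMALL_RHO))
print(equal_via_lower(RHO_SYMBOL, SMALL_RHO))
```

With uppercase normalization the two rho forms compare equal, while with lowercase normalization they do not, which is the asymmetry the recommendation is about.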

伤终究还是伤i

#3 · 2019-01-01 05:40

Converting to either upper case or lower case in order to do case-insensitive comparisons is incorrect due to "interesting" features of some cultures, particularly Turkey. Instead, use a StringComparer with the appropriate options.

MSDN has some great guidelines on string handling. You might also want to check that your code passes the Turkey test.
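
As a rough illustration of why the Turkish letters cause trouble, here is a sketch in Python rather than .NET. Python has no `StringComparer`; `str.casefold()` is the closest built-in and is not culture-aware, so this only shows the culture-independent Unicode behaviour. The helper name `equals_ignore_case` is hypothetical.

```python
DOTLESS_I = "\u0131"      # ı LATIN SMALL LETTER DOTLESS I
DOTTED_CAP_I = "\u0130"   # İ LATIN CAPITAL LETTER I WITH DOT ABOVE


def equals_ignore_case(a: str, b: str) -> bool:
    """Culture-independent caseless comparison using Unicode case folding."""
    return a.casefold() == b.casefold()


# Uppercasing collapses dotless ı into plain I, so "ı" and "i" look equal
# after upper() even though Turkish treats them as different letters.
print(DOTLESS_I.upper() == "i".upper())

# Lowercasing İ yields two code points: "i" followed by a combining dot above.
print(len(DOTTED_CAP_I.lower()))

# Case folding handles ordinary text without picking a direction by hand.
print(equals_ignore_case("README", "readme"))
print(equals_ignore_case(DOTLESS_I, "i"))
```

A culture-aware comparison, which is what the answer recommends in .NET, would additionally need to know that in Turkish the uppercase of "i" is "İ"; nothing in this sketch does that.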

EDIT: Note Neil's comment around ordinal case-insensitive comparisons. This whole realm is pretty murky :(

梦寄多情

#4 · 2019-01-01 05:42

Microsoft has optimized ToUpperInvariant(), not ToUpper(). The difference is that the invariant version ignores the current culture, so it gives the same result regardless of the user's locale. If you need to do case-insensitive comparisons on strings that may vary in culture, use Invariant; otherwise the performance of invariant conversion shouldn't matter.

I can't say whether ToUpper() or ToLower() is faster though. I've never tried it since I've never had a situation where performance mattered that much.

高级女魔头

#5 · 2019-01-01 05:42

It really shouldn't ever matter. With ASCII characters, it definitely doesn't matter - it's just a few comparisons and a bit flip for either direction. Unicode might be a little more complicated, since there are some characters that change case in weird ways, but there really shouldn't be any difference unless your text is full of those special characters.
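
The "few comparisons and a bit flip" for ASCII can be sketched as follows. This is a minimal illustration, not how any particular runtime implements it; `ascii_upper` and `ascii_lower` are hypothetical names.

```python
ASCII_CASE_BIT = 0x20  # the single bit that separates 'A' (0x41) from 'a' (0x61)


def ascii_upper(text: str) -> str:
    """Uppercase ASCII letters only: one range check, then clear the case bit."""
    return "".join(
        chr(ord(ch) & ~ASCII_CASE_BIT) if "a" <= ch <= "z" else ch
        for ch in text
    )


def ascii_lower(text: str) -> str:
    """Lowercase ASCII letters only: one range check, then set the case bit."""
    return "".join(
        chr(ord(ch) | ASCII_CASE_BIT) if "A" <= ch <= "Z" else ch
        for ch in text
    )


print(ascii_upper("Hello, World!"))
print(ascii_lower("Hello, World!"))
```

Both directions do the same amount of work per character, which is why neither is inherently faster for ASCII input.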

零度萤火

#6 · 2019-01-01 05:42

It depends. As stated above, for plain ASCII it's identical. In .NET, read about and use String.Compare; it's correct for the i18n stuff (languages, cultures, and Unicode). If you know anything about the likely case of the input, convert to the more common case.

Remember, if you are doing multiple string compares length is an excellent first discriminator.
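
A minimal sketch of the length-first idea, assuming ASCII-only case folding. The length shortcut is only safe under that assumption, because full Unicode case mappings can change a string's length (for example, "ß".upper() is "SS" in Python). The function names are hypothetical.

```python
def _fold_ascii(code: int) -> int:
    """Map ASCII 'A'..'Z' onto 'a'..'z'; leave every other code point alone."""
    return code | 0x20 if 0x41 <= code <= 0x5A else code


def ascii_equals_ignore_case(a: str, b: str) -> bool:
    """Caseless comparison that folds ASCII letters only."""
    # Cheap rejection first: strings of different lengths cannot match.
    if len(a) != len(b):
        return False
    # Compare character by character without building converted copies.
    return all(_fold_ascii(ord(x)) == _fold_ascii(ord(y)) for x, y in zip(a, b))


print(ascii_equals_ignore_case("Content-Type", "content-type"))
print(ascii_equals_ignore_case("Content-Type", "Content-Length"))
```

When many candidates are compared against one key, most mismatches are rejected by the length check before any character is examined.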

无色无味的生活

#7 · 2019-01-01 05:49

If you are doing string comparison in C#, it is significantly faster to use .Equals() with a case-insensitive StringComparison (for example StringComparison.OrdinalIgnoreCase) instead of converting both strings to upper or lower case. Another big plus for using .Equals() is that no extra memory is allocated for the 2 new upper/lower case strings.
