Strings joining and complexity?

2019-03-30 09:05发布

When I need to join two strings I use String.Format (or StringBuilder if it happens in several places in the code).

I see that some good programmers doesn't give attention to strings joining complexity and just use the '+' operator.

I know that using the '+' operator make the application to use more memory, but what about complexity?

12条回答
萌系小妹纸
2楼-- · 2019-03-30 09:28

I benchmarked this forever ago, and it hasn't really made a differences since .NET 1.0 or 1.1.

Back then if you had some process that was going to hit a line of code that was concatinating strings a few million times you could get a huge speed increase by using String.Concat, String.Format, or StringBuilder.

Now it doesn't matter at all. At least it hasn't mattered since .Net 2.0 came out anyhow. Put it out of your mind and code in whatever manner makes it easiest for you to read.

查看更多
在下西门庆
3楼-- · 2019-03-30 09:35

This is an excellent article about the different string join methods by our own Jeff Atwood on Coding Horror:

alt text http://www.codinghorror.com/blog/images/coding-horror-official-logo-small.png

The Sad Tragedy of Micro-Optimization Theater

Here is the gist of the post.

[several string join methods shown]

Take your itchy little trigger finger off that compile key and think about this for a minute. Which one of these methods will be faster?

Got an answer? Great!

And.. drumroll please.. the correct answer:

It. Just. Doesn't. Matter!

查看更多
等我变得足够好
4楼-- · 2019-03-30 09:40

Because strings are immutable in languages like Java and C#, everytime two strings are concatenated a new string has to be created, in which the contents of the two old strings are copied.

Assume strings which are on average c characters long.

Now the first concatenation only has to copy 2*c characters, but the last one has to copy the concatenation of the first n-1 strings, which is (n-1)*c characters long, and the last one itself, which is c characters long, for a total of n*c characters. For n concatenations this makes n^2*c/2 character copies, which means an algorithmic complexity of O(n^2).

In most cases in practice however this quadratic complexity will not be noticeable (as Jeff Atwood shows in the blog entry linked to by Robert C. Cartaino) and I'd advise to just write the code as readable as possible.

There are cases however when it does matter, and using O(n^2) in such cases may be deadly.

In practice I've seen this for example for generating big Word XML files in memory, including base64 encoded pictures. This generation used to take over 10 minutes due to using O(n^2) string concatenation. After I replaced concatenation using + with StringBuilder the running time for the same document reduced below 10 seconds.

Similarly I've seen a piece of software that generated an epically big piece of SQL code as a string using + for concatenation. I haven't even waited till this finished (had been waiting for over an hour already), but just rewrote it using StringBuilder. This faster version finished within a minute.

In short, just do whatever is most readable / easiest to write and only think about this when you'll be creating a freaking huge string :-)

查看更多
叛逆
5楼-- · 2019-03-30 09:43

Plenty of input already, but I've always felt that the best way to approach the issue of performance is to understand the performance differences of all viable solutions and for those that meet performance requirements, pick the one that is the most reliable and the most supportable.

There are many who use Big O notation to understand complexity, but I've found that in most cases (including understanding which string concatenation methods work best), a simple time trial will suffice. Just compare strA+strB to strA.Append(strB) in a loop of 100,000 iterations to see which works faster.

查看更多
霸刀☆藐视天下
6楼-- · 2019-03-30 09:45

Depends on the situation. The + can sometimes reduce the complexity of the code. Consider the following code:

output = "<p>" + intro + "</p>";

That is a good, clear line. No String.Format required.

查看更多
再贱就再见
7楼-- · 2019-03-30 09:48

If you use + only once, you have no disadvantage from it and it increases readability (as Colin Pickard already stated).

As far as I know + means: take left operand and right operand and copy them into a new buffer (as strings are immutable).

So using + two times (as in Colin Pickards example you already create 2 temporary strings. First when adding "<p>" to intro and then when adding "</p>" to the newly created string.

You have to consider for yourself when to use which method. Even for a small example like seen above performance drop can be serious if intro is a large enough string.

查看更多
登录 后发表回答