Is list join really faster than string concatenati

Posted 2019-01-28 05:28

I find that string concatenation seems to compile to less Python bytecode than list join.

This is an example.

test.py:

a = ''.join(['a', 'b', 'c'])
b = 'a' + 'b' + 'c'

Then I ran python -m dis test.py and got the following bytecode (Python 2.7):

  1           0 LOAD_CONST               0 ('')
              3 LOAD_ATTR                0 (join)
              6 LOAD_CONST               1 ('a')
              9 LOAD_CONST               2 ('b')
             12 LOAD_CONST               3 ('c')
             15 BUILD_LIST               3
             18 CALL_FUNCTION            1
             21 STORE_NAME               1 (a)

  3          24 LOAD_CONST               6 ('abc')
             27 STORE_NAME               2 (b)
             30 LOAD_CONST               4 (None)
             33 RETURN_VALUE  

Obviously, string concatenation produces fewer bytecode instructions. It just loads the string 'abc' directly.

Can anyone explain why we always say that list join is much better?
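One way to check whether the shorter bytecode comes from the compiler folding the literals into a single constant, rather than from + being cheaper at run time, is to compile the same expression with variables instead of literals. This is a minimal sketch that assumes Python 3 (dis.get_instructions does not exist in 2.7); the helper name has_binary_op is made up for illustration.

```python
import dis

def has_binary_op(source):
    """Return True if compiling `source` emits any BINARY_* instruction."""
    code = compile(source, "<example>", "exec")
    return any(ins.opname.startswith("BINARY_")
               for ins in dis.get_instructions(code))

# Literal operands: the expression can be evaluated at compile time.
literal_src = "b = 'a' + 'b' + 'c'"
# Variable operands: the additions have to happen at run time.
variable_src = "x = 'a'\ny = 'b'\nz = 'c'\nb = x + y + z"

literal_has_add = has_binary_op(literal_src)
variable_has_add = has_binary_op(variable_src)
# True when the compiler stored the already-joined string as a constant.
folded_constant = "abc" in compile(literal_src, "<example>", "exec").co_consts

print(literal_has_add, variable_has_add, folded_constant)
```

If the literal version shows no add instruction while the variable version does, the bytecode in the question reflects compile-time constant folding, not the run-time cost of + versus join.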

3 answers
混吃等死
#2 · 2019-01-28 06:13

From Efficient String Concatenation in Python

Method 1 : 'a' + 'b' + 'c'

Method 6 : a = ''.join(['a', 'b', 'c'])

20,000 integers were concatenated into a string 86 kB long:


                Concatenations per second     Process size (kB)
  Method 1               3770                    2424
  Method 6               119,800                 3000

Conclusion: YES, str.join() is significantly faster than typical concatenation (str1+str2).
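The comparison can be re-run locally with timeit. The sketch below is not the cited article's code; the function names concat_loop and join_list and the repeat count are assumptions chosen for illustration. It builds the same kind of string (20,000 integers) both ways. Timings depend heavily on the interpreter: CPython can often extend a string in place for +=, so the gap may be much smaller than in the table above.

```python
import timeit

def concat_loop(count):
    """Method 1 style: grow the string with repeated +=."""
    out = ''
    for num in range(count):
        out += str(num)
    return out

def join_list(count):
    """Method 6 style: collect the pieces, then join them once."""
    return ''.join([str(num) for num in range(count)])

if __name__ == '__main__':
    count = 20000
    for func in (concat_loop, join_list):
        # Total seconds for 20 calls; lower is faster.
        seconds = timeit.timeit(lambda: func(count), number=20)
        print(func.__name__, round(seconds, 4))
```

Both functions return the same string, so only the way it is built differs between the two measurements.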

男人必须洒脱
#3 · 2019-01-28 06:19

Don't believe it! Always get proof!

Source: I stared at python source code for an hour and calculated complexities!

My findings.

For 2 strings. (Assume n is the length of both strings)

Concat (+) - O(n)
Join - O(n+k) effectively O(n)
Format - O(2n+k) effectively O(n)

For more than 2 strings. (Assume n is the length of all strings)

Concat (+) - O(n^2)
Join - O(n+k) effectively O(n)
Format - O(2n+k) effectively O(n)
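The quadratic figure for repeated concatenation can be illustrated with a simple cost model that counts copied characters. This is a sketch under a stated assumption: every + allocates a fresh string and copies both operands, and join copies each piece exactly once. Real interpreters may do better (CPython sometimes extends a string in place), and the function names are made up for illustration.

```python
def copies_with_concat(lengths):
    """Characters copied when a result is grown piece by piece from ''."""
    total = 0
    current = 0
    for length in lengths:
        current += length  # size of the new intermediate string
        total += current   # the whole intermediate is copied again
    return total

def copies_with_join(lengths):
    """Characters copied when each piece is copied once into the result."""
    return sum(lengths)

pieces = [1] * 1000  # 1000 one-character strings
print(copies_with_concat(pieces), copies_with_join(pieces))
```

Under this model, 1000 one-character pieces cost 1 + 2 + ... + 1000 = 500500 copied characters with repeated concatenation, against 1000 with join.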

RESULT:

If you have two strings, concatenation (+) is technically better, though in practice it performs the same as join and format.

If you have more than two strings, concat becomes awful, and join and format are effectively the same, though technically join is a bit better.

SUMMARY:

If you don't care for efficiency use any of the above. (Though since you asked the question I would assume you care)

Therefore -

If you have 2 strings, use concat (when not in a loop!).

If you have more than two strings (all strings) (or in a loop), use join.

If you have anything not strings, use format, because duh.
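The point about non-string values can be seen in a small example: join only accepts strings, while format converts each value for you. The sample values are made up for illustration.

```python
values = [1, 2.5, None]  # hypothetical mixed, non-string data

# join raises TypeError when an item is not a string.
try:
    ''.join(values)
    join_failed = False
except TypeError:
    join_failed = True

# Converting each item first makes join usable again.
joined = ''.join(str(value) for value in values)

# format applies the conversion itself.
formatted = '{}{}{}'.format(*values)

print(join_failed, joined, formatted)
```

So join still works on mixed data if each item is passed through str() first; format just saves that step.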

Hope this helps!

ら.Afraid
#4 · 2019-01-28 06:23

Because

''.join(my_list)

is much better than

my_list[0] + my_list[1]

and better than

my_list[0] + my_list[1] + my_list[2]

and better than

my_list[0] + my_list[1] + my_list[2] + my_list[3]

and better…

In short:

print 'better than'
print ' + '.join('my_list[{}]'.format(i) for i in xrange(x))

for any x.
