Python string interning

While this question has no real use in practice, I am curious how Python does string interning. I have noticed the following.

>>> "string" is "string"
True

This is as I expected.

You can also do this.

>>> "strin"+"g" is "string"
True

And that's pretty clever!

But you can't do this.

>>> s1 = "strin"
>>> s2 = "string"
>>> s1+"g" is s2
False

Why wouldn't Python evaluate s1+"g", realise it is the same as s2, and point it to the same address? What is actually going on in that last block that makes it return False?

Tags: python string internals

2 Answers

荒废的爱情

Reply #2 · 2018-12-31 06:14

This is implementation-specific, but your interpreter is probably interning compile-time constants but not the results of run-time expressions.

In what follows I use CPython 2.7.3.

In the second example, the expression "strin"+"g" is evaluated at compile time, and is replaced with "string". This makes the first two examples behave the same.

If we examine the bytecodes, we'll see that they are exactly the same:

  # s1 = "string"
  2           0 LOAD_CONST               1 ('string')
              3 STORE_FAST               0 (s1)

  # s2 = "strin" + "g"
  3           6 LOAD_CONST               4 ('string')
              9 STORE_FAST               1 (s2)

The third example involves a run-time concatenation, the result of which is not automatically interned:

  # s3a = "strin"
  # s3 = s3a + "g"
  4          12 LOAD_CONST               2 ('strin')
             15 STORE_FAST               2 (s3a)

  5          18 LOAD_FAST                2 (s3a)
             21 LOAD_CONST               3 ('g')
             24 BINARY_ADD          
             25 STORE_FAST               3 (s3)
             28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
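The same comparison can be reproduced on a current interpreter with the dis module. This is a minimal sketch, assuming CPython (constant folding is an implementation detail); the helper name binary_opnames is made up for this example, and the exact opcode name for the addition differs between CPython versions, so the code only checks for the BINARY_ prefix.

```python
import dis


def binary_opnames(source):
    """Return the names of BINARY_* instructions in the compiled source.

    Hypothetical helper for this example: an empty list suggests the
    compiler folded the expression into a constant.
    """
    code = compile(source, "<example>", "exec")
    return [
        ins.opname
        for ins in dis.get_instructions(code)
        if ins.opname.startswith("BINARY_")
    ]


# Both operands are literals, so the compiler can fold the concatenation.
folded = binary_opnames('s2 = "strin" + "g"')

# One operand is a variable, so the concatenation has to happen at run time.
runtime = binary_opnames('s3a = "strin"\ns3 = s3a + "g"')

print("literal + literal :", folded)
print("variable + literal:", runtime)
```

If the first list is empty and the second contains one instruction, the interpreter is behaving as described: only the second expression is computed while the program runs.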

If you were to manually intern() the result of the third expression, you'd get the same object as before:

>>> s3a = "strin"
>>> s3 = s3a + "g"
>>> s3 is "string"
False
>>> intern(s3) is "string"
True
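The session above is Python 2, where intern() is a built-in. In Python 3 the function lives in the sys module as sys.intern. A minimal sketch of the same experiment follows; the variable names are chosen here for illustration, and the literal is passed through sys.intern explicitly so the result does not depend on which constants a particular interpreter interns on its own.

```python
import sys

prefix = "strin"
built = prefix + "g"               # concatenated at run time
canonical = sys.intern("string")   # the interned copy of "string"

# Interning maps equal strings to one shared object.
interned_built = sys.intern(built)

print(built == canonical)           # value comparison
print(interned_built is canonical)  # identity after interning
```

Note that recent Python 3 versions emit a SyntaxWarning for expressions such as s3 is "string", because identity comparison against a literal is almost always a mistake.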

君临天下

Reply #3 · 2018-12-31 06:20

Case 1

>>> x = "123"
>>> y = "123"
>>> x == y
True
>>> x is y
True
>>> id(x)
50986112
>>> id(y)
50986112

Case 2

>>> x = "12"
>>> y = "123"
>>> x = x + "3"
>>> x is y
False
>>> x == y
True

Your question is why the ids are the same in case 1 but not in case 2.
In case 1, you assigned the string literal "123" to both x and y.

Since strings are immutable, it makes sense for the interpreter to store the string literal only once and point all the variables to the same object.
Hence the ids are identical.

In case 2, you build a new string for x by concatenation at run time. x and y have the same value, but not the same identity.
They point to different objects in memory, so they have different ids and the is operator returns False.
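The practical consequence of the value/identity distinction is that string contents should be compared with ==, since whether two equal strings share one object is up to the interpreter. A small sketch, with variable names chosen for illustration:

```python
part = "12"
x = part + "3"   # new string built at run time
y = "123"        # string literal

values_equal = x == y   # compares contents, independent of interning
same_object = x is y    # compares identity, implementation-dependent

print("x == y:", values_equal)
print("x is y:", same_object)
print("ids   :", id(x), id(y))
```

Only the first result is something a program can rely on; the other two describe how this particular interpreter happened to allocate the strings.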
