Why can + (plus) concatenate two strings in Python?

Posted 2019-02-11 07:15

Question:

I'm working through Learn Python the Hard Way.

w = "This is the left side of..."
e = "a string with a right side."
print w + e

Explain why adding the two strings w and e with + makes a longer string.

I know that it works, but I don't understand why or how. Please help me.

Answer 1:

Python uses + to concatenate strings because that is how the core developers of Python defined the operator.

While it's true that the __add__ special method is normally used to implement the + operator, + (the BINARY_ADD bytecode instruction) does not call str.__add__, because + treats strings specially in both Python 2 and Python 3. If both operands of + are strings, Python invokes the string concatenation function directly, which eliminates the need to call special methods.
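To see which instruction + compiles to on your own interpreter, you can disassemble a small function. This is only a sketch: the function name concat is made up for illustration, and the opcode name depends on the CPython version. The answer's BINARY_ADD applies to the versions quoted here, while CPython 3.11 and later emit a generic BINARY_OP instruction instead.

```python
import dis

def concat(w, e):
    # Hypothetical helper: the + below is what we want to inspect
    return w + e

# Collect the opcode names of the compiled function
ops = [ins.opname for ins in dis.get_instructions(concat)]
print(ops)

# Older CPython versions show BINARY_ADD; 3.11+ show BINARY_OP
print([op for op in ops if op in ("BINARY_ADD", "BINARY_OP")])
```

Either way, the instruction is the same for every operand type; the string special case is decided at run time inside the interpreter.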

Python 3 calls unicode_concatenate (source code):

TARGET(BINARY_ADD) {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *sum;
    if (PyUnicode_CheckExact(left) &&
             PyUnicode_CheckExact(right)) {
        sum = unicode_concatenate(left, right, f, next_instr);
        /* unicode_concatenate consumed the ref to v */
    }
    else {
        sum = PyNumber_Add(left, right);
        Py_DECREF(left);
    }
    ...

Python 2 calls string_concatenate (source code):

case BINARY_ADD:
    w = POP();
    v = TOP();
    if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: int + int */
        register long a, b, i;
        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);
        /* cast to avoid undefined behaviour
           on overflow */
        i = (long)((unsigned long)a + b);
        if ((i^a) < 0 && (i^b) < 0)
            goto slow_add;
        x = PyInt_FromLong(i);
    }
    else if (PyString_CheckExact(v) &&
             PyString_CheckExact(w)) {
        x = string_concatenate(v, w, f, next_instr);
        /* string_concatenate consumed the ref to v */
        goto skip_decref_vx;
    }
    else {
      slow_add:
        x = PyNumber_Add(v, w);

    ...

This optimization has been in Python since 2004. From issue 980695:

... in the attached patch ceval.c special-cases addition of two strings (in the same way as it special-cases addition of two integers already)

Note, however, that the main goal of that patch was broader than eliminating the special-method lookup.


For what it's worth, str.__add__ still works as expected:

>>> w.__add__(e)
'This is the left side of...a string with a right side.'

and Python will call __add__ methods of subclasses of str, because PyUnicode_CheckExact(left) && PyUnicode_CheckExact(right) (or PyString_CheckExact(v) && PyString_CheckExact(w), in Python 2) from the code snippets above will be false:

>>> class STR(str):
...     def __add__(self, other):
...         print('calling __add__')
...         return super().__add__(other)
... 
>>> STR('abc') + STR('def')
calling __add__
'abcdef'
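The exact-type check can be modelled roughly in pure Python. This is a hypothetical sketch of the branch, not the interpreter's actual code: binary_add_model and the "fast"/"generic" labels are invented for illustration, type(x) is str stands in for PyUnicode_CheckExact, and operator.add stands in for PyNumber_Add.

```python
import operator

def binary_add_model(left, right):
    """Hypothetical model of the branch; returns (path_taken, result)."""
    if type(left) is str and type(right) is str:
        # Both operands are exactly str: stand-in for the direct concatenation call
        return "fast", "".join((left, right))
    # Anything else (including str subclasses) uses the generic protocol,
    # which looks up __add__ / __radd__
    return "generic", operator.add(left, right)

class STR(str):
    def __add__(self, other):
        return "overridden"

print(binary_add_model("abc", "def"))       # ('fast', 'abcdef')
print(binary_add_model(STR("abc"), "def"))  # ('generic', 'overridden')
```

Note that isinstance(left, str) would be the wrong test here, because it is also true for subclasses and would skip their overridden __add__.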


Answer 2:

The __add__ method of the class str is called when you add (concatenate) two strings in this manner. Conceptually, __add__ works as shown below; note that this is an illustration, not verbatim source code:

def __add__(self, other):
    str3 = [self]  # Put the first string in a list
    str3.append(other)  # Add the second string to the list
    return ''.join(str3)  # Join the pieces into one new string

Example:

class MyString:
    def __init__(self, initstr):
        self.string = initstr
    def __add__(self, other):
        str3 = [self.string]  # Put the first string in a list
        str3.append(other.string)  # Add the second string to the list
        return MyString(''.join(str3))  # Wrap the joined result in a new MyString

>>> a = MyString("Hello")
>>> b = MyString(" World!")
>>> c = a+b
>>> c.string
'Hello World!'
>>> 


Answer 3:

+, -, *, and other operators will work on anything that implements the right methods. Since strings implement the __add__(self, other) method, you can add two strings with +.

Try this: define your own string subclass and override its __add__ method:

class BadString(str):
    def __add__(self, other):
        # Ignore both input strings and just return this:
        return "Nothing Useful"

s = BadString("hello")

print("hello" + " world")    # "hello world"
print(s + "world")           # "Nothing Useful"

The same technique, operator overloading, lets you create classes that use built-in operators such as + or * in a meaningful way, for example vectors that can be added together.
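The vector case could look something like the following. This is a minimal sketch: the Vector class and its x and y fields are hypothetical and not taken from any library.

```python
class Vector:
    """Hypothetical 2D vector, used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Only Vector + Vector is supported; returning NotImplemented lets
        # Python try the other operand and raise TypeError if that fails too
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

v = Vector(1, 2) + Vector(3, 4)
print(v)  # Vector(4, 6)
```

Returning NotImplemented rather than raising directly is the usual convention, since it gives the right-hand operand a chance to handle the operation through __radd__.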