Implementation and performance difference between

2019-02-20 19:21发布

问题:

What is the difference between inserting an element in a python list in the following ways?

myList.insert(at, myValue)
myList[at:at] = [myValue]

I have run some tests and the performance of the two are very similar, but the slicing insert consistently produces slightly better results. My question is regarding the difference in implementation and performance, not the behaviour.

回答1:

We have the same behaviour, see bellow:

The default behaviour is to insert the item at the given index; each value at greater index are shifted one position to the end.

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list.insert(5, 'item')
>>> my_list
['a', 'b', 'c', 'd', 'e', 'item', 'f', 'g']

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list.insert(-3, 'item')
>>> my_list
['a', 'b', 'c', 'd', 'item', 'e', 'f', 'g']

If the list is empty, the item is appended normally.

>>> my_list = []
>>> my_list.insert(5, 'item')
>>> my_list
['item']

>>> my_list = []
>>> my_list.insert(-3, 'item')
>>> my_list
['item']

If the index is out of bounds, the item is appended to the end if the index is positive or to the beginning if negative. No exception is raised.

>>> my_list = ['a', 'b']
>>> my_list.insert(5, 'item')
>>> my_list
['a', 'b', 'item']

>>> my_list = ['a', 'b']
>>> my_list.insert(-3, 'item')
>>> my_list
['item', 'a', 'b']

We have exactly the same behaviour with slice notation, in the case of a range of same indexes:

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list[5:5] = ['item']
>>> my_list
['a', 'b', 'c', 'd', 'e', 'item', 'f', 'g']

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list[-3:-3] = ['item']
>>> my_list
['a', 'b', 'c', 'd', 'item', 'e', 'f', 'g']

>>> my_list = []
>>> my_list[5:5] = ['item']
>>> my_list
['item']

>>> my_list = []
>>> my_list[-3:-3] = ['item']
>>> my_list
['item']

>>> my_list = ['a', 'b']
>>> my_list[5:5] = ['item']
>>> my_list
['a', 'b', 'item']

>>> my_list = ['a', 'b']
>>> my_list[-3:-3] = ['item']
>>> my_list
['item', 'a', 'b']

The slice notation is the same as calling __setitem__() method with a slice object:

>>> my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_list.__setitem__(slice(5, 5), ['item'])
>>> my_list
['a', 'b', 'c', 'd', 'e', 'item', 'f', 'g']


回答2:

Implementation details

According to the CPython implementation available on GitHub at https://github.com/python/cpython/blob/master/Objects/listobject.c and https://github.com/python/cpython/blob/master/Objects/listobject.c, we have:

The insert() method is defined in the following function:

static PyObject *
listinsert(PyListObject *self, PyObject *args)
{
    Py_ssize_t i;
    PyObject *v;
    if (!PyArg_ParseTuple(args, "nO:insert", &i, &v))
        return NULL;
    if (ins1(self, i, v) == 0)
        Py_RETURN_NONE;
    return NULL;
}

Which calls ins1() function, here is the C code:

static int
ins1(PyListObject *self, Py_ssize_t where, PyObject *v)
{
    Py_ssize_t i, n = Py_SIZE(self);
    PyObject **items;
    if (v == NULL) {
        PyErr_BadInternalCall();
        return -1;
    }
    if (n == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }

    if (list_resize(self, n+1) < 0)
        return -1;

    if (where < 0) {
        where += n;
        if (where < 0)
            where = 0;
    }
    if (where > n)
        where = n;
    items = self->ob_item;
    for (i = n; --i >= where; )
        items[i+1] = items[i];
    Py_INCREF(v);
    items[where] = v;
    return 0;
}

The slice call is done by the PyList_SetSlice() function:

int
PyList_SetSlice(PyObject *a, Py_ssize_t ilow, Py_ssize_t ihigh, PyObject *v)
{
    if (!PyList_Check(a)) {
        PyErr_BadInternalCall();
        return -1;
    }
    return list_ass_slice((PyListObject *)a, ilow, ihigh, v);
}

The optimised implementation is done in:

static int
list_ass_slice(PyListObject *a, Py_ssize_t ilow, Py_ssize_t ihigh, PyObject *v)

The insertion is done in the following piece of code:

else if (d > 0) { /* Insert d items */
    k = Py_SIZE(a);
    if (list_resize(a, k+d) < 0)
        goto Error;
    item = a->ob_item;
    memmove(&item[ihigh+d], &item[ihigh],
        (k - ihigh)*sizeof(PyObject *));
}
for (k = 0; k < n; k++, ilow++) {
    PyObject *w = vitem[k];
    Py_XINCREF(w);
    item[ilow] = w;
}

Hope it helps!