Limitations of PyTuple_SetItem

2020-08-13 05:59发布

问题:

I have a Python extension module which creates a tuple as an attribute of another object, and sets items in the tuple. Whenever I execute this module in Python, I keep getting the error SystemError: bad argument to internal function

After reading over the docs for PyTuple, and debugging my program for a few hours, I still couldn't figure out what the hell was going on. Running my program through a debugger indicated the problem was occurring within a library call inside the Python interpreter. So, finally, I looked at the Python source code, and at long last I realized the problem. The PyTuple_SetItem function has an interesting restriction which I didn't know about, and can't find explicitly documented.

Here is the important function in the Python source (edited for clarity):

int PyTuple_SetItem(register PyObject *op, register Py_ssize_t i, PyObject *newitem)
{
    .....
    if (!PyTuple_Check(op) || op->ob_refcnt != 1) {
        Py_XDECREF(newitem);
        PyErr_BadInternalCall();
        return -1;
    }
    .....
}

The important line here is the condition op->ob_refcnt != 1. So here's the problem: you can't even call PyTuple_SetItem unless the Tuple has a ref-count of 1. It looks like the idea here is that you're never supposed to use PyTuple_SetItem except right after you create a tuple using PyTuple_New(). I guess this makes sense, since Tuples are supposed to be immutable, after all, so this restriction helps keep your C code more in line with the abstractions of the Python type system.

However, I can't find this restriction documented anywhere. Relevant docs seem to be here and here, neither of which specify this restriction. The docs basically say that when you call PyTuple_New(X), all items in the tuple are initialized to NULL. Since NULL is not a valid Python value, it's up to the extension-module programmer to make sure that all slots in the Tuple are filled in with proper Python values before returning the Tuple to the interpreter. But it doesn't say anywhere that this must be done while the Tuple object has a ref count of 1.

So now, the problem is that I've basically coded myself into a corner because I wasn't aware of this (undocumented?) restriction on PyTuple_SetItem. My code is structured in such a way that it's very inconvenient to insert items into the tuple until after the tuple itself has become an attribute of another object. So, when it comes time to fill in the items in the tuple, the tuple already has a higher ref count.

I'll probably have to restructure my code, but I was seriously considering just temporarily setting the ref count on the Tuple to 1, inserting the items, and then restoring the original ref count. Of course, that's a horrible hack, I know, and not any kind of permanent solution. Regardless, I'd like to know if the requirement regarding the ref count on the Tuple is documented anywhere. Is it just an implementation detail of CPython, or is it something that API users can rely on as expected behavior?

回答1:

I'm quite sure that you can get around the restrictions by using PyTuple_SET_ITEM instead of PyTuple_SetItem. PyTuple_SET_ITEM is a macro defined in tupleobject.h as follows:

#define PyTuple_SET_ITEM(op, i, v) (((PyTupleObject*)(op))->ob_item[i] = v

So, if you are absolutely, definitely and utterly sure that:

  1. op is a tuple object
  2. you haven't initialized slot i in the tuple so far
  3. you own a reference to v and you want to let the tuple steal it and
  4. there is no chance of another Python object using the tuple for anything before you call PyTuple_SET_ITEM

then I guess you are safe to use PyTuple_SET_ITEM.



回答2:

The Python C API is very underdocumented and I would not be surprised if this restriction wasn't mentioned anywhere.

Of course, you should never be modifying tuples once something has gotten a hold of them regardless; either pass in the elements you need to put in the tuple, or use a list instead.