I am trying to find the implementation of the built-in in operator in the (C) Python source code. I have searched in the built-in functions source code, bltinmodule.c, but cannot find the implementation of this operator. Where can I find this implementation?
My goal is to improve the sub-string search in Python by extending different C implementations of this search, although I am not sure if Python already uses the idea I have.
To find the implementation of any Python operator, first find out what bytecode Python generates for it, using the dis.dis function:
>>> import dis
>>> dis.dis("'0' in ()")
  1           0 LOAD_CONST               0 ('0')
              2 LOAD_CONST               1 (())
              4 COMPARE_OP               6 (in)
              6 RETURN_VALUE
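The opcode name depends on the interpreter version: the listing here shows COMPARE_OP, while CPython 3.9 and later emit a dedicated CONTAINS_OP for membership tests. The following is a minimal sketch for checking what your own interpreter generates. The helper name opcodes_for and the variable names needle and haystack are made up for illustration; names are used instead of constants so the compiler has nothing to fold away.

```python
import dis


def opcodes_for(source):
    """Return the opcode names CPython generates for a piece of source code."""
    return [ins.opname for ins in dis.get_instructions(source)]


# Hypothetical names; they are only compiled, never evaluated.
names = opcodes_for("needle in haystack")

# Keep only the opcodes that can carry a membership test.
membership_ops = [n for n in names if n in ("COMPARE_OP", "CONTAINS_OP")]
print(membership_ops)
```

Whichever name is printed is the opcode to search for in Python/ceval.c for that version.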
In the dis output, the in operator compiles to a COMPARE_OP bytecode. Now you can trace how this opcode is handled in the Python evaluation loop in Python/ceval.c:
TARGET(COMPARE_OP) {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *res = cmp_outcome(oparg, left, right);
    Py_DECREF(left);
    Py_DECREF(right);
    SET_TOP(res);
    if (res == NULL)
        goto error;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
}
cmp_outcome() is defined in the same file, and the in operator is handled by one of the cases in its switch statement:
case PyCmp_IN:
    res = PySequence_Contains(w, v);
    if (res < 0)
        return NULL;
    break;
A quick grep shows us where PySequence_Contains is defined, in Objects/abstract.c:
int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL)
        return (*sqm->sq_contains)(seq, ob);
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
For types implemented in C, PySequence_Contains thus uses the sq_contains slot of the type's PySequenceMethods structure if that slot is set, and falls back to an iterative search otherwise.
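The same two-step dispatch can be observed from Python. The sketch below is only a rough Python-level analogue of PySequence_Contains, not the real implementation: it assumes that looking up __contains__ on the type stands in for the sq_contains slot, and that a plain loop stands in for _PySequence_IterSearch. The names sequence_contains and Countdown are invented for this example.

```python
def sequence_contains(seq, ob):
    """Rough analogue of PySequence_Contains (illustrative only)."""
    contains = getattr(type(seq), "__contains__", None)
    if contains is not None:
        # Stands in for the sq_contains slot being set on the type.
        return bool(contains(seq, ob))
    # Stands in for the iterative search fallback: identity first, then equality.
    for item in seq:
        if item is ob or item == ob:
            return True
    return False


class Countdown:
    """Hypothetical class that defines only __iter__, not __contains__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


print(3 in Countdown(5))                     # True, found by iterating
print(sequence_contains(Countdown(5), 3))    # True, same path in the sketch
print(sequence_contains("haystack", "hay"))  # True, str has its own __contains__
```

Because Countdown has no __contains__, the membership test still succeeds, but only by walking the iterator, which is the slow path a dedicated slot avoids.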
For Python 3 Unicode string objects, this slot is implemented as PyUnicode_Contains in Objects/unicodeobject.c; in Python 2 you also want to check out string_contains in Objects/stringobject.c. Basically, just grep for sq_contains in the Objects/ subdirectory to find the implementations for the various Python types.
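If you do not have a source checkout at hand, a rough runtime counterpart to that grep is to check which types carry their own __contains__ entry. This relies on the assumption that CPython exposes a filled-in sq_contains slot as a __contains__ wrapper in the type's namespace; the list of types below is just a sample, not an exhaustive inventory.

```python
import types

# A sample of built-in types expected to implement the slot themselves.
builtin_types = [str, bytes, list, tuple, dict, set, frozenset, range]

has_contains = {tp.__name__: "__contains__" in vars(tp) for tp in builtin_types}
for name, present in has_contains.items():
    print(name, present)

# Generators have no such entry, so a membership test has to iterate them.
generator_has_contains = "__contains__" in vars(types.GeneratorType)
print("generator", generator_has_contains)
```

Each type that reports True here has a C function worth reading if you want to see how its membership test is implemented.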
For generic Python objects, it's interesting to note that Objects/typeobject.c defers this to the __contains__ method on custom classes, if so defined.
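A small sketch of that last case, using a made-up class named EvenNumbers: the in operator ends up calling the class's __contains__ method and converts whatever it returns to a bool.

```python
class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, item):
        # Deliberately return a non-bool; the in operator converts it to bool.
        return "yes" if isinstance(item, int) and item % 2 == 0 else ""


evens = EvenNumbers()
print(4 in evens)               # True
print(7 in evens)               # False
print(evens.__contains__(4))    # yes (the raw return value, no conversion)
```

Calling the method directly bypasses the slot machinery, which is why the raw string comes back in the last line.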