I load Python dynamically with dlopen and RTLD_LOCAL to avoid collisions with another library that happens to contain a few symbols with the same names. Executing my MVCE below on macOS with Xcode fails because it expects _PyBuffer_Type in the global namespace.
Traceback (most recent call last):
File "...lib/python2.7/ctypes/__init__.py", line 10, in <module>
from _ctypes import Union, Structure, Array
ImportError: dlopen(...lib/python2.7/lib-dynload/_ctypes.so, 2):
Symbol not found: _PyBuffer_Type
Referenced from: ...lib/python2.7/lib-dynload/_ctypes.so
Expected in: flat namespace
in ...lib/python2.7/lib-dynload/_ctypes.so
Program ended with exit code: 255
But why? Does RTLD_LOCAL override the two-level namespace?
I used otool -hV to check that _ctypes.so was compiled with the two-level namespace option. From my understanding, symbol resolution then needs the library name plus the symbol name itself. Why does it expect _PyBuffer_Type in the flat namespace, and/or why can't it find it? Note the TWOLEVEL flag at the end of the output:
> otool -hV /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_ctypes.so
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC_64 X86_64 ALL 0x00 BUNDLE 14 1536 NOUNDEFS DYLDLINK TWOLEVEL
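The check that otool -hV performs here can also be sketched in a few lines of C++ by reading the flags word of the 64-bit Mach-O header directly. This is only a minimal sketch, assuming a thin (non-fat) 64-bit file read on a little-endian host. The two constants repeat the values from <mach-o/loader.h> so that the snippet compiles without macOS headers, and NamespaceKind and classify_namespace are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Values as defined in <mach-o/loader.h>, repeated here so the sketch
// does not depend on macOS headers.
constexpr std::uint32_t kMhMagic64  = 0xfeedfacf; // MH_MAGIC_64
constexpr std::uint32_t kMhTwoLevel = 0x80;       // MH_TWOLEVEL

// Same layout as struct mach_header_64: eight 32-bit fields.
struct MachHeader64 {
    std::uint32_t magic;
    std::int32_t  cputype;
    std::int32_t  cpusubtype;
    std::uint32_t filetype;
    std::uint32_t ncmds;
    std::uint32_t sizeofcmds;
    std::uint32_t flags;
    std::uint32_t reserved;
};

// Hypothetical result type for this sketch.
enum class NamespaceKind { NotMachO64, Flat, TwoLevel };

// Inspects the first bytes of a file image. Assumes the header is stored
// in host byte order (true for x86_64 binaries on an x86_64 machine).
NamespaceKind classify_namespace(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr);
    MachHeader64 header;
    if (size < sizeof header)
        return NamespaceKind::NotMachO64;
    std::memcpy(&header, data, sizeof header);
    if (header.magic != kMhMagic64)
        return NamespaceKind::NotMachO64;
    return (header.flags & kMhTwoLevel) != 0 ? NamespaceKind::TwoLevel
                                             : NamespaceKind::Flat;
}
```

Feeding it the first 32 bytes of _ctypes.so should agree with the otool output above, since TWOLEVEL is just one bit in the flags field.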
Any idea what's going on here?
MVCE
Can be copied to a new Xcode project, simply compile and execute.
#include </System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7/Python.h>
#include <dlfcn.h>

int main(int argc, const char * argv[])
{
    // RTLD_LOCAL keeps the Python symbols out of the global namespace
    auto* dl = dlopen("/System/Library/Frameworks/Python.framework/Versions/2.7/Python", RTLD_LOCAL | RTLD_NOW);
    if (dl == nullptr)
        return 0;

    // Load is just a macro to hide dlsym(..)
    #define Load(name) ((decltype(::name)*)dlsym(dl, # name))

    Load(Py_SetPythonHome)("/System/Library/Frameworks/Python.framework/Versions/2.7");
    Load(Py_Initialize)();

    // Importing ctypes makes Python dlopen _ctypes.so, which is where it fails
    auto* ctypes = Load(PyImport_ImportModule)("ctypes");
    if (ctypes == nullptr)
    {
        Load(PyErr_Print)();
        dlclose(dl);
        return -1;
    }

    Py_DECREF(ctypes);
    Load(Py_Finalize)();
    return 0;
}
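A side note on the Load macro: dlsym returns a null pointer when a symbol is missing, and the macro would then call straight through it. A checked lookup could look roughly like the sketch below. It assumes nothing beyond POSIX dlfcn.h, and load_symbol is a hypothetical helper, not part of the MVCE.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: looks up `name` in `handle` and returns it as a
// pointer to function type Fn, throwing if the lookup fails.
template <typename Fn>
Fn* load_symbol(void* handle, const char* name)
{
    assert(name != nullptr);
    dlerror(); // clear any stale error state before the lookup
    void* address = dlsym(handle, name);
    if (address == nullptr) {
        const char* reason = dlerror();
        throw std::runtime_error(std::string("dlsym(") + name + ") failed: " +
                                 (reason ? reason : "symbol resolved to null"));
    }
    // Converting void* to a function pointer is what dlsym users rely on
    // in practice on POSIX systems.
    return reinterpret_cast<Fn*>(address);
}
```

With it, a call from the MVCE would read load_symbol<decltype(Py_Initialize)>(dl, "Py_Initialize")(); and a missing symbol surfaces as an exception instead of a crash.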
This question and your related RTLD_GLOBAL question both concern the semantics of the dynamic loader resolving undefined symbols in the shared libraries that it loads. I was hoping to find an explicit documentation reference that would explain what you're seeing, but I've not been able to do it. Nonetheless, I can make an observation that may explain what's happening.
If we run with verbosity, we can see that the python library is attempting to load two shared libraries before it fails:
Given that the first one succeeds, we know that generally the dynamic loader is resolving the undefined symbols against the namespace of the calling library. And in fact, as you note in the comments of your other question, this even works when there are two versions of the python library, i.e. the dlopen()s done by the python libraries resolve against their respective namespaces. So far, this sounds like exactly what you want. But why is _ctypes.so failing to load?
We know that _PyModule_GetDict is the symbol that was causing _locale.so to fail to load in your other question, and that it obviously works here. We also know that the symbol _PyBuffer_Type is failing here. What's the difference between these two symbols? Looking them up in the python library:
_PyModule_GetDict is a Text (code) symbol, whereas _PyBuffer_Type is a Data symbol.
Therefore, based on this empirical data, I suspect the dynamic loader will resolve undefined symbols against RTLD_LOCAL code symbols of the calling library, but not RTLD_LOCAL data symbols. Perhaps somebody can point to an explicit reference.