Understand foreign function interface (FFI) and la

Posted 2019-03-08 08:29

Question:

Mixing different programming languages has long been something I don't quite understand. According to this Wikipedia article, a foreign function interface (or FFI) can be done in several ways:

  1. Requiring that guest-language functions which are to be host-language callable be specified or implemented in a particular way; often using a compatibility library of some sort.
  2. Use of a tool to automatically "wrap" guest-language functions with appropriate glue code, which performs any necessary translation.
  3. Use of wrapper libraries
  4. Restricting the set of host language capabilities which can be used cross-language. For example, C++ functions called from C may not (in general) include reference parameters or throw exceptions.

My questions:

  1. What are the differences between the 1st, 2nd and 3rd ways? It seems to me that all of them compile the code of the called language into a library with object files and header files, which the calling language then calls.
  2. One source it links to says that an FFI can be implemented in several ways:

    • Requiring the called functions in the target language implement a specific protocol.
    • Implementing a wrapper library that takes a given low-language function, and "wraps" it with code to do data conversion to/from the high-level language conventions.
    • Requiring functions declared native to use a subset of high-level functionality (which is compatible with the low-level language).

    I was wondering if the first way in the linked source is the same as the first way in Wikipedia?

    What does the third way in this source mean? Does it correspond to the 4th way in Wikipedia?

  3. In the same source, when comparing the three ways it lists, it seems to say the job of filling the gap between the two languages is gradually shifted from the called language to the calling language. I was wondering how to understand that? Is this shifting also true for the four ways in Wikipedia?
  4. Are language bindings and FFIs equivalent concepts? How are they related, and how do they differ?

    a binding from a programming language to a library or OS service is an API providing that service in the language.

  5. I was wondering which way in the quotation from Wikipedia or from the source each of the following examples belongs to?

    • Common Object Request Broker Architecture (CORBA)
    • Calling C in C++, by the extern "C" declaration in C++ to disable name mangling.
    • Calling C in Matlab, by MATLAB Interface to Shared Libraries, i.e., first compiling C code to shared library via general C compiler such as gcc, and then loading, calling a function from and unloading the shared library via Matlab functions loadlibrary(), calllib() and unloadlibrary().
    • Calling C in Matlab, by Creating C/C++ Language MEX-Files
    • Calling Matlab in C, by mcc compiler
    • Calling C++ in Java, by JNI, and Calling Java in C++, also by JNI
    • Calling C/C++ in other languages, Using SWIG
    • Calling C in Python, by Ctypes module.
    • Cython
    • Calling R in Python, by RPy
    • Programming Language Bindings to OpenGL from various languages, such as Python, Fortran and Java
    • Bindings for a C library, such as Cairo, from various languages, such as C++, Python, Java, Common Lisp

Thanks for your enlightenment! Best regards!

Answer 1:

Maybe a specific example will help. Let us take the host language as Python and the guest language as C. This means that Python will be calling C functions.

  1. The first option is to write the C library in a particular way. In the case of Python, the standard way is to write the C function with a first parameter of type PyObject *, among other conditions. For example (from here):

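    #define PY_SSIZE_T_CLEAN
    #include <Python.h>

    /* Python-callable wrapper around system(): takes one string argument. */
    static PyObject *
    spam_system(PyObject *self, PyObject *args)
    {
        const char *command;
        int sts;

        /* Unpack the Python argument tuple into a C string; on failure an
           exception is already set, so return NULL to propagate it. */
        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        /* Convert the C int exit status back into a Python integer. */
        return Py_BuildValue("i", sts);
    }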
    

    is a C function callable from Python. For this to work the library has to be written with Python compatibility in mind.

  2. If you want to use an already existing C library, you need another option. One is to use a tool that wraps the existing library in a format suitable for consumption by the host language. Take SWIG, which can be used to connect many languages. Given an existing C library, you can use SWIG to generate C code that calls your existing library while conforming to Python conventions. See the example for building a Python module.

  3. Another option for using an already existing C library is to call it from a Python library that wraps the calls at run time, like ctypes. While compilation was necessary in option 2, it is not needed here.
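
As a rough illustration of option 3, here is a minimal sketch that calls the C standard library's strlen from Python through ctypes, with no glue code compiled. It assumes a POSIX system where the C library can be located at run time; the helper names load_libc and c_strlen are made up for this example.

```python
import ctypes
import ctypes.util


def load_libc():
    """Load the C standard library (assumption: a POSIX system)."""
    path = ctypes.util.find_library("c")
    # If the library cannot be located by name, CDLL(None) falls back to
    # the symbols already loaded into the current process (POSIX only).
    return ctypes.CDLL(path) if path else ctypes.CDLL(None)


libc = load_libc()

# Describe the C signature `size_t strlen(const char *)` so that ctypes
# converts the argument and the return value correctly at call time.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Hypothetical helper: length in bytes of `text`, computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("hello"))  # prints 5
```

The data conversion happens on the Python side when the call is made, which is why the C library itself needs no Python-specific code.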

Another thing is that there are a lot of options (which do overlap) for calling functions in one language from another language. There are FFIs (equivalent to language bindings as far as I know), which usually refer to calls between multiple languages in the same process (as part of the same executable, so to speak), and there are interprocess communication mechanisms (local and network). Things like CORBA, Web Services (SOAP or REST), COM+ and remote procedure calls in general fall into the second category and are not seen as FFIs. In fact, they mostly don't prescribe any particular language at either end of the communication. I would loosely classify them as IPC (interprocess communication) options, though this is a simplification in the case of network-based APIs like CORBA and SOAP.
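
To make the contrast with an in-process FFI call concrete, here is a minimal sketch of the interprocess style. The caller never shares memory or a call stack with the callee; it only exchanges serialized messages, so the callee could be written in any language. The JSON message shape and the names CHILD_SOURCE and remote_sum are assumptions made for this example; the child happens to be Python only to keep the sketch self-contained.

```python
import json
import subprocess
import sys

# Source of the "remote" side. It reads a JSON request from stdin and
# writes a JSON reply to stdout; nothing here is specific to the caller.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"result": sum(request["values"])}, sys.stdout)
"""


def remote_sum(values):
    """Hypothetical helper: ask a separate process to add up `values`."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps({"values": values}),
        capture_output=True,
        text=True,
        check=True,
    )
    # The reply crosses the process boundary as text and is decoded here.
    return json.loads(completed.stdout)["result"]


if __name__ == "__main__":
    print(remote_sum([1, 2, 3]))  # prints 6
```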

Having a go at your list, I would venture the following opinions:

  • Common Object Request Broker Architecture: IPC, not FFI
  • Calling C in C++, by the extern "C" declaration in C++ to disable name mangling. ****
  • Calling C in Matlab, by MATLAB Interface to Shared Libraries: Option 3 (ctypes-like)
  • Calling C in Matlab, by Creating C/C++ Language MEX-Files: Option 2 (swig-like)
  • Calling Matlab in C, by mcc compiler: Option 2 (swig-like)
  • Calling C++ in Java, by JNI, and Calling Java in C++ by JNI: Option 3 (ctypes-like)
  • Calling C/C++ in other languages, Using SWIG: Option 2 (swig)
  • Calling C in Python, by Ctypes: Option 3 (ctypes)
  • Cython: Option 2 (swig-like)
  • Calling R in Python, by RPy: Option 3 (ctypes-like) in part, and partly about data exchange (not FFI)

The next two are not foreign function interfaces at all, as the term is used. An FFI is about the interaction between two programming languages and should be capable of making any library (with suitable restrictions) from one language available to the other. A particular library being accessible from one language does not an FFI make.

  • Programming Language Bindings to OpenGL from various languages
  • Bindings for a C library from various languages