Python setuptools/distutils custom build for the `

Posted 2019-02-05 20:36

Question:

Preamble: Python setuptools is used for the package distribution. I have a Python package (let us call it my_package) that has several extras_require packages. Everything worked just fine (installation and build of the package, as well as the extras, if requested), as all extras_require entries were Python packages themselves and pip resolved everything correctly. A simple pip install my_package worked like a charm.

Setup: Now, for one of the extras (let us call it extra1) I need to call a binary of a non-python library X.

Module X itself (source code) was added to the my_package codebase and is included in the my_package distribution. Sadly for me, to be used, X first needs to be compiled into a binary on the target machine (it is a C++ implementation; I assume such compilation should happen at the build stage of the my_package installation). The X library has a Makefile that supports compilation on different platforms, so all that is needed is to run make in the respective directory of the X library inside my_package while the build process is running.

Question #1: how to run a terminal command (i.e., make in my case) during the build process of the package, using setuptools/distutils?

Question #2: how to ensure, that such terminal command is executed only if the corresponding extra1 is specified during the installation process?

Example:

  1. If someone runs pip install my_package, no such additional compilation of library X shall happen.
  2. If someone runs pip install my_package[extra1], module X needs to be compiled, so the corresponding binary would be created and available on the target machine.

Answer 1:

This question came back to haunt me long after I commented on it two years ago! I had almost the same problem myself recently, and I found the documentation VERY scarce, as I think most of you must have experienced. So I tried to research a bit of the source code of setuptools and distutils to see if I could find a more or less standard approach to both the questions you asked.


The first question you asked

Question #1: how to run a terminal command (i.e., make in my case) during the build process of the package, using setuptools/distutils?

has many possible approaches, and all of them involve setting a cmdclass when calling setup. The cmdclass parameter of setup must be a mapping from command names, which are executed depending on the build or install needs of the distribution, to classes that inherit from the distutils.cmd.Command base class (as a side note, setuptools.Command is derived from distutils' Command class, so you can derive directly from the setuptools implementation).

The cmdclass allows you to define any command name, like what ayoon did, and then execute it explicitly by calling pip install --install-option="customcommand" . (or python setup.py customcommand) from the command line. The problem with this is that it is not a standard command that will be executed when installing a package through pip or by calling python setup.py install. The standard way to approach this is to check which commands setup will try to execute in a normal install and then overload that particular cmdclass.

From looking into setuptools.setup and distutils.core.setup, setup will run the commands it finds on the command line, which let's assume is just a plain install. In the case of setuptools.setup, this triggers a series of tests that decide whether to resort to a simple call to the distutils install command class, and if this does not occur, it will attempt to run bdist_egg. In turn, this command does many things but crucially decides whether to call the build_clib, build_py and/or build_ext commands. The distutils install command simply runs build if necessary, which also runs build_clib, build_py and/or build_ext. This means that regardless of whether you use setuptools or distutils, if it is necessary to build from source, the commands build_clib, build_py and/or build_ext will be run, so these are the ones that we will want to overload with the cmdclass of setup. The question becomes which of the three.

  • build_py is used to "build" pure python packages, so we can safely ignore it.
  • build_ext is used to build declared Extension modules that are passed through the ext_modules parameter of the call to the setup function. If we wish to overload this class, the main method that builds each extension is build_extension (defined in both setuptools' and distutils' build_ext).
  • build_clib is used to build declared libraries that are passed through the libraries parameter of the call to the setup function. In this case, the main method that we should overload in our derived class is build_libraries (also defined in distutils' build_clib).

I'll share an example package that builds a toy C static library through a Makefile by using the setuptools build_ext command. The approach can be adapted to the build_clib command, but you'll have to check out the source code of build_clib.build_libraries.

setup.py

import os, subprocess
import setuptools
from setuptools.command.build_ext import build_ext
from distutils.errors import DistutilsSetupError
from distutils import log as distutils_logger


extension1 = setuptools.extension.Extension('test_pack_opt.test_ext',
                    sources = ['test_pack_opt/src/test.c'],
                    libraries = [':libtestlib.a'],
                    library_dirs = ['test_pack_opt/lib/'],
                    )

class specialized_build_ext(build_ext, object):
    """
    Specialized builder for testlib library

    """
    special_extension = extension1.name

    def build_extension(self, ext):

        if ext.name!=self.special_extension:
            # Handle unspecial extensions with the parent class' method
            super(specialized_build_ext, self).build_extension(ext)
        else:
            # Handle special extension
            sources = ext.sources
            if sources is None or not isinstance(sources, (list, tuple)):
                raise DistutilsSetupError(
                       "in 'ext_modules' option (extension '%s'), "
                       "'sources' must be present and must be "
                       "a list of source filenames" % ext.name)
            sources = list(sources)

            if len(sources)>1:
                sources_path = os.path.commonprefix(sources)
            else:
                sources_path = os.path.dirname(sources[0])
            sources_path = os.path.realpath(sources_path)
            if not sources_path.endswith(os.path.sep):
                sources_path+= os.path.sep

            if not os.path.exists(sources_path) or not os.path.isdir(sources_path):
                raise DistutilsSetupError(
                       "in 'extensions' option (extension '%s'), "
                       "the supplied 'sources' base dir "
                       "must exist" % ext.name)

            output_dir = os.path.realpath(os.path.join(sources_path,'..','lib'))
            if not os.path.exists(output_dir):
                os.makedirs(output_dir)

            output_lib = 'libtestlib.a'

            make_command = 'make static && mv {0} {1}'.format(
                output_lib, os.path.join(output_dir, output_lib))
            distutils_logger.info('Will execute the following command with subprocess.Popen: \n{0}'.format(make_command))

            make_process = subprocess.Popen(make_command,
                                            cwd=sources_path,
                                            stdout=subprocess.PIPE,
                                            stderr=subprocess.PIPE,
                                            shell=True)
            stdout, stderr = make_process.communicate()
            distutils_logger.debug(stdout)
            # Judge success by the exit status: compilers also write warnings to stderr
            if make_process.returncode != 0:
                raise DistutilsSetupError('An ERROR occurred while running the '
                                          'Makefile for the {0} library. '
                                          'Exit status: {1}. Error output: {2}'.format(
                                              output_lib, make_process.returncode, stderr))
            # After making the library build the c library's python interface with the parent build_extension method
            super(specialized_build_ext, self).build_extension(ext)


setuptools.setup(name = 'tester',
       version = '1.0',
       ext_modules = [extension1],
       packages = ['test_pack', 'test_pack_opt'],
       cmdclass = {'build_ext': specialized_build_ext},
       )

test_pack/__init__.py

from __future__ import absolute_import, print_function

def py_test_fun():
    print('Hello from python test_fun')

try:
    from test_pack_opt.test_ext import test_fun as c_test_fun
    test_fun = c_test_fun
except ImportError:
    test_fun = py_test_fun

test_pack_opt/__init__.py

from __future__ import absolute_import, print_function
import test_pack_opt.test_ext

test_pack_opt/src/Makefile

LIBS =  libtestlib.so libtestlib.a
SRCS =  testlib.c
OBJS =  testlib.o
CFLAGS = -O3 -fPIC
CC = gcc
LD = gcc
LDFLAGS =

all: shared static

shared: libtestlib.so

static: libtestlib.a

libtestlib.so: $(OBJS)
    $(LD) -pthread -shared $(OBJS) $(LDFLAGS) -o $@

libtestlib.a: $(OBJS)
    ar crs $@ $(OBJS) $(LDFLAGS)

clean: cleantemp
    rm -f $(LIBS)

cleantemp:
    rm -f $(OBJS)  *.mod

.SUFFIXES: $(SUFFIXES) .c

%.o:%.c
    $(CC) $(CFLAGS) -c $<

test_pack_opt/src/test.c

#include <Python.h>
#include "testlib.h"

static PyObject*
test_ext_mod_test_fun(PyObject* self, PyObject* args, PyObject* keywds){
    testlib_fun();
    /* Py_RETURN_NONE increments the reference count of None before returning it */
    Py_RETURN_NONE;
}

static PyMethodDef TestExtMethods[] = {
    {"test_fun", (PyCFunction) test_ext_mod_test_fun, METH_VARARGS | METH_KEYWORDS, "Calls function in shared library"},
    {NULL, NULL, 0, NULL}
};

#if PY_VERSION_HEX >= 0x03000000
    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT,
        "test_ext",
        NULL,
        -1,
        TestExtMethods,
        NULL,
        NULL,
        NULL,
        NULL
    };

    PyMODINIT_FUNC
    PyInit_test_ext(void)
    {
        PyObject *m = PyModule_Create(&moduledef);
        if (!m) {
            return NULL;
        }
        return m;
    }
#else
    PyMODINIT_FUNC
    inittest_ext(void)
    {
        PyObject *m = Py_InitModule("test_ext", TestExtMethods);
        if (m == NULL)
        {
            return;
        }
    }
#endif

test_pack_opt/src/testlib.c

#include "testlib.h"

void testlib_fun(void){
    printf("Hello from testlib_fun!\n");
}

test_pack_opt/src/testlib.h

#ifndef TESTLIB_H
#define TESTLIB_H

#include <stdio.h>

void testlib_fun(void);

#endif

In this example, the C library that I want to build with the custom Makefile has just one function, which prints "Hello from testlib_fun!\n" to stdout. The test.c file is a simple interface between Python and this library's single function. The idea is that I tell setup that I want to build a C extension named test_pack_opt.test_ext, which has a single source file (the test.c interface) and must link against the static library libtestlib.a. The main thing is that I overload the build_ext cmdclass with specialized_build_ext(build_ext, object). The inheritance from object is only necessary on Python 2, where the distutils command classes are old-style classes, if you want to be able to call super to dispatch to parent class methods. The build_extension method takes an Extension instance as its second argument. To play nicely with other Extension instances that need the default behavior of build_extension, I check whether the extension has the name of the special one, and if it doesn't, I call the parent's build_extension method.

For the special library, I call the Makefile simply with subprocess.Popen('make static ...'). The rest of the command passed to the shell is just to move the static library to a certain default location in which the library should be found to be able to link it to the rest of the compiled extension (which is also just compiled using the super's build_extension method).
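The make-and-move step can also be written without a shell. The following is only a sketch under my own assumptions: the helper name build_static_lib and its parameters are hypothetical and not part of the package above. It passes the command as a list, moves the artifact with shutil.move instead of mv, and treats only a non-zero exit status as a failure.

```python
import os
import shutil
import subprocess


def build_static_lib(make_cmd, src_dir, lib_name, output_dir):
    """Run make_cmd inside src_dir, then move lib_name into output_dir.

    Hypothetical helper: make_cmd is a list such as ['make', 'static'].
    """
    result = subprocess.run(
        make_cmd,
        cwd=src_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Only the exit status decides success; stderr may just hold warnings.
    if result.returncode != 0:
        raise RuntimeError(
            "{0!r} failed with exit status {1}:\n{2}".format(
                make_cmd, result.returncode, result.stderr))

    os.makedirs(output_dir, exist_ok=True)
    dest = os.path.join(output_dir, lib_name)
    shutil.move(os.path.join(src_dir, lib_name), dest)
    return dest
```

Inside build_extension this could be called as build_static_lib(['make', 'static'], sources_path, output_lib, output_dir) before dispatching to the parent method.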

As you can imagine there are just sooo many ways in which you could organize this code differently, it does not make sense to list them all. I hope this example serves to illustrate how to call the Makefile, and which cmdclass and Command derived class you should overload to call make in a standard installation.


Now, onto question 2.

Question #2: how to ensure, that such terminal command is executed only if the corresponding extra1 is specified during the installation process?

This was possible with the deprecated features parameter of setuptools.setup. The standard way is to try to install the package depending on the requirements that are met. install_requires lists the mandatory requirements, and extras_require lists the optional requirements. For example, from the setuptools documentation:

setup(
    name="Project-A",
    ...
    extras_require={
        'PDF':  ["ReportLab>=1.2", "RXP"],
        'reST': ["docutils>=0.3"],
    }
)

You could force the installation of the optional required packages by calling pip install Project-A[PDF], but if for some reason the requirements for the 'PDF' extra were already satisfied beforehand, pip install Project-A would end up with the same "Project-A" functionality. This means that the way in which "Project-A" is installed is not customized for each extra specified at the command line. "Project-A" will always try to install in the same way and may end up with reduced functionality because of unavailable optional requirements.

From what I understood, this means that in order to get your module X compiled and installed only if [extra1] is specified, you should ship module X as a separate package and depend on it through extras_require. Let's imagine module X is shipped in my_package_opt; your setup for my_package should then look like

setup(
    name="my_package",
    ...
    extras_require={
        'extra1':  ["my_package_opt"],
    }
)
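With this layout, my_package itself has to cope with my_package_opt being absent at run time, much like test_pack/__init__.py does above with its try/except import. A minimal sketch of an explicit check, assuming the optional package is importable under a top-level module name; the function names here are hypothetical:

```python
import importlib.util


def optional_backend_available(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def select_backend(module_name="my_package_opt"):
    """Pick the compiled backend when the extra is installed."""
    if optional_backend_available(module_name):
        return "compiled"
    return "pure-python"
```

A user who ran pip install my_package[extra1] would get the "compiled" branch, while a plain pip install my_package would fall back to "pure-python".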

Well, I'm sorry that my answer ended up being so long, but I hope it helps. Don't hesitate to point out any conceptual or naming error, as I mostly tried to deduce this from the setuptools source code.



Answer 2:

Unfortunately, the docs are extremely scarce around the interaction between setup.py and pip, but you should be able to do something like this:

import subprocess

from setuptools import Command
from setuptools import setup


class CustomInstall(Command):

    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        subprocess.call(
            ['touch',
             '/home/{{YOUR_USERNAME}}/'
             'and_thats_why_you_should_never_run_pip_as_sudo']
        )

setup(
    name='hack',
    version='0.1',
    cmdclass={'customcommand': CustomInstall}
)

This gives you a hook into running arbitrary code with commands, and also supports a variety of custom option parsing (not demonstrated here).

Put this in a setup.py file and try this:

pip install --install-option="customcommand" .

Note that this command is executed after the main install sequence, so depending on exactly what you're trying to do, it may not work. See the verbose pip install output:

(.venv) ayoon:tmp$ pip install -vvv --install-option="customcommand" .
/home/ayoon/tmp/.venv/lib/python3.6/site-packages/pip/commands/install.py:194: UserWarning: Disabling all use of wheels due to the use of --build-options / --global-options / --install-options.
  cmdoptions.check_install_build_global(options)
Processing /home/ayoon/tmp
  Running setup.py (path:/tmp/pip-j57ovc7i-build/setup.py) egg_info for package from file:///home/ayoon/tmp
    Running command python setup.py egg_info
    running egg_info
    creating pip-egg-info/hack.egg-info
    writing pip-egg-info/hack.egg-info/PKG-INFO
    writing dependency_links to pip-egg-info/hack.egg-info/dependency_links.txt
    writing top-level names to pip-egg-info/hack.egg-info/top_level.txt
    writing manifest file 'pip-egg-info/hack.egg-info/SOURCES.txt'
    reading manifest file 'pip-egg-info/hack.egg-info/SOURCES.txt'
    writing manifest file 'pip-egg-info/hack.egg-info/SOURCES.txt'
  Source in /tmp/pip-j57ovc7i-build has version 0.1, which satisfies requirement hack==0.1 from file:///home/ayoon/tmp
Could not parse version from link: file:///home/ayoon/tmp
Installing collected packages: hack
  Running setup.py install for hack ...     Running command /home/ayoon/tmp/.venv/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-j57ovc7i-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-_8hbltc6-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/ayoon/tmp/.venv/include/site/python3.6/hack customcommand
    running install
    running build
    running install_egg_info
    running egg_info
    writing hack.egg-info/PKG-INFO
    writing dependency_links to hack.egg-info/dependency_links.txt
    writing top-level names to hack.egg-info/top_level.txt
    reading manifest file 'hack.egg-info/SOURCES.txt'
    writing manifest file 'hack.egg-info/SOURCES.txt'
    Copying hack.egg-info to /home/ayoon/tmp/.venv/lib/python3.6/site-packages/hack-0.1-py3.6.egg-info
    running install_scripts
    writing list of installed files to '/tmp/pip-_8hbltc6-record/install-record.txt'
    running customcommand
done
  Removing source in /tmp/pip-j57ovc7i-build
Successfully installed hack-0.1
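
The custom option parsing mentioned above is not demonstrated in the answer. A minimal sketch of what it could look like, assuming setuptools is installed; the MakeCommand class and its make-dir and target options are hypothetical names chosen for illustration:

```python
import subprocess

from setuptools import Command


class MakeCommand(Command):
    """Hypothetical command that runs `make <target>` in a directory."""

    description = "run make in a source directory"
    # Each entry is (long name, short name, help text); a trailing '='
    # on the long name means the option takes a value.
    user_options = [
        ("make-dir=", "d", "directory that contains the Makefile"),
        ("target=", "t", "make target to build"),
    ]

    def initialize_options(self):
        # Every option needs an attribute; '-' in the name becomes '_'.
        self.make_dir = None
        self.target = None

    def finalize_options(self):
        # Fill in defaults for options that were not given.
        if self.make_dir is None:
            self.make_dir = "."
        if self.target is None:
            self.target = "all"

    def run(self):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(["make", self.target], cwd=self.make_dir)
```

Registered with cmdclass={'make': MakeCommand}, it could be invoked as python setup.py make --target=static.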