Alternative implementations of python/setuptools e

Posted 2019-03-08 22:56

Question:

While this question comes from a python background, it is not tied to python itself; it is about extension mechanisms and how to register and look up plugins.

In python, the concept of entrypoints was introduced by setuptools, and is tied to the metadata of installed python distributions (called packages in other packaging systems).

As I understand it, one of the features provided by entrypoints is to allow an application to define a place where others can put things, so that any application wanting to use an entrypoint can get the list of classes/functions registered there. Let's take an example:

  • Foo defines the entrypoint "entrypoint1" and looks for plugins registered under this name.
  • Bar registers a callable (Bar.callable) on the "entrypoint1" entrypoint.
  • Any python script is then able to list Bar.callable as one of the registered callables for "entrypoint1".

With setuptools, applications register entrypoints at install time, and the information is stored in the packaging metadata, called .egg-info (which usually contains the distribution name, its dependencies, and some more metadata about packaging).
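To make the storage concrete, here is a minimal sketch that reads entry points with the standard-library importlib.metadata (Python 3.8+) instead of setuptools' own pkg_resources. The distribution bar_demo, its module, and the entry point name bar_callable are hypothetical, and the metadata directory is written by hand (as a .dist-info directory, the newer counterpart of .egg-info) only to show that the registration lives in an entry_points.txt file next to the packaging metadata.

```python
import importlib.metadata
import pathlib
import sys
import tempfile


def make_fake_distribution(root):
    """Write a hypothetical 'bar_demo' distribution: one module plus metadata."""
    root = pathlib.Path(root)
    (root / "bar_demo.py").write_text(
        "def callable():\n    return 'hello from bar'\n"
    )
    info = root / "bar_demo-1.0.dist-info"
    info.mkdir()
    (info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: bar_demo\nVersion: 1.0\n"
    )
    # This file is where the entrypoint registration is actually stored.
    (info / "entry_points.txt").write_text(
        "[entrypoint1]\nbar_callable = bar_demo:callable\n"
    )


def find_entry_points(group, search_path):
    """Scan the metadata of every distribution found on search_path."""
    found = []
    for dist in importlib.metadata.distributions(path=list(search_path)):
        for ep in dist.entry_points:
            if ep.group == group:
                found.append(ep)
    return found


tmp = tempfile.mkdtemp()
make_fake_distribution(tmp)
sys.path.insert(0, tmp)  # needed so that ep.load() can import bar_demo

plugins = find_entry_points("entrypoint1", [tmp])
# Load each registered callable and call it.
results = {ep.name: ep.load()() for ep in plugins}
print(results)
```

Note that the lookup walks the metadata of every distribution on the search path, which is the run-time cost discussed further down.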

I have the feeling that packaging metadata is not the right place to store this kind of information, as I don't get why this information is tied to packaging.

I'm curious to hear about such entrypoints/extensions/plugins features in other languages, and especially if the concept is tied to metadata and packaging or not. And so the question is…

Do you have examples I should look at? Could you explain why the design choices have been made that way?

Can you see different ways to handle this problem? Do you know how this problem has already been solved in different tools? What are the downsides and the advantages of the current python implementation over others?


What I've found so far

In different projects I've found ways to create and distribute "plugins", but they mostly deal with the "how do we make plugins" part.

For instance, libpeas (the gobject plugin framework) defines a set of ways to extend a default behavior by specifying plugins. While this is interesting, I am only interested in the "registering and finding" (and possibly loading) part of it.

Here are some of my findings so far:

Libpeas defines its own metadata file (*.plugin), which stores information about the type of the callable (it is possible to have different plugins in different languages). The main information here is the name of the module to load.
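As a rough illustration, a *.plugin file is an INI-style key file, so its registration data can be read with python's configparser. The plugin shown (helloworld) is hypothetical, and falling back to the "C" loader when no Loader key is given is my assumption about libpeas' default; this sketch is not how libpeas itself parses the file.

```python
import configparser

# Hypothetical content of a helloworld.plugin file.
PLUGIN_FILE = """\
[Plugin]
Module=helloworld
Loader=python3
Name=Hello World
Description=Example plugin
"""


def read_plugin_info(text):
    """Extract the fields needed to locate and load the plugin."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as "Module"
    parser.read_string(text)
    section = parser["Plugin"]
    return {
        "module": section["Module"],            # the module to load
        "loader": section.get("Loader", "C"),   # assumed default loader
        "name": section.get("Name", section["Module"]),
    }


info = read_plugin_info(PLUGIN_FILE)
print(info)
```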

Maven has a design document describing how things are managed there. Maven manages plugins along with their dependencies and metadata, so it seems like an interesting place to look at how they implemented things.

As specified in their documentation, maven plugins use annotations (@goal) on classes, which are then used to find all the plugins registered for a particular @goal. While this approach works in static languages, it does not in interpreted languages, where we only know the set of classes / callables that exist at a given point in time, and that set may change.
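The closest python analogue I can think of is a decorator that records classes in a registry. The sketch below is not Maven's mechanism, and the names goal, plugins_for and CompilePlugin are made up for illustration. It shows the limitation described above: a class is only known to the registry once the code defining it has been executed.

```python
# Hypothetical in-process registry: goal name -> list of registered classes.
_REGISTRY = {}


def goal(name):
    """Decorator that registers the decorated class under a goal name."""
    def decorator(obj):
        _REGISTRY.setdefault(name, []).append(obj)
        return obj
    return decorator


def plugins_for(name):
    """Return the classes registered so far for a goal."""
    return list(_REGISTRY.get(name, []))


# Nothing is registered until the class definition below has run.
before = plugins_for("compile")


@goal("compile")
class CompilePlugin:
    def execute(self):
        return "compiling"


after = plugins_for("compile")
print(before, after)
```

In a real application the decorated classes would live in separate modules, so something still has to import those modules before the lookup, which is exactly the discovery problem entrypoints try to solve.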

Mercurial uses a central configuration file (~/.hgrc), containing a mapping from the name of each plugin to the path where it can be found.
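A minimal sketch of this style of registration, assuming an [extensions] section that maps names to file paths. The extension hello and the helper load_extensions are hypothetical, and this is not Mercurial's actual loading code.

```python
import configparser
import importlib.util
import pathlib
import tempfile


def load_extensions(config_text):
    """Load every module listed as 'name = /path/to/file.py' under [extensions]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    loaded = {}
    for name, path in parser["extensions"].items():
        # Nothing is discovered: the path is read straight from the config.
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        loaded[name] = module
    return loaded


# Create a throwaway extension file and a config that points at it.
tmp = pathlib.Path(tempfile.mkdtemp())
ext_file = tmp / "hello.py"
ext_file.write_text("def greet():\n    return 'hello from extension'\n")
config = f"[extensions]\nhello = {ext_file}\n"

extensions = load_extensions(config)
print(extensions["hello"].greet())
```

Since each module is loaded from an explicit path, nothing has to be on the python path and no packaging metadata is involved.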


Some more thoughts

While that's not an answer to this question, it is also interesting to note how setuptools entrypoints are implemented, and how they compare, in terms of performance, with the mercurial ones.

When you ask for a particular entry point with setuptools, all the metadata is read at run time and a list is built from it. This reading can take some time if you have a lot of python distributions on your path. Mercurial, on the other hand, hardcodes this information in a single file, which means you have to specify the complete path to your callable there; registered callables are then not "discovered" but "read" directly from the configuration file. This allows finer-grained configuration of what should and should not be available, and seems faster.

On the other hand, as the python path can be changed at runtime, callables provided in such a way would have to be checked against the path in order to know, in all situations, whether they should be returned or not.
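Such a check could look like the sketch below, under the simplifying assumption that each registered callable lives in a top-level module file, so it is importable only if its directory is on the path. The helper visible_on_path and the example paths are hypothetical.

```python
import os
import sys


def visible_on_path(registered, search_path=None):
    """Keep only the entries whose module directory is on the search path."""
    search_path = sys.path if search_path is None else search_path
    # An empty entry in sys.path means the current directory.
    roots = {os.path.abspath(p or os.curdir) for p in search_path}
    visible = {}
    for name, file_path in registered.items():
        file_path = os.path.abspath(file_path)
        if os.path.dirname(file_path) in roots:
            visible[name] = file_path
    return visible


# Hypothetical registrations read from a central configuration file.
registered = {
    "a": os.path.join(os.sep, "opt", "plugins", "a.py"),
    "b": os.path.join(os.sep, "srv", "other", "b.py"),
}
search_path = [os.path.join(os.sep, "opt", "plugins")]
print(visible_on_path(registered, search_path))
```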


Why entrypoints are currently tied to packaging

It is also interesting to understand why entrypoints are tied to packaging in setuptools. The main reason seems to be that it is useful for python distributions to register parts of themselves as extending an entrypoint at installation time: installing then also means registering the entrypoints, with no need for an extra registration step.

While this works fairly well in a large majority of cases (when python distributions are actually being installed) it doesn't when they are not installed or simply not packaged. In other words, from what I understand, you can't register an entrypoint at runtime, without having an .egg-info file.

Answer 1:

As one possible idea, you can look at the OSGi concept, which is used for plugin system management in Eclipse. It could be overkill for your specific case, but it is definitely a source of inspiration.



Answer 2:

As you started out speaking about programming language implementation / processing, it may be worth noting that recently gcc acquired plugin capabilities too.



Answer 3:

In python, "PYTHONPATH" is an environment variable that defines the places where "imports" will look for pieces of code.

If you have a

 PYTHONPATH=/home/user/tests

all subdirs in

 /home/user/tests 

have their own

 __init__.py 

file, listing the modules that can be found there.

For example : in the

 /home/user/tests/__init__.py

you have

 __all__ = ["computeSomething", "computeSomethingElse"]

referring to the existing files :

 /home/user/tests/computeSomething.py
 /home/user/tests/computeSomethingElse.py

then, in python, you can immediately

 import computeSomethingElse

or

 from computeSomethingElse import myObjectFromSomethingElse

I hope this helps.