Summary
I am working on a series of add-ons for Anki, an open-source flashcard program. Anki add-ons are shipped as Python packages, with the basic folder structure looking as follows:
```
anki_addons/
    addon_name_1/
        __init__.py
    addon_name_2/
        __init__.py
```
`anki_addons` is appended to `sys.path` by the base app, which then imports each add-on with `import <addon_name>`.
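In outline, the loading mechanism amounts to something like this (the path and add-on names below are illustrative, not the actual Anki code):

```python
import importlib
import sys

# What the base app does, schematically:
sys.path.append("/path/to/anki_addons")

for addon_name in ("addon_name_1", "addon_name_2"):
    importlib.import_module(addon_name)
```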
The problem I have been trying to solve is to find a reliable way to ship packages and their dependencies with my add-ons, without polluting global state and without falling back on manual edits of the vendored packages.
Specifics
Specifically, given an add-on structure like this...
```
addon_name_1/
    __init__.py
    _vendor/
        __init__.py
        library1
        library2
        dependency_of_library2
        ...
```
...I would like to be able to import any arbitrary package that is included in the `_vendor` directory, e.g.:

```python
from ._vendor import library1
```
The main difficulty with relative imports like this is that they do not work for packages that also depend on other packages imported through absolute references (e.g. `import dependency_of_library2` in the source code of `library2`).
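To make the failure concrete, the import chain breaks like this (using the placeholder names from the tree above):

```python
# addon_name_1/__init__.py
from ._vendor import library2   # relative import of the vendored copy: fine

# addon_name_1/_vendor/library2/__init__.py (unmodified upstream source)
import dependency_of_library2   # absolute import, resolved against sys.path,
                                # where _vendor/ is not listed
                                # -> ModuleNotFoundError
```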
Solution attempts
So far I have explored the following options:
1. Manually updating the third-party packages, so that their import statements point to the fully qualified module path within my Python package (e.g. `import addon_name_1._vendor.dependency_of_library2`). But this is tedious work that neither scales to larger dependency trees nor is portable to other packages.
2. Adding `_vendor` to `sys.path` via `sys.path.insert(1, <path_to_vendor_dir>)` in my package's `__init__.py`. This works, but it introduces a global change to the module look-up path which will affect other add-ons and even the base app itself. It just seems like a hack that could open a Pandora's box of issues later down the line (e.g. conflicts between different versions of the same package).
3. Temporarily modifying `sys.path` for my imports; but this fails for third-party modules with method-level imports (see the sketch after this list).
4. Writing a PEP 302-style custom importer based on an example I found in setuptools, but I just couldn't make head nor tail of that.
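For reference, the temporary modification in option #3 looks roughly like this (a sketch; the context manager and `do_something` are made-up names), which also shows why method-level imports defeat it:

```python
import os
import sys
from contextlib import contextmanager

@contextmanager
def vendored_path():
    """Put _vendor/ on sys.path only for the duration of the block."""
    vendor_dir = os.path.join(os.path.dirname(__file__), "_vendor")
    sys.path.insert(0, vendor_dir)
    try:
        yield
    finally:
        sys.path.remove(vendor_dir)

with vendored_path():
    import library2              # resolved via _vendor/ -- works here

# But if library2 defers `import dependency_of_library2` into a function
# body, that import only runs now, after _vendor/ has been removed from
# sys.path again, and fails:
library2.do_something()          # -> ModuleNotFoundError at call time
```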
I've been stuck on this for quite a few hours now and I'm beginning to think that I'm either completely missing an easy way to do this, or that there is something fundamentally wrong with my entire approach.
Is there no way I can ship a dependency tree of third-party packages with my code, without having to resort to `sys.path` hacks or modifying the packages in question?
Edit:
Just to clarify: I don't have any control over how add-ons are imported from the `anki_addons` folder. `anki_addons` is just the directory provided by the base app that all add-ons are installed into. It is added to `sys.path`, so the add-on packages therein pretty much behave like any other Python package located in Python's module look-up paths.
First of all, I'd advise against vendoring; a few major packages did use vendoring before but have switched away to avoid the pain of having to handle it. One such example is the `requests` library. If you are relying on people using `pip install` to install your package, then just declare dependencies and tell people about virtual environments. Don't assume you need to shoulder the burden of keeping dependencies untangled, or need to stop people from installing dependencies in the global Python `site-packages` location.

At the same time, I appreciate that the plug-in environment of a third-party tool is something different, and if adding dependencies to the Python installation used by that tool is cumbersome or impossible, vendoring may be a viable option. I see that Anki distributes extensions as `.zip` files without setuptools support, so that's certainly such an environment.

So if you choose to vendor dependencies, use a script to manage your dependencies and update their imports. This is your option #1, but automated.
This is the path that the `pip` project has chosen; see their `tasks` subdirectory for their automation, which builds on the `invoke` library. See the pip project vendoring README for their policy and rationale (chief among those is that `pip` needs to bootstrap itself, i.e. have its dependencies available to be able to install anything).

You should not use any of the other options; you already enumerated the issues with #2 and #3.
The issue with option #4, using a custom importer, is that you still need to rewrite imports. Put differently, the custom importer hook used by `setuptools` doesn't solve the vendorized namespace problem at all; it instead makes it possible to dynamically import top-level packages if the vendorized packages are missing (a problem that `pip` solves with a manual debundling process). `setuptools` actually uses option #1, where they rewrite the source code for vendorized packages. See for example these lines in the `packaging` project in the `setuptools` vendored subpackage; the `setuptools.extern` namespace is handled by the custom import hook, which then redirects either to `setuptools._vendor` or the top-level name if importing from the vendorized package fails.
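For illustration, the core of such a hook, transplanted onto the add-on layout from the question, could look roughly like this (a simplified sketch, not the actual `setuptools` code):

```python
import importlib
import sys

class VendorImporter:
    """Redirect imports of `addon_name_1.extern.<name>` to the vendored
    copy in `addon_name_1._vendor.<name>`, falling back to a top-level
    `<name>` if no vendored copy exists."""

    root = "addon_name_1.extern."
    search_prefixes = ("addon_name_1._vendor.", "")

    def find_module(self, fullname, path=None):
        # Legacy PEP 302 protocol, as used by the setuptools hook.
        return self if fullname.startswith(self.root) else None

    def load_module(self, fullname):
        target = fullname[len(self.root):]
        for prefix in self.search_prefixes:
            try:
                module = importlib.import_module(prefix + target)
            except ImportError:
                continue
            sys.modules[fullname] = module
            return module
        raise ImportError(
            "cannot import %r from the vendored or top-level namespace"
            % target)

sys.meta_path.append(VendorImporter())
```

Note that this only re-routes lookups that go through the `extern` namespace; a plain `import dependency_of_library2` inside a vendored package still bypasses the hook entirely, which is why the imports have to be rewritten regardless.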
The `pip` automation to update vendored packages takes the following steps:

1. Delete everything in the `_vendor/` subdirectory except the documentation, the `__init__.py` file and the requirements text file.
2. Use `pip` to install all vendored dependencies into that directory, using a dedicated requirements file named `vendor.txt`, avoiding compilation of `.pyc` bytecache files and ignoring transient dependencies (these are assumed to be listed in `vendor.txt` already); the command used is `pip install -t pip/_vendor -r pip/_vendor/vendor.txt --no-compile --no-deps`.
3. Delete everything that was installed by `pip` but not needed in a vendored environment, i.e. `*.dist-info`, `*.egg-info`, the `bin` directory, and a few things from installed dependencies that `pip` would never use.
4. Collect all installed directories and added files sans the `.py` extension (so anything not in the whitelist); this is the `vendored_libs` list.
5. Rewrite imports; every name in `vendored_libs` is used to replace `import <name>` occurrences with `import pip._vendor.<name>` and every `from <name>(.*) import` occurrence with `from pip._vendor.<name>(.*) import`.
6. Apply some patches to mop up the remaining changes; from a vendoring perspective, only the `pip` patch for `requests` is interesting here, in that it updates the `requests` library's backwards-compatibility layer for the vendored packages that the `requests` library had removed; this patch is quite meta!

So in essence, the most important part of the `pip` approach, the rewriting of vendored package imports, is quite simple; paraphrased to simplify the logic and removing the `pip`-specific parts, it is simply the following process:
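```python
# Sketch of the process, using the add-on layout from the question in
# place of pip's own package names and helper functions.
import re
from pathlib import Path

vendor_dir = Path("addon_name_1/_vendor")
prefix = "addon_name_1._vendor"

# Top-level names living in _vendor/ (packages and single-file modules).
vendored_libs = sorted(
    p.stem for p in vendor_dir.iterdir()
    if (p.is_dir() or p.suffix == ".py") and p.stem != "__init__"
)

for py_file in vendor_dir.rglob("*.py"):
    text = py_file.read_text()
    for lib in vendored_libs:
        # import <lib>              ->  from <prefix> import <lib>
        text = re.sub(
            rf"(^\s*)import {lib}(\s*$)",
            rf"\1from {prefix} import {lib}\2",
            text, flags=re.MULTILINE,
        )
        # from <lib>[.sub] import   ->  from <prefix>.<lib>[.sub] import
        text = re.sub(
            rf"(^\s*)from {lib}(\.|\s)",
            rf"\1from {prefix}.{lib}\2",
            text, flags=re.MULTILINE,
        )
    py_file.write_text(text)
```

After such a rewrite, `from ._vendor import library1` works, because every absolute import inside the vendored tree now resolves to `addon_name_1._vendor.*`.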
The best way to bundle dependencies is to use a `virtualenv`. The Anki project should at least be able to install inside one.

I think what you are after is namespace packages: https://packaging.python.org/guides/packaging-namespace-packages/
I would imagine that the main Anki project has a `setup.py` and every add-on has its own `setup.py`, and each can be installed from its own source distribution. Then the add-ons can list their dependencies in their own `setup.py`, and pip will install them in `site-packages`.

Namespace packages only solve part of the problem, and as you said, you don't have any control over how add-ons are imported from the `anki_addons` folder. I think designing how add-ons are imported and how they are packaged goes hand in hand.
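For illustration, an individual add-on's `setup.py` could then be as small as this (hypothetical names):

```python
from setuptools import setup, find_packages

setup(
    name="addon-name-1",        # hypothetical add-on distribution
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "library1",             # pip resolves these (and their own
        "library2",             # dependencies) into site-packages
    ],
)
```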
The `pkgutil` module provides a way for the main project to discover the installed add-ons: https://packaging.python.org/guides/creating-and-discovering-plugins/
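The pattern from that guide, applied to a hypothetical `anki_addons` namespace package, looks like this:

```python
import importlib
import pkgutil

import anki_addons  # namespace package that the add-ons install into

def iter_namespace(ns_pkg):
    # Walk every module installed under the namespace package.
    return pkgutil.iter_modules(ns_pkg.__path__, ns_pkg.__name__ + ".")

discovered_addons = {
    name: importlib.import_module(name)
    for _, name, _ in iter_namespace(anki_addons)
}
```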
A project that uses this extensively is Zope (http://www.zope.org). Have a look here: https://github.com/zopefoundation/zope.interface/blob/master/setup.py
How about making your `anki_addons` folder a package and importing the required libraries into `__init__.py` in the main package folder?

So it'd be something like the sketches below.
In `anki.__init__.py`:
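```python
# Sketch: make the add-on container a regular subpackage of the app.
from . import anki_addons
```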
In `anki.anki_addons.__init__.py`:
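```python
# Sketch: import the libraries the add-ons need here, once, so every
# add-on can share them ("library1"/"library2" are placeholders).
from . import library1
from . import library2
```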
I'm new at this, so please bear with me here.