Python Packaging: Data files are put properly in t

Posted 2019-01-16 12:44

Question:

I can't properly install the project package_fiddler to my virtual environment.

I have figured out that MANIFEST.in is responsible for putting the non-.py files in Package_fiddler-0.0.0.tar.gz that is generated when executing python setup.py sdist.

Then I did:

(virt_envir)$ pip install dist/Package_fiddler-0.0.0.tar.gz

But this installed neither the data files nor the package to /home/username/.virtualenvs/virt_envir/local/lib/python2.7/site-packages.

I have tried many configurations of the setup arguments package_data, include_package_data and data_files but I seem to have used the wrong configuration each time.

Which configuration of package_data and/or include_package_data and/or data_files will properly install package_fiddler to my virtual environment?

Project tree

.
├── MANIFEST.in
├── package_fiddler
│   ├── data
│   │   ├── example.html
│   │   └── stylesheets
│   │       └── example.css
│   └── __init__.py
├── README.rst
└── setup.py

setup.py

from setuptools import setup


setup(
    name='Package_fiddler',
    entry_points={
        'console_scripts': ['package_fiddler = package_fiddler:main'],
    },
    long_description=open('README.rst').read(),
    packages=['package_fiddler',])

MANIFEST.in

include README.rst
recursive-include package_fiddler/data *

Which configurations of setup.py (with the code base above) have I tried?

Configuration 1

Adding:

package_data={"": ['package_fiddler/data/*',]}

Configuration 2

Adding:

package_data={"": ['*.html', '*.css', '*.rst']}

Configuration 3

Adding:

include_package_data=True

Configuration 4

Adding:

package_data={"": ['package_fiddler/data',]}

Removing:

packages=['package_fiddler',]

Configuration 5 (Chris's suggestion)

Adding:

package_data={"data": ['package_fiddler/data',]}

Removing:

packages=['package_fiddler',]

Configuration 6

Adding:

package_data={"": ['package_fiddler/data/*',]}

Removing:

packages=['package_fiddler',]

These configurations all result in no files at all being installed on /home/username/.virtualenvs/virt_envir/local/lib/python2.7/site-packages.

EDIT

Note to Toshio Kuratomi: For clarity, my original post used the simplest tree structure in which this problem occurs, but in reality my tree looks more like the one below. Strangely, for that tree, if I put an __init__.py only in stylesheets, all the data files in the texts folder are installed correctly as well! This baffles me.

Tree 2 (This installs all data files properly somehow!!)

.
├── MANIFEST.in
├── package_fiddler
│   ├── stylesheets
│   │   ├── __init__.py
│   │   ├── example.css
│   │   └── other
│   │       └── example2.css
│   ├── texts
│   │   ├── example.txt
│   │   └── other
│   │       └── example2.txt
│   └── __init__.py
├── README.rst
└── setup.py

Answer 1:

I personally dislike the way setuptools mixes code and data, both conceptually and implementation-wise, and I think that implementation is what is tripping you up here. For setuptools to find and use package_data, the data needs to reside inside a Python package. A Python package can be a directory, but that directory needs to contain an __init__.py file. So it looks like you need the following files (empty is fine):

./package_fiddler/data/__init__.py
./package_fiddler/data/stylesheets/__init__.py
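
To see which directories count as packages under this rule, here is a rough sketch that walks a project tree and reports the directories containing an __init__.py. It only approximates what setuptools.find_packages() does, and list_packages is a made-up helper name, not a setuptools function.

```python
import os


def list_packages(root):
    """Return dotted names of directories under root that contain __init__.py.

    Rough approximation of setuptools.find_packages(); list_packages is a
    hypothetical helper and not part of setuptools.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # do not descend below a non-package directory
            continue
        rel = os.path.relpath(dirpath, root)
        found.append(rel.replace(os.sep, "."))
    return sorted(found)
```

Run against the tree from the question, this should report only package_fiddler until the two __init__.py files above are added.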


Answer 2:

Found a solution that worked for me here.

Using setuptools==2.0.2 I did:

setuptools.setup(
    ...
    packages=setuptools.find_packages(),
    include_package_data=True,  # use MANIFEST.in during install
    ...
)
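
Because include_package_data=True only picks up files that MANIFEST.in (or a version control plugin) has placed in the source distribution, it can help to inspect the sdist before installing it. A minimal sketch using the standard library's tarfile module; sdist_members is a hypothetical helper name.

```python
import tarfile


def sdist_members(archive_path):
    """Return the sorted names of regular files stored in a .tar.gz sdist."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())
```

For the project in the question you would call sdist_members("dist/Package_fiddler-0.0.0.tar.gz"). If the entries under package_fiddler/data are missing from the result, the problem lies in MANIFEST.in rather than in the setup() arguments.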


Answer 3:

The easiest way to include package data in "setup.py" is like so:

package_data = {'<package name>': ['<path to data file within package dir>']}

So in your example:

package_data = {'package_fiddler': ['data/*', 'data/stylesheets/*']}

package_data is a dictionary where the keys are the names of the packages included in the installer. The values under these keys should be lists of specific file paths or globs/wildcards within the package directory.
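
To make the "within the package directory" part concrete, here is a small sketch that approximates the matching with the standard library's glob module. As far as I know, setuptools expands each pattern relative to the package directory and keeps only regular files, but match_package_data is an illustrative helper of mine, not setuptools' actual code.

```python
import glob
import os


def match_package_data(package_dir, patterns):
    """Return files under package_dir matched by package_data-style patterns.

    Illustrative approximation only: each pattern is globbed relative to the
    package directory, and directories are dropped from the result.
    """
    matched = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(package_dir, pattern)):
            if os.path.isfile(path):
                rel = os.path.relpath(path, package_dir)
                matched.add(rel.replace(os.sep, "/"))
    return sorted(matched)
```

With the tree from the question, the patterns 'data/*' and 'data/stylesheets/*' match example.html and example.css, whereas a pattern such as 'package_fiddler/data/*' matches nothing, because it is looked up inside the package_fiddler directory itself.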

You also need to include the flag:

zip_safe=False

in setup(...) if you want to be able to resolve file system paths to your data. Otherwise you can use pkg_resources to do this: http://peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources

You definitely don't need an __init__.py file in the "data" directory - this directory is not a module and is not meant to be imported.
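
For reading the data at run time without depending on file system paths, the standard library's pkgutil.get_data is an alternative to pkg_resources (a swap on my part; the page linked above describes pkg_resources). A minimal sketch, assuming the package is importable and data/example.html was installed alongside it; load_example_html is a hypothetical helper name.

```python
import pkgutil


def load_example_html(package="package_fiddler", resource="data/example.html"):
    """Return the text of a data file shipped inside an importable package."""
    # get_data returns bytes, or None if the package's loader cannot serve data.
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        raise IOError("{} not available from {}".format(resource, package))
    return raw.decode("utf-8")
```

The resource path is given relative to the package, so the data directory does not have to be a package itself.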



Answer 4:

This works for me. Hope it helps.

package_data={
    "package_fiddler": [
        '*.*',
        '*/*.*',
        '*/*/*.*',
    ],
},


Answer 5:

Use

package_data={"data": ['package_fiddler/data',]}

instead of

packages=['package_fiddler',]