I'm trying to install textract with pip install textract, and I'm getting the following error.
C:\Users\HP\PycharmProjects\CVParser\venv\Scripts>pip install textract
Collecting textract
Using cached https://files.pythonhosted.org/packages/e0/00/a9278b3672a31da06394eb588a16e96f8fce9f6ae0ed44cca18103d4aef5/textract-1.6.1.tar.gz
Collecting argcomplete==1.8.2 (from textract)
Using cached https://files.pythonhosted.org/packages/f0/0f/f965f1520e6ba24b63320919eecfbe3d03debd32402e0c61a08e8fa02d17/argcomplete-1.8.2-py2.py3-none-any.whl
Collecting chardet==2.3.0 (from textract)
Using cached https://files.pythonhosted.org/packages/7e/5c/605ca2daa5cf21c87690d8fe6ab05a6f2278c451f4ede6456dd26453f4bd/chardet-2.3.0-py2.py3-none-any.whl
Collecting python-pptx==0.6.5 (from textract)
Using cached https://files.pythonhosted.org/packages/f8/9c/30bc244cedc571307efe0780d8195ffed5b08f09c94d23f50d6d5144ebc7/python-pptx-0.6.5.tar.gz
Collecting docx2txt==0.6 (from textract)
Using cached https://files.pythonhosted.org/packages/aa/72/f02730ec3b0219d8f783a255416339b02ff8b6a300c817abf0505833212a/docx2txt-0.6.tar.gz
Collecting beautifulsoup4==4.5.3 (from textract)
Using cached https://files.pythonhosted.org/packages/af/a3/9e803f838b3eeb313d45d916d4387cda8572c92e1aafeb53fd43ddb5da2c/beautifulsoup4-4.5.3-py3-none-any.whl
Collecting xlrd==1.0.0 (from textract)
Using cached https://files.pythonhosted.org/packages/0c/b0/8946fe3f9c2690c164aaa88dfd43b56347d3cdeac34124b988acd1aaa151/xlrd-1.0.0-py3-none-any.whl
Collecting EbookLib==0.15 (from textract)
Using cached https://files.pythonhosted.org/packages/04/30/2cbf65fa9587a1ecc66a78eea91f9189ead8fdadd5e009115bce34529aa6/EbookLib-0.15.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\HP\AppData\Local\Temp\pip-install-lk9fc36f\EbookLib\setup.py", line 13, in <module>
long_description = open('README.md').read(),
File "C:\Program Files\Python37\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1671: character maps to <undefined>
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\HP\AppData\Local\Temp\pip-install-lk9fc36f\EbookLib\
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
As suggested here, I successfully upgraded pip with python -m pip install --upgrade pip and setuptools with pip install --upgrade setuptools.
Following this link, I also tried to install EbookLib 0.15 directly with pip install -Iv ebooklib==0.15, which gave me the following exception.
Command "python setup.py egg_info" failed with error code 1 in C:\Users\HP\AppData\Local\Temp\pip-install-0fyox_v9\ebooklib\
Exception information:
Traceback (most recent call last):
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\basecommand.py", line 228, in main
status = self.run(options, args)
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\commands\install.py", line 291, in run
resolver.resolve(requirement_set)
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\resolve.py", line 103, in resolve
self._resolve_one(requirement_set, req)
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\resolve.py", line 257, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\resolve.py", line 210, in _get_abstract_dist_for
self.require_hashes
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\operations\prepare.py", line 324, in prepare_linked_requirement
abstract_dist.prep_for_dist(finder, self.build_isolation)
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\operations\prepare.py", line 154, in prep_for_dist
self.req.run_egg_info()
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\req\req_install.py", line 486, in run_egg_info
command_desc='python setup.py egg_info')
File "c:\users\hp\pycharmprojects\cvparser\venv\lib\site-packages\pip-10.0.1-py3.7.egg\pip\_internal\utils\misc.py", line 698, in call_subprocess
% (command_desc, proc.returncode, cwd))
pip._internal.exceptions.InstallationError: Command "python setup.py egg_info" failed with error code 1 in C:\Users\HP\AppData\Local\Temp\pip-install-0fyox_v9\ebooklib\
I'm using Python 3.7.0 and pip 18.0.
Is this a Python version issue? Can anyone help me solve this, please?
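For context on the traceback above: the failing line in EbookLib's setup.py calls open('README.md').read() without an encoding, so on this Windows machine Python decodes the file as cp1252, where byte 0x8d has no mapping. The snippet below is only a minimal sketch of that failure in isolation. The sample text is hypothetical (it is not the actual README content); it was chosen because "č" encodes to the bytes C4 8D in UTF-8.

```python
# Hypothetical stand-in for the README bytes: "č" (U+010D) is C4 8D in UTF-8.
readme_bytes = "č".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or a short message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: " + exc.reason


# cp1252 has no character for byte 0x8d, so this decode fails.
cp1252_result = try_decode(readme_bytes, "cp1252")

# The same bytes decode cleanly when read as UTF-8.
utf8_result = try_decode(readme_bytes, "utf-8")

print(cp1252_result)
print(utf8_result)
```

This suggests the problem is the default file encoding on Windows rather than Python 3.7 as such; a setup.py that opens the file with an explicit encoding="utf-8" would not hit it.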
The version of textract on PyPI has EbookLib==0.15 as a requirement, so if you desperately want that version, you will have to download the EbookLib source from GitHub and edit the README.md so that it no longer contains Unicode (non-ASCII) characters.
A simpler approach, however, would be to download the latest version of textract from its GitHub page, since the requirement for EbookLib has been changed to EbookLib==0.16, which solves the issue.
To do this, simply download the source code, change into the directory and install it with pip from there.
Note: Since you are using a venv, make sure that you are running it with the corresponding pip version.
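One way to be sure the pip you run belongs to the active venv is to invoke it through the interpreter itself (python -m pip) rather than a bare pip on the PATH. The sketch below only builds such a command. The helper name pip_install_command and the folder name textract-master are hypothetical, and installing from a local directory this way is an assumption about the intended step, not the exact command from this answer.

```python
import sys
from pathlib import Path


def pip_install_command(source_dir):
    """Build a pip command bound to the currently running interpreter.

    Using sys.executable with "-m pip" means the pip that runs is the one
    installed alongside this Python, i.e. the venv's pip when the venv's
    interpreter is running this script.
    """
    return [sys.executable, "-m", "pip", "install", str(Path(source_dir))]


# "textract-master" is a hypothetical name for the downloaded source folder.
command = pip_install_command("textract-master")
print(command)

# To actually run the installation, something like this could be used:
# import subprocess
# subprocess.run(command, check=True)
```

Running this script with the venv's python.exe prints a command whose first element is that interpreter's path, which makes it easy to confirm the right environment is targeted before installing.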