BeautifulSoup won't recognize lxml

Posted 2019-01-24 05:02

Question:

I'm trying to use lxml as the parser for BeautifulSoup because the default parser is much slower, but I get this error:

    soup = BeautifulSoup(html, "lxml")
  File "/home/rob/python/stock/local/lib/python2.7/site-packages/bs4/__init__.py", line 152, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

I have uninstalled and reinstalled both lxml and BeautifulSoup many times, but the parser is still not found. I have also tried reinstalling lxml's dependencies, with the same result.

I even created a new virtual environment and installed everything fresh, and I still get this error.

Does anyone have any idea what's going on here?

Edits

I'm using the latest versions of bs4 and lxml with Python 2.7.x on Ubuntu desktop.

I can import lxml, but "from lxml import etree" fails with:

  File "<stdin>", line 1, in <module>
ImportError: /usr/lib/x86_64-linux-gnu/libxml2.so.2: version `LIBXML2_2.9.0' not found (required by /home/rob/python/stock/local/lib/python2.7/site-packages/lxml/etree.so)

I have libxml2 installed, but I'm not sure which version. I installed and reinstalled the latest one, and also tried installing 2.9.0 manually, but nothing changed.

Answer 1:

It looks like lxml was not installed successfully. To install its system dependencies on Ubuntu, run

sudo apt-get install libxslt1-dev libxml2

Then, inside the virtualenv:

pip install --upgrade lxml
pip install cssselect
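
After reinstalling, it may help to check which part of the stack actually fails to import. BeautifulSoup only offers the lxml tree builder when lxml's compiled etree module imports cleanly, so a broken etree import can show up as FeatureNotFound. The sketch below is not from the original answer; check_parser_stack is a hypothetical helper name, and it only assumes the standard importlib module:

```python
import importlib


def check_parser_stack():
    """Report which pieces of the BeautifulSoup + lxml stack import cleanly."""
    report = {}
    for name in ("bs4", "lxml", "lxml.etree"):
        try:
            importlib.import_module(name)
            report[name] = "ok"
        except ImportError as exc:
            # Covers a missing package as well as a shared-library
            # mismatch such as the LIBXML2_2.9.0 error in the question.
            report[name] = "failed: %s" % exc
    return report


for name, status in sorted(check_parser_stack().items()):
    print("%-10s %s" % (name, status))
```

If lxml reports ok but lxml.etree fails, the Python package is present and the problem is most likely in the system libxml2/libxslt libraries it was built against.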


Answer 2:

Go to these pages:

  1. https://pypi.python.org/pypi/cssselect

  2. https://pypi.python.org/pypi/lxml/3.2.5

Download the source files for both packages and extract each into its own folder. Then, in each folder, locate the setup.py file and run the following command:

python setup.py install

You may run into some problems building lxml. If you get an error like

error: command 'gcc' failed with exit status 1

make sure libxml2-dev and libxslt1-dev are installed:

sudo apt-get install libxml2-dev libxslt1-dev

Hopefully that should work.
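
Until the lxml build works, a script can fall back to Python's built-in parser instead of crashing. This is a minimal sketch and not part of the original answer; pick_parser and make_soup are hypothetical helper names, and "html.parser" is the feature name BeautifulSoup uses for the standard-library parser:

```python
def pick_parser():
    """Return "lxml" if its compiled extension loads, else the stdlib parser."""
    try:
        from lxml import etree  # noqa: F401 (imported only to test loading)
    except ImportError:
        return "html.parser"
    return "lxml"


def make_soup(html):
    """Build a BeautifulSoup object with whichever parser is available."""
    # bs4 is imported here so pick_parser() stays usable without it.
    from bs4 import BeautifulSoup
    return BeautifulSoup(html, pick_parser())
```

Calling make_soup(html) in place of BeautifulSoup(html, "lxml") keeps the script running on the slower built-in parser, so the fallback trades speed for availability and does not fix the underlying install.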