Best way to convert HTML to plaintext using Python

Published 2019-03-27 01:42

Question:

I'm working on a project that involves converting a large amount of HTML content to plain text. I have a custom-written module that does the job OK, but I'm wondering whether there are any standard tools that would help get the job done.

Answer 1:

html2text seems to be a good option.



Answer 2:

Here's a Python library that does HTML parsing:

  • lxml.html

BeautifulSoup is another option.
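
As a rough sketch of the BeautifulSoup route, assuming the beautifulsoup4 package is installed: the function below parses the markup with the standard library's "html.parser" backend, removes script and style elements, and joins the remaining text. The helper name html_to_text and the sample markup are hypothetical, chosen only for this example.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Extract visible text from an HTML string (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")

    # Script and style contents are not visible text, so remove them first.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Put each text node on its own line, then drop blank lines.
    text = soup.get_text(separator="\n")
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


# Made-up sample input for demonstration.
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p>"
    "<script>alert(1)</script></body></html>"
)
print(html_to_text(sample))
```

One caveat with this approach: because every text node is separated by a newline, inline elements such as <b> start a new line in the output. With lxml.html, the comparable call would be lxml.html.fromstring(html).text_content(), which returns the concatenated text without any separators.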