Extract Text with its Font Details (Style and Size)

Published 2019-06-26 21:23

Question:

I want to extract text along with its font details (style and size) from a PDF in Python.

I need to read/parse the text content and also get the font details for it. Please suggest an approach.

Answer 1:

There is a Python library for that. Please have a look at PDFMiner:

http://www.unixuser.org/~euske/python/pdfminer/index.html

The pdf2txt.py script that ships with PDFMiner extracts the text from a PDF, and with XML output (-t xml) it also reports other information for each character, such as the font name and font size.

You can try that.

Note: PDFMiner itself does not support Python 3.
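
For readers on Python 3, here is a minimal sketch of the same idea. It assumes the pdfminer.six fork (pip install pdfminer.six) rather than the original PDFMiner named above, and it relies on that fork's extract_pages function and the fontname and size attributes of LTChar. The helper names iter_chars and font_runs are made up for this illustration. PDFs generally expose only a font name, so the style (bold, italic) usually has to be inferred from names such as "Helvetica-Bold".

```python
from itertools import groupby


def iter_chars(pdf_path):
    """Yield (text, fontname, size) for every character in the PDF.

    Assumes pdfminer.six is installed. It is imported lazily so that
    font_runs() below can be used and tested without it.
    """
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page_layout in extract_pages(pdf_path):
        for element in page_layout:
            if not isinstance(element, LTTextContainer):
                continue  # skip figures, lines, rectangles, etc.
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                for obj in line:
                    # LTAnno objects (inserted spaces/newlines) carry no
                    # font information, so only LTChar is reported here.
                    if isinstance(obj, LTChar):
                        yield obj.get_text(), obj.fontname, obj.size


def font_runs(chars):
    """Merge consecutive characters that share a font name and size.

    `chars` is any iterable of (text, fontname, size) tuples.
    Sizes are rounded to one decimal place before comparison.
    """
    runs = []
    by_font = groupby(chars, key=lambda c: (c[1], round(c[2], 1)))
    for (font, size), group in by_font:
        runs.append(("".join(c[0] for c in group), font, size))
    return runs
```

Calling font_runs(iter_chars("sample.pdf")), where sample.pdf is a placeholder path, should return a list of (text, font name, size) tuples, one per run of identically formatted characters. Because the inserted whitespace objects are skipped, words within a run come out joined together; handling spacing is left out to keep the sketch short.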