Using pypdfocr library from within a Python script

Posted 2019-05-31 17:11

Question:

How can you run pypdfocr from within a Python script, as opposed to the command line?

The question "How to call pypdfocr functions to use them in a python script?" comes close to the answer I want, but doesn't quite get there.

import pypdfocr
from pypdfocr import pypdfocr
from pypdfocr.pypdfocr import PyPDFOCR as pocr

filepath = 'C:/myfolder/myPDF.pdf'

newfile = pocr.run_conversion(filepath)

This throws an error:

Unbound method run_conversion must be called with PyPDFOCR instance as first argument.

Can someone help me fill in the (likely obvious) missing piece?

Answer 1:

The problem is that you are calling run_conversion on the class itself rather than on an instance of it.

run_conversion is a method of the class PyPDFOCR. So you will need an object of that class to run the method.

Once you have created a PyPDFOCR object (for instance my_ocr), you should be able to write:

newfile = my_ocr.run_conversion(filepath)
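
To illustrate the idea without depending on pypdfocr being installed, here is a minimal sketch using a stand-in class. StubOCR and its output naming are hypothetical and are not part of the real library; the point is only that the method is called on an instance rather than on the class.

```python
class StubOCR(object):
    """Hypothetical stand-in for a class like PyPDFOCR (not the real library)."""

    def run_conversion(self, pdf_filename):
        # Pretend to convert the file and return the name of the output file.
        return pdf_filename.replace(".pdf", "_ocr.pdf")


filepath = 'C:/myfolder/myPDF.pdf'

# Create an instance first, then call the method on that instance.
my_ocr = StubOCR()
newfile = my_ocr.run_conversion(filepath)
print(newfile)
```

With the imports from the question, the equivalent first step would presumably be my_ocr = pocr(). That is untested here, and the real class may need its options set up before run_conversion works.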


Answer 2:

I got it working with a system call.

import os
# `file` holds the path to the PDF; the single quotes keep spaces in the path intact
cmd = "pypdfocr '" + str(file) + "'"
os.system(cmd)
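
A variation on the same idea, as a sketch only: passing the command as an argument list through the standard subprocess module avoids building a quoted shell string by hand. The helper names build_command and run_pypdfocr are made up for this example, and it assumes the pypdfocr executable is on the PATH.

```python
import subprocess


def build_command(pdf_path):
    # Argument-list form: no shell is involved, so no manual quoting is needed.
    return ["pypdfocr", str(pdf_path)]


def run_pypdfocr(pdf_path):
    # Returns the exit code of the pypdfocr process (0 normally means success).
    return subprocess.call(build_command(pdf_path))
```

Calling run_pypdfocr('C:/myfolder/myPDF.pdf') would then launch the same command-line tool as above.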


Tags: python pdf ocr