Using pypdfocr library from within a Python script

Posted 2019-05-31 16:58

How can you run pypdfocr from within a Python script, as opposed to the command line?

The question "How to call pypdfocr functions to use them in a python script?" comes close to the answer I want, but doesn't quite get there.

import pypdfocr
from pypdfocr import pypdfocr
from pypdfocr.pypdfocr import PyPDFOCR as pocr

filepath = 'C:/myfolder/myPDF.pdf'

newfile = pocr.run_conversion(filepath)

This throws an error:

Unbound method run_conversion must be called with PyPDFOCR instance as first argument.

Can someone help me fill in the (likely obvious) missing piece?

Tags: python pdf ocr
2 answers
地球回转人心会变
#2 · 2019-05-31 17:14

I got it working by calling the pypdfocr command-line tool through a system call.

import os
# `file` holds the PDF path; double quotes keep paths with spaces intact
cmd = 'pypdfocr "' + str(file) + '"'
os.system(cmd)
forever°为你锁心
#3 · 2019-05-31 17:37

The problem is that you are calling run_conversion on the class itself rather than on an object of that class.

run_conversion is a method of the class PyPDFOCR. So you will need an object of that class to run the method.
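To illustrate the difference, here is a minimal sketch using a hypothetical stand-in class (Converter is not part of pypdfocr, and its run_conversion only builds a file name rather than doing any OCR). Calling the method through the class raises a TypeError, while calling it through an instance works. The exact error text depends on the Python version: Python 2 reports an unbound method, Python 3 reports a missing argument.

```python
class Converter(object):
    """Hypothetical stand-in for a class such as PyPDFOCR."""

    def run_conversion(self, pdf_filename):
        # Illustration only: return an output name instead of running OCR.
        return pdf_filename.replace('.pdf', '_ocr.pdf')


filepath = 'C:/myfolder/myPDF.pdf'

# Called through the class: no instance is supplied, so Python raises TypeError.
try:
    Converter.run_conversion(filepath)
    class_call_failed = False
except TypeError:
    class_call_failed = True

# Called through an instance: `self` is filled in automatically.
my_converter = Converter()
newfile = my_converter.run_conversion(filepath)
```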

Once you have created a PyPDFOCR object (for instance my_ocr), you should be able to write:

newfile = my_ocr.run_conversion(filepath)
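A hedged sketch of how that could look in practice. It rests on two assumptions drawn from my reading of the pypdfocr source, which may not hold for every version: PyPDFOCR() takes no constructor arguments, and its go() method accepts the same argument list as the command line, sets up the external tools, and then calls run_conversion internally. Calling run_conversion directly on a freshly created object may fail if that setup has not happened. The wrapper name ocr_pdf is made up for this example.

```python
def ocr_pdf(filepath):
    """Run pypdfocr on one PDF from inside a script (hypothetical helper)."""
    # Imported inside the function so pypdfocr is only required when called.
    from pypdfocr.pypdfocr import PyPDFOCR

    my_ocr = PyPDFOCR()
    # Assumption: go() parses a command-line style argument list,
    # prepares the external tools, then runs the conversion.
    my_ocr.go([filepath])


# Example call (commented out because it needs pypdfocr and its tools installed):
# ocr_pdf('C:/myfolder/myPDF.pdf')
```

If go() is not available in your version, check how the package's own main() entry point drives the class and mirror that.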