I am using the pytesseract lib to extract text from an image. This works fine when I run the code on localhost, but it gives me the above error when I deploy on OpenShift.
Below is the code I have written so far.
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
filePath = PATH_WHERE_FILE_IS_LOCATED # '/var/lib/openshift/555.../app-root/data/data/y.jpg'
text = pytesseract.image_to_string(Image.open(filePath)) # this line produces error
The traceback of the error is:
>>> pytesseract.image_to_string(Image.open(filePath))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/var/lib/openshift/56faaee42d527151d5000089/app-root/runtime/repo/pytesseract/pytesseract.py", line 132, in image_to_string
    boxes=boxes)
  File "/var/lib/openshift/56faaee42d527151d5000089/app-root/runtime/repo/pytesseract/pytesseract.py", line 73, in run_tesseract
    stderr=subprocess.PIPE)
  File "/opt/rh/python27/root/usr/lib64/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/opt/rh/python27/root/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
But Image.open(filePath) returns an object reference:
<PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1366x768 at 0x7FC5A9F719D0>
How can I fix this error? Thanks in advance!
Either you don't have tesseract-ocr installed on "openshift", or it is not in your PATH. See https://pypi.python.org/pypi/pytesseract/0.1
Check that you can execute the tesseract command from the command line.
As mentioned here, install tesseract-ocr.
You can use rhc ssh to run commands on the server. More Windows-specific details can be found here.
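The same check can be done from Python on the server. This is only a minimal sketch: find_tesseract is a hypothetical helper name, not part of pytesseract, and it only looks the command up on the PATH of the current process.

```python
try:
    from shutil import which  # Python 3.3+
except ImportError:
    # Python 2 fallback; does the same PATH lookup
    from distutils.spawn import find_executable as which


def find_tesseract(command="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return which(command)


path = find_tesseract()
if path is None:
    print("tesseract is not on PATH; install it or add its directory to PATH")
else:
    print("tesseract found at %s" % path)
```

If this reports that tesseract is missing on OpenShift but finds it on localhost, that would explain why the same code behaves differently in the two environments.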
IMHO, and if I understand OpenShift correctly, it may be like Heroku, where the filesystem is volatile and the paths can be slightly or totally different from your local ones. So first check that:
- the paths are the same as in your local dev environment
- the paths exist
- you have enough rights to access the files in paths
- Please also check the OpenShift docs, especially the part about the file system.
I hope this was helpful.
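The path checks listed above could be scripted roughly as follows. This is a sketch under assumptions: check_image_path is a hypothetical helper name, and it only covers existence, file type, and read permission for the current user.

```python
import os


def check_image_path(file_path):
    """Return a list of problems found with file_path (empty if none)."""
    problems = []
    if not os.path.exists(file_path):
        problems.append("path does not exist")
    elif not os.path.isfile(file_path):
        problems.append("path is not a regular file")
    elif not os.access(file_path, os.R_OK):
        problems.append("file is not readable by the current user")
    return problems
```

Calling check_image_path(filePath) on the server and printing the result shows which of the checks, if any, fails there.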
Try this code and check where the error occurs:
try:
    import Image
    print("image not from PIL")
except ImportError:
    print("image from PIL")
    from PIL import Image
import pytesseract
import os
filePath = PATH_WHERE_FILE_IS_LOCATED # '/var/lib/openshift/555.../app-root/data/data/y.jpg'
# os.path.exists (with the trailing "s") is the correct function name
if not os.path.exists(filePath):
    print("no image file")
I = None
try:
    I = Image.open(filePath)
except Exception as e:
    raise RuntimeError("Can't open image because %s" % e)
text = pytesseract.image_to_string(I) # this line produces error
PS:
You can print module versions like this:
print Image.__version__
I think you may have not entered the correct path to the image. You should keep your paths in check.
Also, have you verified the installation of tesseract-ocr?
You should see no errors both when you import the module in Python and when you run the tesseract command-line utility.
And as Wuelfhis Asuaje says you should make sure you have enough rights to access the files in the path.
You should install google tesseract-ocr from http://code.google.com/p/tesseract-ocr/.
Make sure the tesseract command is available on the server.
Under the hood, pytesseract invokes the tesseract command with subprocess (https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L93):
proc = subprocess.Popen(command,
                        stderr=subprocess.PIPE)
Now guess what happens if the command is not available?
In [45]: subprocess.Popen(['tesseract'])
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-45-f4e9dd5a7f0b> in <module>()
----> 1 subprocess.Popen(['tesseract'])
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
708 p2cread, p2cwrite,
709 c2pread, c2pwrite,
--> 710 errread, errwrite)
711 except Exception:
712 # Preserve original exception in case os.close raises.
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1333 raise
1334 child_exception = pickle.loads(data)
-> 1335 raise child_exception
1336
1337
OSError: [Errno 2] No such file or directory
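To make this failure easier to recognise, the missing-executable case can be caught and reported explicitly. The following is only a sketch of the idea, not how pytesseract itself handles it; run_command is a hypothetical wrapper name.

```python
import errno
import subprocess


def run_command(command):
    """Run command (a list of strings) and return (returncode, stdout, stderr)."""
    try:
        proc = subprocess.Popen(command,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as e:
        # ENOENT here means the executable itself was not found,
        # not that an input file (such as the image) is missing.
        if e.errno == errno.ENOENT:
            raise RuntimeError("executable %r not found on PATH" % command[0])
        raise
    out, err = proc.communicate()
    return proc.returncode, out, err
```

With a wrapper like this, run_command(['tesseract', '--version']) on a server without tesseract raises a RuntimeError that names the missing command instead of the bare "No such file or directory".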