I am trying to deploy an application with an OCR function on Google App Engine. I installed tesseract with Homebrew and use pytesseract as the Python wrapper. The OCR function works on my local system, but it does not work once I upload the application to Google App Engine.
I copied the tesseract folder from usr/local/cellar/tesseract into the working directory of my app, and uploaded both the tesseract files and the pytesseract files to App Engine. I derive the tesseract path from os.getcwd() so that pytesseract can find it. Nevertheless, this does not work. App Engine cannot find the file to execute, since the files are not in the directory returned by os.getcwd().
Code from pytesseract.py:
cmda = os.getcwd()
# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result
founds = find_all("tesseract", cmda)
tesseract_cmd = founds[0]
The error from Google App Engine is:
tesseract is not installed on your path.
The Google App Engine Standard environment is not suitable for your use case. It is true that the pytesseract and Pillow libraries can be installed via pip. But these libraries require the tesseract-ocr and libtesseract-dev platform packages to be installed, and those are not included in the base image of the App Engine Standard Python 3.7 runtime. That is what produces the error you are getting.
The solution is to use Cloud Run, which runs your application in a Docker container and lets you customize the runtime. I have modified this Quickstart guide to run a sample application on Cloud Run that converts an image to text using pytesseract.
My folder structure:
Here is the Dockerfile:
The contents of app.py:
The requirements.txt:
Now to containerize and deploy your application just run:
gcloud builds submit --tag gcr.io/<PROJECT_ID>/helloworld
to build and submit the container to Container Registry.
gcloud beta run deploy --image gcr.io/<PROJECT_ID>/helloworld --platform managed
to deploy the container to Cloud Run.
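Since the error comes down to the tesseract executable missing from the runtime, it can help to check for it explicitly when the container starts. What follows is only a minimal sketch, not part of the original sample: find_binary and require_binary are hypothetical helper names, and the code assumes the executable is named tesseract and is reachable through PATH, which is where a system package install would normally place it.

```python
import shutil


def find_binary(name="tesseract"):
    """Return the absolute path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def require_binary(name="tesseract"):
    """Return the path of `name`, or raise a descriptive error if it is missing.

    Hypothetical helper: call it once at startup so a missing system package
    fails loudly instead of surfacing later inside an OCR request.
    """
    path = find_binary(name)
    if path is None:
        raise RuntimeError(
            f"'{name}' was not found on PATH; install the platform package "
            "that provides it in the container image"
        )
    return path
```

Calling require_binary() near the top of app.py would stop the container at startup with a clear message if the image was built without the tesseract packages. On App Engine Standard the same check should fail, which matches the error in the question.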