I've read How to use the Google Vision API for text detection from base64 encoded image?, but it doesn't help at all. The Cloud client library is undesirable for me because I do a lot of image processing (e.g. rotating, cropping, resizing) before and during OCR. Saving the processed images as new files and re-reading them as inputs for the Google Vision API would be rather inefficient.
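To give an idea of what I mean, the processing happens entirely in memory, roughly like this (Pillow is only an illustration here, and the actual steps are simplified):

import base64
import io

from PIL import Image

# Illustration only: rotate/crop an image entirely in memory and turn the
# result into a base64 string, without writing any intermediate file
img = Image.open("photos/foo.jpg").rotate(90).crop((0, 0, 400, 300))
buffer = io.BytesIO()
img.save(buffer, format="JPEG")
content = base64.b64encode(buffer.getvalue())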
Hence, I checked the documentation for posting requests directly:
- Using Python to send requests
- Base64 Encoding
- Optical character recognition (OCR),
and here is the minimal code that reproduces the failure:
import base64
import requests
import io

# Read the image file and transform it into a base64 string
with io.open("photos/foo.jpg", 'rb') as image_file:
    image = image_file.read()
    content = base64.b64encode(image)

# Prepare the data for the request
# Format copied from https://cloud.google.com/vision/docs/ocr
sending_request = {
    "requests": [
        {
            "image": {
                "content": content
            },
            "features": [
                {
                    "type": "TEXT_DETECTION"
                }
            ]
        }
    ]
}

# Send the request and get the response
# Format copied from https://cloud.google.com/vision/docs/using-python
response = requests.post(
    url='https://vision.googleapis.com/v1/images:annotate?key={}'.format(API_KEY),
    data=sending_request,
    headers={'Content-Type': 'application/json'}
)
# This returns a 400 status code
response
# <Response [400]>

print(response.text)
{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unexpected token.\nrequests=image&reque\n^",
    "status": "INVALID_ARGUMENT"
  }
}
I went to my console and can see that there are indeed request errors recorded for google.cloud.vision.v1.ImageAnnotator.BatchAnnotateImages, but I don't know what went wrong. Is it because the data sent in requests.post is in the wrong format?
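For what it's worth, here is a sketch of how the outgoing body can be inspected without actually sending anything (requests.Request plus .prepare() comes from the requests library itself; sending_request and API_KEY are the same as above):

import requests

# Build the same request but only prepare it, so the encoded body can be
# inspected locally instead of being sent to the API
prepared = requests.Request(
    'POST',
    'https://vision.googleapis.com/v1/images:annotate?key={}'.format(API_KEY),
    data=sending_request,
    headers={'Content-Type': 'application/json'}
).prepare()

print(prepared.body)
# Something like: requests=image&requests=features

The body looks form-encoded rather than JSON, which seems to match the "requests=image&reque" snippet in the error message above.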