I just tested the Google Cloud Vision API to read the text, if exist, in a image.
Until now I installed the Maven Server and the Redis Server. I just follow the instructions in this page.
https://github.com/GoogleCloudPlatform/cloud-vision/tree/master/java/text
Until now I was able to tested with .jpg files, is it possible to do it with tiff files or pdf??
I am using the following command:
java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp ../../data/text/
Inside the text directory, I have the files in jpg format.
Then to read the converted file, I don't know how to do that, just I run the following command
java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp
And I get the message to enter a word or phrase to search in the converted files. Is there a way to see the whole document transformed?
Thanks!
Unfortunately PDF and TIFF formats are not currently supported for Cloud Vision.
The accepted formats are : (taken from the the doc)
- JPEG
- PNG8
- PNG24
- GIF
- Animated GIF (first frame only)
- BMP
- WEBP
- RAW
- ICO
On April 6, 2018, support for PDF and TIFF files in document text detection was added to Google Cloud Vision API (see Release Notes).
According to documentation:
The Vision API can detect and transcribe text from PDF and TIFF
files stored in Google Cloud Storage.
Document text detection from PDF and TIFF must be requested using the
asyncBatchAnnotate function, which performs an asynchronous request and provides its status using the operations resources.
Output from a PDF/TIFF request is written to a JSON file created in the specified Google Cloud Storage bucket.
Example:
1) Upload a file to your Google Cloud Storage
2) Make a POST request to perform PDF/TIFF document text detection
Request:
POST https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate
Authorization: Bearer <your access token>
{
"requests":[
{
"inputConfig": {
"gcsSource": {
"uri": "gs://<your bucket name>/input.pdf"
},
"mimeType": "application/pdf"
},
"features": [
{
"type": "DOCUMENT_TEXT_DETECTION"
}
],
"outputConfig": {
"gcsDestination": {
"uri": "gs://<your bucket name>/output/"
},
"batchSize": 1
}
}
]
}
Response:
{
"name": "operations/9b1f9d773d216406"
}
3) Make a GET request to check if document text detection is done
Request:
GET https://vision.googleapis.com/v1/operations/9b1f9d773d216406
Authorization: Bearer <your access token>
Response:
{
"name": "operations/9b1f9d773d216406",
"metadata": {
"@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.OperationMetadata",
"state": "RUNNING",
"updateTime": "2018-06-17T20:18:09.117787733Z"
},
"done": true,
"response": {
"@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.AsyncBatchAnnotateFilesResponse",
"responses": [
{
"outputConfig": {
"gcsDestination": {
"uri": "gs://<your bucket name>/output/"
},
"batchSize": 1
}
}
]
}
}
4) Check the results in the specified Google Cloud Storage folder
https://cloud.google.com/vision/docs/pdf
I know this question is old, but now Google Vision released support for PDF!