I just tested the Google Cloud Vision API to read the text, if any exists, in an image.
So far I have installed Maven and the Redis server, following the instructions on this page:
https://github.com/GoogleCloudPlatform/cloud-vision/tree/master/java/text
So far I have been able to test it with .jpg files. Is it possible to do the same with TIFF or PDF files?
I am using the following command:
java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp ../../data/text/
Inside the text directory, I have the files in JPG format.
I don't know how to read the converted files afterwards, so I just run the following command:
java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp
I then get a prompt to enter a word or phrase to search for in the converted files. Is there a way to see the whole text of a converted document?
Thanks!
On April 6, 2018, support for PDF and TIFF files in document text detection was added to Google Cloud Vision API (see Release Notes).
According to the documentation:
The Vision API can detect and transcribe text from PDF and TIFF files stored in Google Cloud Storage.
Document text detection from PDF and TIFF must be requested using the asyncBatchAnnotate function, which performs an asynchronous request and provides its status using the operations resources.
Output from a PDF/TIFF request is written to a JSON file created in the specified Google Cloud Storage bucket.
Example:
1) Upload a file to your Google Cloud Storage
2) Make a POST request to perform PDF/TIFF document text detection
Request:
Response:
3) Make a GET request to check if document text detection is done
Request:
Response:
4) Check the results in the specified Google Cloud Storage folder
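Since the question uses the Java sample, here is a minimal Java sketch of steps 2 and 3, assuming Java 11+ (for java.net.http) and the v1 REST endpoint files:asyncBatchAnnotate. The class name, method names, bucket paths and the way the access token is obtained are all hypothetical, and the JSON is assembled by hand only to keep the example dependency-free; in real code you would probably use the official client library or a JSON library instead. It only builds the requests and does not send them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) the two REST calls from steps 2 and 3.
final class PdfTextDetectionRequest {
    // Endpoint for asynchronous PDF/TIFF annotation (Vision API v1).
    static final String ENDPOINT =
        "https://vision.googleapis.com/v1/files:asyncBatchAnnotate";

    // JSON body: input file in Cloud Storage, its MIME type
    // ("application/pdf" or "image/tiff"), and the output prefix for the JSON results.
    // Assumes the arguments contain no characters that need JSON escaping.
    static String buildBody(String inputUri, String mimeType, String outputPrefix) {
        return "{\"requests\":[{"
            + "\"inputConfig\":{\"gcsSource\":{\"uri\":\"" + inputUri + "\"},"
            + "\"mimeType\":\"" + mimeType + "\"},"
            + "\"features\":[{\"type\":\"DOCUMENT_TEXT_DETECTION\"}],"
            + "\"outputConfig\":{\"gcsDestination\":{\"uri\":\"" + outputPrefix + "\"},"
            + "\"batchSize\":1}"
            + "}]}";
    }

    // Step 2: POST request that starts the asynchronous operation.
    static HttpRequest buildPost(String accessToken, String body) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
            .header("Authorization", "Bearer " + accessToken)
            .header("Content-Type", "application/json; charset=utf-8")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    // Step 3: GET request that polls the operation. operationName is assumed to be
    // the "name" value returned in the response to the POST request.
    static HttpRequest buildStatusGet(String accessToken, String operationName) {
        return HttpRequest.newBuilder(
                URI.create("https://vision.googleapis.com/v1/" + operationName))
            .header("Authorization", "Bearer " + accessToken)
            .GET()
            .build();
    }

    public static void main(String[] args) {
        // Placeholder bucket and object names; replace with your own.
        String body = buildBody(
            "gs://my-bucket/document.pdf", "application/pdf", "gs://my-bucket/output/");
        System.out.println("POST " + ENDPOINT);
        System.out.println(body);
    }
}
```

To actually send the requests, pass them to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) with a valid OAuth access token (for example one printed by gcloud auth application-default print-access-token). Once the operation reports that it is done, the JSON result files appear under the output prefix in your bucket.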
https://cloud.google.com/vision/docs/pdf
I know this question is old, but Google Cloud Vision has now released support for PDF!
Unfortunately, PDF and TIFF formats are not currently supported by Cloud Vision.
The accepted formats are (taken from the documentation):