I'm using npm node-cloud-vision-api
API correctly detects language of the document but the results characters are returned in western character subset not corresponding to a locale. I assume they should be returned in UTF-8 characters but all the locale specific characters are mapped into basic western character subset.
For example:
Wartosc
is return insted of Wartość
How to instruct the API to return correct UTF-8 characters?
Have you tried passing in a language hint to the OCR Detection call. Please follow the below API Reference. https://cloud.google.com/vision/reference/rest/v1/images/annotate#AnnotateImageRequest
As written here it is a known issue with the Cloud Vision API when you don't use language hints.
You can see the actual bug report here.