I am trying to get the coordinates (positions) of text characters in an image using Tesseract. I want to know the exact pixel position so that I can click that text using some other tool.
Edit:
import pytesseract
from PIL import Image

img = 'E:\\OCR-DATA\\sample.jpg'
imge = Image.open(img)
# boxes=True asks for per-character bounding boxes instead of plain text.
# Newer pytesseract releases provide image_to_boxes() for this.
data = pytesseract.image_to_string(imge, lang='eng', boxes=True, config='hocr')
print(data)
data contains the recognized text with box boundary values, but I am not sure how to use those boundary values to get the coordinates of the text.
The value of the data variable is as follows:
O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0
You have the coordinates of the bounding box in every line.
From: Training Tesseract – Make Box Files
character, left, bottom, right, top, page
So for each character you get the character, followed by its bounding box coordinates, followed by the 0-based page number.
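To turn those lines into something you can click, here is a minimal sketch that parses the box output and computes a center point per character and for the whole word. It assumes the box coordinates are measured from the bottom-left corner of the image (as in Tesseract box files), so y is flipped to get the usual top-left pixel coordinates. The names CharBox, parse_boxes, click_point and word_click_point are made up for this example, and IMAGE_HEIGHT = 400 is a placeholder; use the real height of your image (for example imge.size[1] with PIL).

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_boxes(box_text: str) -> List[CharBox]:
    """Parse 'char left bottom right top page' lines into CharBox records."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char, *nums = parts
        left, bottom, right, top, page = map(int, nums)
        boxes.append(CharBox(char, left, bottom, right, top, page))
    return boxes


def click_point(box: CharBox, image_height: int) -> Tuple[int, int]:
    """Center of the box in top-left-origin pixel coordinates."""
    x = (box.left + box.right) // 2
    y_from_bottom = (box.bottom + box.top) // 2
    return x, image_height - y_from_bottom


def word_click_point(boxes: List[CharBox], image_height: int) -> Tuple[int, int]:
    """Center of the rectangle that encloses all the given character boxes."""
    merged = CharBox(
        "",
        min(b.left for b in boxes),
        min(b.bottom for b in boxes),
        max(b.right for b in boxes),
        max(b.top for b in boxes),
        boxes[0].page,
    )
    return click_point(merged, image_height)


# Box output from the question.
sample = """O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0"""

IMAGE_HEIGHT = 400  # placeholder, replace with the real image height

boxes = parse_boxes(sample)
for box in boxes:
    print(box.char, click_point(box, IMAGE_HEIGHT))
print("word:", word_click_point(boxes, IMAGE_HEIGHT))
```

If your clicking tool works in screen coordinates rather than image coordinates, you still need to add the on-screen offset of the image to these points.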
You can try this:
This will directly give you a result like: