The objective is to create bounding boxes around the text in US floor plan images using text detection methods (e.g. OpenCV), so that the boxed regions can then be fed into a text reader (e.g. Tesseract or an LSTM-based model).
Several approaches based on cv2.findContours and cv2.boundingRect have been attempted, but they have largely failed to generalise across different types of floor plans (there is wide variation in how the floor plans look).
For example, converting to grayscale and applying adaptive thresholding, erosion and dilation (with various iteration counts) before calling cv2.findContours produces the result below. Note that "Bedroom 2" and "Kitchen" are not being picked up correctly.
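For reference, a minimal sketch of the kind of pipeline that was attempted (the filename, kernel size, iteration counts and threshold parameters here are illustrative placeholders, not the exact values used):

```python
import cv2

# Load the floor plan and convert to grayscale
img = cv2.imread("floorplan.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Adaptive threshold to binarise despite uneven scan quality
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 11, 2)

# Erode then dilate to suppress noise (iteration counts were varied)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
cleaned = cv2.erode(thresh, kernel, iterations=1)
cleaned = cv2.dilate(cleaned, kernel, iterations=1)

# OpenCV 4.x: findContours returns (contours, hierarchy)
contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```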
Here is an additional example on which the same pipeline fails to find any regions at all:
Are there any text detection models or image-cleaning procedures that would improve the accuracy of the text recognition step, preferably with code examples?
This answer is based on the assumption that the images are similar to one another (in size, wall thickness, lettering, and so on). If they are not, this wouldn't be a good approach, because you would have to retune the thresholds for every image. That being said, I would try the following:

1. Transform the image to binary and search for contours.
2. Filter out the walls by adding criteria such as contour height and width.
3. Draw the remaining contours on a mask, then dilate the mask; this merges letters that are close to each other into a single contour.
4. Create a bounding box for each merged contour — these are your ROIs.
5. Run any OCR engine on those regions.

Hope it helps a bit. Cheers!
Example:
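A minimal sketch of the steps above (the letter-size criteria, kernel shape and dilation count are assumptions that will need tuning to your floor plans; `floorplan.png` is a placeholder filename):

```python
import cv2
import numpy as np

img = cv2.imread("floorplan.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Step 1: binarise (Otsu picks the threshold); text and walls become white
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Step 2: find contours, keep only letter-sized ones to filter out walls
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
mask = np.zeros_like(binary)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    # Assumed letter-size criteria; tune these to your images
    if 2 < w < 50 and 5 < h < 50:
        cv2.drawContours(mask, [c], -1, 255, -1)

# Step 3: dilate the mask so nearby letters merge into one contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
dilated = cv2.dilate(mask, kernel, iterations=2)

# Step 4: each merged contour's bounding box is a text ROI
rois = []
word_contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)
for c in word_contours:
    x, y, w, h = cv2.boundingRect(c)
    rois.append((x, y, w, h))
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```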
Result: letters that sit close together are merged into a single contour, and each contour's bounding box gives one text region ready for OCR.
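Finally, each ROI can be cropped and passed to any OCR engine. A sketch using pytesseract (an assumption on my part — any OCR will do; this presumes the Tesseract binary is installed and reuses `rois` and `gray` from the snippet above):

```python
import pytesseract

for (x, y, w, h) in rois:
    crop = gray[y:y + h, x:x + w]
    # --psm 7 tells Tesseract to treat the crop as a single line of text
    text = pytesseract.image_to_string(crop, config="--psm 7").strip()
    print((x, y, w, h), text)
```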