I am trying to extract individual articles from a newspaper image. When I apply the run-length smoothing algorithm (RLSA) horizontally and vertically with a small pixel threshold, the headings come out as separate blocks, as in the first image. When I increase the threshold, neighbouring articles merge, as shown in the second image. Can anyone suggest a better way to separate the articles in the image using Python and OpenCV?
This loop applies the horizontal run-length smoothing algorithm to the image:
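    for i in range(1, a):
        c = 1  # column of the last black pixel seen in this row
        for j in range(1, b):
            if im_bw[i, j] == 0:
                # fill the run back to the previous black pixel if it is short enough
                if (j - c) <= 10:
                    im_bw[i, c:j] = 0
                c = j
        # fill the trailing run up to the right edge
        if (b - c) <= 10:
            im_bw[i, c:b] = 0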
This loop applies the vertical run-length smoothing algorithm to the image:
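    for i in range(1, b):
        c = 1  # row of the last black pixel seen in this column
        for j in range(1, a):
            if im_bw[j, i] == 0:
                # fill the run back to the previous black pixel if it is short enough
                if (j - c) <= 9:
                    im_bw[c:j, i] = 0
                c = j
        # fill the trailing run down to the bottom edge
        # (uses a, the row count; the original compared against b, the column count)
        if (a - c) <= 9:
            im_bw[c:a, i] = 0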
Here a is the number of rows and b is the number of columns of the binary image im_bw.
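For anyone who wants to reproduce the behaviour, here is a minimal self-contained sketch of the same smoothing written as functions. The names `rlsa_horizontal`, `rlsa_vertical` and `threshold` are only illustrative and are not part of my actual script. It assumes text pixels are 0 and the background is 255, and, unlike the loops above, it only fills runs that have a black pixel on both sides, so runs touching the image border are left alone.

```python
import numpy as np

def rlsa_horizontal(img, threshold):
    """Return a copy of img where, in each row, two black pixels (value 0)
    that are at most `threshold` columns apart are joined with black."""
    out = img.copy()
    for row in out:  # each row is a view into `out`, so writes modify `out`
        black = np.flatnonzero(row == 0)  # column indices of black pixels
        for start, end in zip(black[:-1], black[1:]):
            if end - start <= threshold:
                row[start:end] = 0
    return out

def rlsa_vertical(img, threshold):
    """Same smoothing along columns, done by transposing the image."""
    return rlsa_horizontal(img.T, threshold).T

# Small synthetic check: one row with black pixels at columns 2, 5 and 11.
demo = np.full((3, 12), 255, dtype=np.uint8)
demo[1, [2, 5, 11]] = 0
smoothed = rlsa_horizontal(demo, 4)
print(smoothed[1])
```

With the thresholds from the loops above, the equivalent call would be roughly `rlsa_vertical(rlsa_horizontal(im_bw, 10), 9)`. In the demo, the pixels at columns 2 and 5 are joined, while the pixel at column 11 stays isolated because it is more than 4 columns away. The same thing happens at a larger scale in the newspaper image: any threshold wide enough to bridge the space between a heading and its body text can also bridge a narrow gutter between two articles.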
Result of the algorithm on the binary image; the red mark shows where articles have merged.