I have a large 2D numpy array (10000 x 10000) in which regions (clusters of adjacent cells with the same number) are randomly labeled. As a result, some separate regions were assigned the same label. I would like to relabel the array so that every separate region gets a unique label (see example).
I know how to solve this problem with a loop, but because I am working with a large array containing many small regions, this takes ages. A vectorized approach would therefore be more suitable.
Example:
- Two separate regions are labeled with 1
- Two separate regions are labeled with 3
## Input
random_arr=np.array([[1,1,3,3],[1,2,2,3],[2,2,1,1],[3,3,3,1]])
## Apply function
unique_arr=relabel_regions(random_arr)
## Output
>>> unique_arr
array([[1, 1, 3, 3],
       [1, 2, 2, 3],
       [2, 2, 4, 4],
       [5, 5, 5, 4]])
Slow solution with loop:
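import numpy as np
from scipy import ndimage

def relabel_regions(random_regions):
    # Distinct labels present in the randomly labeled array
    random_labs = np.unique(random_regions)
    unique_segments = np.zeros(np.shape(random_regions), dtype='uint64')
    count = 0
    # Cross-shaped structuring element: 4-connectivity (no diagonals)
    kernel = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype='uint8')
    # Assign a unique number to each randomly labeled region
    for i in range(len(random_labs)):
        # Binary mask of all cells carrying the current label
        mask = np.zeros(np.shape(random_regions))
        mask[np.where(random_regions == random_labs[i])] = 1
        # Split the mask into its connected components
        labeled_mask, freq = ndimage.label(mask, structure=kernel)
        # Offset the component numbers so they do not clash with earlier ones
        labeled_mask = labeled_mask + count
        unique_segments[np.where(labeled_mask > 0 + count)] = labeled_mask[np.where(labeled_mask > 0 + count)]
        count += freq
    return unique_segments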
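For reference, one loop-free direction I could imagine is sketched below. It is only a sketch under assumptions, not a tested solution for the full 10000 x 10000 case: it treats every cell as a graph node, links horizontally and vertically adjacent cells that hold the same value, and lets `scipy.sparse.csgraph.connected_components` number the components. The function name `relabel_regions_graph` is made up for illustration, the resulting label numbers are unique per region but need not match the numbering in the example output above, and memory use on an array this large may be a concern.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def relabel_regions_graph(random_regions):
    """Hypothetical helper: unique 1-based label per 4-connected region of equal values."""
    rows, cols = random_regions.shape
    # Flat node index for every cell
    idx = np.arange(rows * cols).reshape(rows, cols)

    # True where a cell equals its right neighbour / its lower neighbour
    same_h = random_regions[:, :-1] == random_regions[:, 1:]
    same_v = random_regions[:-1, :] == random_regions[1:, :]

    # Edge list: one edge per pair of adjacent cells with equal values
    src = np.concatenate([idx[:, :-1][same_h], idx[:-1, :][same_v]])
    dst = np.concatenate([idx[:, 1:][same_h], idx[1:, :][same_v]])

    # Sparse adjacency matrix; treated as undirected below
    graph = coo_matrix((np.ones(src.size, dtype=np.int8), (src, dst)),
                       shape=(rows * cols, rows * cols))
    _, labels = connected_components(graph, directed=False)

    # Component ids start at 0, so shift to start at 1
    return labels.reshape(rows, cols) + 1

random_arr = np.array([[1, 1, 3, 3], [1, 2, 2, 3], [2, 2, 1, 1], [3, 3, 3, 1]])
unique_arr = relabel_regions_graph(random_arr)
```

On the small example this should produce five distinct labels, with the two regions originally labeled 1 (and the two originally labeled 3) receiving different numbers.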