How to find connected components in a matrix using

2019-05-12 09:12发布

Assume I have the following matrix (defined here in Julia language):

    mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

Considering as a "component" a group of neighbour elements that have value '1', how to identify that this matrix has 2 components and which vertices compose each one?

For the matrix mat above I would like to find the following result:

Component 1 is composed by the following elements of the matrix (row,column):

    (1,1)
    (1,2)
    (2,1)
    (2,2)

Component 2 is composed by the following elements:

    (3,5)
    (4,4)
    (4,5)

I can use Graph algorithms like this to identify components in square matrices. However such algorithms can not be used for non-square matrices like the one I present here.

Any idea will be much appreciated.

I am open if your suggestion involves the use of a Python library + PyCall. Although I would prefer to use a pure Julia solution.

Regards

3条回答
beautiful°
2楼-- · 2019-05-12 09:40

Just got an answer from julia-users mailing list that solves this problem using Images.jl, a library to work with images in Julia.

They developed a function called "label_components" to identify connected components in matrices.

Then I use a customized function called "findMat" to get the indices of such matrix of components for each component.

The answer, in Julia language:

    using Images

    function findMat(mat,value)
        return(collect(zip(ind2sub(size(mat),find( x -> x == value, mat))...)));
    end

    mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

    labels = label_components(mat);

    for c in 1:maximum(labels)
        comp = findMat(labels,c);
        println("Component $c is composed by the following elements (row,col)");       
        println("$comp\n"); 
    end
查看更多
手持菜刀,她持情操
3楼-- · 2019-05-12 09:43

The answer is pretty simple (though i can't provide python code):

  1. collect all 1s into a list
  2. select an arbitrary element of the list generated in step1 and use an arbitrary graph-traversal algorithm to traverse all neighbored 1s and remove visited 1s from the list generated in step 1
  3. repeat step2 until the list generated in step 1 is empty

In pseudocode (using BFS):

//generate a list with the position of all 1s in the matrix
list pos
for int x in [0 , matrix_width[
    for int y in [0 , matrix_height[
        if matrix[x][y] == 1
            add(pos , {x , y})

while NOT isempty(pos)
    //traverse the graph using BFS
    list visited
    list next

    add(next , remove(pos , 0))

    while NOT isempty(next)
        pair p = remove(next , 0)
        add(visited , p)
        remove(pos , p)

        //p is part of the specific graph that is processed in this BFS
        //each repetition of the outer while-loop process a different graph that is part 
        //of the matrix

        addall(next , distinct(visited , neighbour1s(p)))
查看更多
The star\"
4楼-- · 2019-05-12 09:52

Using Image.jl's label_components is indeed the easiest way to solve the core problem. However, your loop over 1:maximum(labels) may not be efficient: it's O(N*n), where N is the number of elements in labels and n the maximum, because you visit each element of labels n times.

You'd be much better off just visiting each element of labels just twice: once to determine the maximum, and once to assign each non-zero element to its proper group:

using Images

function collect_groups(labels)
    groups = [Int[] for i = 1:maximum(labels)]
    for (i,l) in enumerate(labels)
        if l != 0
            push!(groups[l], i)
        end
    end
    groups
end

mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

labels = label_components(mat)
groups = collect_groups(labels)

Output on your test matrix:

2-element Array{Array{Int64,1},1}:
 [1,2,5,6] 
 [16,19,20]

Calling library functions like find can occasionally be useful, but it's also a habit from slower languages that's worth leaving behind. In julia, you can write your own loops and they will be fast; better yet, often the resulting algorithm is much easier to understand. collect(zip(ind2sub(size(mat),find( x -> x == value, mat))...)) does not exactly roll off the tongue.

查看更多
登录 后发表回答