Community Detection in complete and weighted netwo

2019-05-29 05:39发布

问题:

I do have a complete network graph where every vertex is connected with each other and they only differ in form of their different weights. A example network would be: a trade network, where every country is connected with each other somehow and only differ in form of different trading volumina.

Now the question is how I could perform a community detection in that form of network. The usual suspects (algorithm) are only able to perform in either unweighted or incomplete networks well. The main problem is that the geodesic is everywhere the same.

Two option came into my mind:

  1. Cut the network into smaller pieces by cutting them at a certain "weight-threshold-level"
  2. Or use a hierarchical cluster algorithm to turn the whole network into a blockmodel. But I think the problem "no variance in geodesic terms" will remain.

回答1:

Several methods were suggested.

One simple yet effective method was suggested in Fast unfolding of communities in large networks (Blondel et al., 2008). It supports weighted networks. Quoting from the abstract:

We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform all other known community detection method in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity.

Quoting from the paper:

We now introduce our algorithm that finds high modularity partitions of large networks in short time and that unfolds a complete hierarchical community structure for the network, thereby giving access to different resolutions of community detection.

So it supposed to work well for complete graph, but you should better check it.

A C++ implementation is available here (now maintained here).

Your other idea - using weight-threshold - may prove as a good pre-processing step, especially for algorithms which won't partition complete graphs. I believe it is best to set it to some percentile (e.g. to the median) of the weights.