Algorithm for diameter of graph?

2019-01-23 08:19发布

站内文章 / 后端开发

45 0

贪生不怕死

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

If you have a graph, and need to find the diameter of it (which is the maximum distance between two nodes), how can you do it in O(log v * (v + e)) complexity.

Wikipedia says you can do this using Dijkstra's algorithm with a binary heap. But I don't understand how this works. Can someone explain please?

Or show a pseudocode?

回答1:

For a general Graph G=(V,E) there is no O(log V * (V + E)) time complexity algorithm known for computing the diameter. The current best solution is O(V*V*V), e.g., by computing all shortest Paths with Floyd Warshall's Algorithm. For sparse Graphs, i.e. when E is in o(N*N), Johnson's Algorithm gives you with O(V*V*log(V)+V*E) a better time complexity.

If your graph has certain properties like acyclic (Tree) you can get better.

So the bad news is, the Dijkstra won't be enough in the general case...

回答2:

I know I'm a year late to the thread, but I don't believe any of these answers are correct. OP mentioned in the comments that the edges are unweighted; in this case, the best algorithm runs in $O(n^{\omega}) \log n$ time (where $\omega$ is the exponent for matrix multiplication; currently upper bounded at $2.373$ by Virginia Williams).

The algorithm exploits the following property of unweighted graphs. Let $A$ be the adjacency matrix of the graph with an added self-loop for each node. Then $(A^k)_{ij}$ is nonzero iff $d(i, j) \le k$. We can use this fact to find the graph diameter by computing $\log n$ values of $A^k$.

Here's how the algorithm works: let $A$ be the adjacency matrix of the graph with an added self loop for each node. Set $M_0 = A$. While $M_k$ contains at least one zero, compute $M_{k+1} = M_{k}^2$.

Eventually, you find a matrix $M_{K}$ with all nonzero entries. We know that $K \le \log n$ by the property discussed above; therefore, we have performed matrix multiplication only $O(\log n)$ times so far. We can now continue by binary searching between $M_{K} = A^{2^K}$ and $M_{K-1} = A^{2^{K-1}}$. Set $B = M_{K-1}$ as follows.

Set DIAMETER = $2^{k-1}$. For $i = (K-2 \dots 0)$, perform the following test:

Multiply $B$ by $M_{i}$ and check the resulting matrix for zeroes. If we find any zeroes, then set $B$ to this matrix product, and add $2^i$ to DIAMETER. Otherwise, do nothing.

Finally, return DIAMETER.

As a minor implementation detail, it might be necessary to set all nonzero entries in a matrix to $1$ after each matrix multiplication is performed, so that the values don't get too large and unwieldy to write down in a small amount of time.

回答3:

Boost BGL has a small extended deque class named "rcm_queue" with which the eccentricity of a vertex can be found by a simple breadth first search, meaning a complexity of O(E).

http://www.boost.org/doc/libs/1_54_0/boost/graph/detail/sparse_ordering.hpp

As the diameter can be calculated by going over the eccentricity of all vertices, one can calculate the diameter of a graph with a complexity of O(V*E).

Especially for a very sparse graph with a deg(G) <<< V, I didn't find anything with better runtime.

I didn't look into the Floyd Warshall algorithm. I was just dealing with a graph with > 5.000.000 vertices but with a highest degree of any vertex of less than 15 and thought, this should probably outperform an algorithm with V*V*log(V) complexity.

EDIT

For sure, this only works with an undirected graph and not negative weighted (or even unweighted only? I'm not sure atm)

回答4:

As eci mentioned, one of the solutions is to use the Floyd-Warshall algorithm. If you need the code for it, a C++ version of it can be found here.

回答5:

First you will need to find the convex hull of the graph (finding it which is O(nh), where h is number of nodes on hull). The points of diameter will lie on the convex hull and thus the problem will reduce to finding the farthest points in h points. Hence, total order will be O(nh+h*h).

回答6:

Actually, if the graph is extremely large, you will need to use Dijkstra's algorithm in order to find the shortest distance. So it depends on how many nodes the OP's graph has.

回答7:

Assuming that the graph is unweighted then the following steps can find the solution in O(V * ( V + E )) where V is the number of vertices and E is the number of edges.

Let's define the distance between 2 vertices u, v to be the length of shortest path between u and v. Denote it by d(u, v) Define Eccentricity of a vertex v to be e(v) = max {d(u, v) | u ∈ V(G)} (V(G) is the set of vertices in graph G) Define Diameter(G) = max{e(v) | v ∈ V(G)}

Now to the algorithm:

1- For each vertex v in V(G) run BFS(v) and build a 2 dimensional array of distances from each vertex to others. (calculating the distance from each vertex to others is easy to do in the BFS algorithm)

2- Compute e(v) for each vertex from the 2 dimensional array created in step 1 3- compute diameter(G) by finding the maximum e(v) from step 2

Now to analyze this algorithm, it's easy to see that it's dominated by the first step which is of O (V * (V + E))

Read this part about diameter in Wikipedia

Another algorithm that runs in linear time O(V + E) 1- Run BFS on any random vertex v ∈ V(G) 2- Pick the vertex u with maximum distance 3- Run BFS again on that vertex u 4- Diameter is the maximum distance generated form step 3

回答8:

Graph description - undirected and unweighted, n nodes, m edges For the diameter of the graph, we need to calculate the shortest path between all pairs of nodes. Shortest paths between a source node to all other nodes can be calculated using the BFS algorithm for an undirected and unweighted graph. Time complexity is O(n + m) for 1 node. Here since we need to do a BFS for n nodes, the total time complexity of finding the shortest path between all pairs of nodes becomes O(n(n + m)). Since there are n(n-1)/2 pairs of nodes, so we have n(n-1)/2 values of the shortest path between all the nodes. For the diameter, we need to get the max of these values, which is again O(n*n). So the final time complexity becomes:

       O(n(n + m)) + O(n*n) = O(n(2n + m)) = O(n(n + m))

Pseudocode for getting the shortest paths between all pairs of nodes. We are using a matrix named Shortest_paths to store the shortest paths between any two pair of nodes. We are using a dictionary named flag which has values true or false which indicates whether the vertex has been visited or not. For each iteration of BFS, we initialize this dictionary to all false. We use a Queue Q to perform our BFS. Algorithm BFS(for all nodes)

Initialize a matrix named Shortest_paths[n][n] to all -1. 
    For each source vertex s
        For each vertex v
            Flag[v] = false
        Q = empty Queue
        Flag[s] = true
        enQueue(Q,s)
        Shortest_paths[s][s] = 0
        While Queue is not empty
            Do v = Dequeue(Q)
            For each w adjacent to v
                Do if flag[w] = false
                    Flag[w] = true
                    Shortest_paths[s][w] = Shortest_paths[s][v] + 1
                    enQueue(Q,w) 
    Diameter = max(Shortest_paths)
    Return Diameter