Finding kth-shortest paths?

Finding the shortest path between two points in a graph is a classic algorithms question with many good answers (Dijkstra's algorithm, Bellman-Ford, etc.) My question is whether there is an efficient algorithm that, given a directed, weighted graph, a pair of nodes s and t, and a value k, finds the kth-shortest path between s and t. In the event that there are multiple paths of the same length that all tie for the kth-shortest, it's fine for the algorithm to return any of them.

I suspect that this algorithm can probably be done in polynomial time, though I'm aware that there might be a reduction from the longest path problem that would make it NP-hard.

Does anyone know of such an algorithm, or of a reduction that show that it is NP-hard?

回答1:

The best (and basically optimal) algorithm is due to Eppstein.

回答2:

You're looking for Yen's algorithm for finding K shortest paths. The kth shortest path will then be the last path in that set.

Here's an implementation of Yen's algorithm.

And here's the original paper describing it.

回答3:

From the available Kth shortest path algorithms, Yen’s algorithm is the easiest to understand. Mostly this is because Yen’s algorithm first needs to compute all K-1 shortest paths before it can compute the Kth shortest path, so it can be broken down into sub-problems.

Furthermore, since each sub-problem uses a standard shortest path algorithm (e.g. Dijkstra’s algorithm), it is a more natural extension from the 1st shortest path problem to the Kth shortest path problem.

Here is how it works for finding the 3rd shortest path in an example graph with equal-weight edges.

1st shortest path (K=1)

If we are looking for the 1st shortest path between a start and a destination (here, between D and F), we can just run Dijkstra’s algorithm. The entire code for Yen’s algorithm at the first iteration is:

shortest_1 = Dijkstra(graph, D, F)

Given a starting graph, this gives the 1st shortest path (K=1).

2nd shortest path (K=2)

The intuition for finding the 2nd shortest path is to take the 1st shortest path but “force” Dijkstra’s algorithm along a different, slightly less optimal route. The way Yen’s algorithm “forces” Dijkstra’s algorithm along a different route, is by removing one of the edges that are part of the 1st shortest path.

But which of the edges do we remove to get the next shortest path? We need to try removing each edge, one by one, and see which edge removal gives us the next shortest path.

First we try to remove edge D-E (the first edge in shortest_1), and then complete the shortest path by running Dijkstra(graph_1, D, F). We combine the shortest path already known from node D to D (which is just the node D itself), with the new shortest path from node D to F. This gives us an alternative path D->F.

Then we try to remove the edge E-F (the second edge in shortest_1), and then complete the shortest path by running Dijkstra(graph_2, E, F). We combine the shortest path already known from node D to E, with the new shortest path from node E to F. This gives us yet another alternative path D->F.

The procedure for the 2nd iteration thus looks like this:

// Try without edge 1
graph_1 = remove_edge(graph, D-E)
candidate_1 = shortest_1(D, D) + Dijkstra(graph_1, D, F)

// Try without edge 2
graph_2 = remove_edge(graph, E-F)
candidate_2 = shortest_1(D, E) + Dijkstra(graph_2, E, F)

shortest_2 = pick_shortest(candidate_1, candidate_2)

At the end, the shortest of the alternative new paths is chosen as the 2nd shortest path.

3rd shortest path (K=3)

Just as the 2nd shortest path was found by removing edges from the 1st shortest path, the 3rd shortest path is found by removing edges from the 2nd shortest path.

The new nuance this time, however, is that for when we remove edge D-A (the first edge in shortest_2), we also want to remove edge D-E. If we don’t do this, and remove only the edge D-A, then when we run Dijkstra on graph_3, we will find the 1st shortest path again (D-E-F), instead of the 3rd shortest path!

For graph_4 and graph_5, however, we do not need to remove any other edges, since those edges, when used, won’t give us previously seen shortest paths.

Thus, the overall procedure looks similar to finding the 2nd shortest path, but with the nuance that we also want to remove some edges seen in the 1st shortest path in addition to the 2nd shortest path. The decision is made based on whether shortest_1 and shortest_2 share a subpath leading up to the edge which is being removed.

// Try without edge 1
edges_3 = [D-A]
if shortest_1 and shortest_2 share subpath up to node D: 
    // True because both shortest_1 and shortest_2 have D in shortest path
    // D-E is edge in shortest_1 that is connected from D, and so it is added
    edges_3.add(D-E)

graph_3 = remove_edges(graph, edges_3)
candidate_3 = shortest_e(D, D) + Dijkstra(graph_3, D, F) // returns infinity


// Try without edge 2
edges_4 = [A-E]
if shortest_1 and shortest_2 share subpath up to node A:
    // False because there are no edges in shortest_1 that go through A
    // So no edges added

graph_4 = remove_edges(graph, edges_4)
candidate_4 = shortest_2(D, A) + Dijkstra(graph_4, A, F) // returns infinity


// Try without edge 3
edges_5 = [E-F]
if shortest_1 and shortest_2 share subpath up to node E:
    // False because shortest_1 goes through D-E while shortest_2 goes through D-A-E
    // So no edges added

graph_5 = remove_edges(graph, edges_5)
candidate_5 = shortest_2(D, E) + Dijkstra(graph_5, E, F)


shortest_3 = pick_shortest(candidate_2, candidate_3, candidate_4, candidate_5)

Note how when we pick the shortest path this time, we take into account unused candidates from iteration 2 (i.e. candidate_2), and actually end up picking the shortest path found from graph_2. In the same way, on the next iteration (K=4), we will find that the 4th shortest path was actually found in iteration K=3. As you can see, this algorithm is doing work in advance, so while it is finding the Kth shortest path, it is also exploring some of the paths beyond the Kth shortest path. It is thus not the most optimal algorithm for the Kth shortest path problem. The Eppstein algorithm can do better, but it is more complex.

The above approach can be generalized by using several nested loops. The Wikipedia page on Yen’s algorithm already gives excellent pseudocode for the more generic implementation, so I will refrain from writing it here. Note that the Wikipedia code uses a list A to hold each shortest path, and a list B to hold each candidate path, and that candidate shortest paths persist across loop iterations.

Remarks

Due to the use of Dijkstra’s algorithm, Yen’s algorithm cannot have a Kth shortest path that contains a loop. This is not as noticable when un-weighted edges are used (as in the example above), but could occur if weights are added. For this reason, Eppstein’s algorithm works better as well, since it considers loops. This other answer includes a link to a good explanation of Eppstein’s algorithm.

回答4:

Even though, the question has a valid accepted answer, this answer addresses the problem of implementation by providing a sample working code. Find the valid code for kth shortest path here:

// Time complexity: O(Vk(V*logV + E))

using namespace std;

typedef long long int ll;
typedef short int i16;
typedef unsigned long long int u64;
typedef unsigned int u32;
typedef unsigned short int u16;
typedef unsigned char u8;

const int N = 128;

struct edge
{
    int to,w;
    edge(){}
    edge(int a, int b)
    {
        to=a;
        w=b;
    }
};

struct el
{
    int vertex,cost;
    el(){}
    el(int a, int b)
    {
        vertex=a;
        cost=b;
    }
    bool operator<(const el &a) const
    {
        return cost>a.cost;
    }
};

priority_queue <el> pq;

vector <edge> v[N];

vector <int> dist[N];

int n,m,q;

void input()
{
    int i,a,b,c;
    for(i=0;i<N;i++)
        v[i].clear();
    for(i=1;i<=m;i++)
    {
        scanf("%d %d %d", &a, &b, &c);
        a++;
        b++;
        v[a].push_back(edge(b,c));
        v[b].push_back(edge(a,c));
    }
}

void Dijkstra(int starting_node, int ending_node)
{
    int i,current_distance;
    el curr;
    for(i=0;i<N;i++)
        dist[i].clear();
    while(!pq.empty())
        pq.pop();
    pq.push(el(starting_node,0));
    while(!pq.empty())
    {
        curr=pq.top();
        pq.pop();
        if(dist[curr.vertex].size()==0)
            dist[curr.vertex].push_back(curr.cost);
        else if(dist[curr.vertex].back()!=curr.cost)
            dist[curr.vertex].push_back(curr.cost);
        if(dist[curr.vertex].size()>2)
            continue;
        for(i=0;i<v[curr.vertex].size();i++)
        {
            if(dist[v[curr.vertex][i].to].size()==2)
                continue;
            current_distance=v[curr.vertex][i].w+curr.cost;
            pq.push(el(v[curr.vertex][i].to,current_distance));
        }
    }
    if(dist[ending_node].size()<2)
        printf("?\n");
    else
        printf("%d\n", dist[ending_node][1]);
}

void solve()
{
    int i,a,b;
    scanf("%d", &q);
    for(i=1;i<=q;i++)
    {
        scanf("%d %d", &a, &b);
        a++;
        b++;
        Dijkstra(a,b);
    }
}

int main()
{
    int i;
    for(i=1;scanf("%d %d", &n, &m)!=EOF;i++)
    {
        input();
        printf("Set #%d\n", i);
        solve();
    }
    return 0;
}