minimum weight vertex cover of a tree

2020-03-04 08:33发布

问题:

There's an existing question dealing with trees where the weight of a vertex is its degree, but I'm interested in the case where the vertices can have arbitrary weights.

This isn't homework but it is one of the questions in the algorithm design manual, which I'm currently reading; an answer set gives the solution as

  1. Perform a DFS, at each step update Score[v][include], where v is a vertex and include is either true or false;
  2. If v is a leaf, set Score[v][false] = 0, Score[v][true] = wv, where wv is the weight of vertex v.
  3. During DFS, when moving up from the last child of the node v, update Score[v][include]: Score[v][false] = Sum for c in children(v) of Score[c][true] and Score[v][true] = wv + Sum for c in children(v) of min(Score[c][true]; Score[c][false])
  4. Extract actual cover by backtracking Score.

However, I can't actually translate that into something that works. (In response to the comment: what I've tried so far is drawing some smallish graphs with weights and running through the algorithm on paper, up until step four, where the "extract actual cover" part is not transparent.)

In response Ali's answer: So suppose I have this graph, with the vertices given by A etc. and the weights in parens after:

A(9)---B(3)---C(2) \ \ E(1) D(4)

The right answer is clearly {B,E}.

Going through this algorithm, we'd set values like so:

  • score[D][false] = 0; score[D][true] = 4
  • score[C][false] = 0; score[C][true] = 2
  • score[B][false] = 6; score[B][true] = 3
  • score[E][false] = 0; score[E][true] = 1
  • score[A][false] = 4; score[A][true] = 12

Ok, so, my question is basically, now what? Doing the simple thing and iterating through the score vector and deciding what's cheapest locally doesn't work; you only end up including B. Deciding based on the parent and alternating also doesn't work: consider the case where the weight of E is 1000; now the correct answer is {A,B}, and they're adjacent. Perhaps it is not supposed to be confusing, but frankly, I'm confused.

回答1:

It actually didnt mean any thing confusing and it is just Dynamic Programming, you seems to almost understand all the algorithm. If I want to make it any more clear, I have to say:

  1. first preform DFS on you graph and find leafs.
  2. for every leaf assign values as the algorithm says.
  3. now start from leafs and assign values to each leaf parent by that formula.
  4. start assigning values to parent of nodes that already have values until you reach the root of your graph.

That is just it, by backtracking in your algorithm it means that you assign value to each node that its child already have values. As I said above this kind of solving problem is called dynamic programming.

Edit just for explaining your changes in the question. As you you have the following graph and answer is clearly B,E but you though this algorithm just give you B and you are incorrect this algorithm give you B and E.

A(9)---B(3)---C(2) \ \ E(1) D(4)

score[D][false] = 0; score[D][true] = 4
score[C][false] = 0; score[C][true] = 2
score[B][false] = 6 this means we use C and D; score[B][true] = 3 this means we use B
score[E][false] = 0; score[E][true] = 1
score[A][false] = 4 This means we use B and E; score[A][true] = 12 this means we use B and A.

and you select 4 so you must use B and E. if it was just B your answer would be 3. but as you find it correctly your answer is 4 = 3 + 1 = B + E.

Also when E = 1000

A(9)---B(3)---C(2) \ \ E(1000) D(4)

it is 100% correct that the answer is B and A because it is wrong to use E just because you dont want to select adjacent nodes. with this algorithm you will find the answer is A and B and just by checking you can find it too. suppose this covers :

C D A = 15
C D E = 1006
A B = 12

Although the first two answer have no adjacent nodes but they are bigger than last answer that have adjacent nodes. so it is best to use A and B for cover.



回答2:

Backtracking is what's done in the previous steps. Step 4 has no special meaning, it could as well say "figure out what to do with Score".

The cover vertex of a tree allows to include alternated and adjacent vertices. It does not allow to exclude two adjacent vertices, because it must contain all of the edges.

The answer is given in the way the Score is recursively calculated. The cost of not including a vertex, is the cost of including its children. However, the cost of including a vertex is whatever is less costly, the cost of including its children or not including them, because both things are allowed.

As your solution suggests, it can be done with DFS in post-order, in a single pass. The trick is to include a vertex if the Score says it must be included, and include its children if it must be excluded, otherwise we'd be excluding two adjacent vertices.

Here's some pseudocode:

find_cover_vertex_of_minimum_weight(v)
  find_cover_vertex_of_minimum_weight(left children of v)
  find_cover_vertex_of_minimum_weight(right children of v)
  Score[v][false] = Sum for c in children(v) of Score[c][true]
  Score[v][true] = v weight + Sum for c in children(v) of min(Score[c][true]; Score[c][false])
  if Score[v][true] < Score[v][false] then
    add v to cover vertex tree
  else
    for c in children(v)
      add c to cover vertex tree