Algorithm to simplify a weighted directed graph of

2019-01-14 08:14发布

问题:

I've been using a little python script I wrote to manage debt amongst my roommates. It works, but there are some missing features, one of which is simplifying unnecessarily complicated debt structures. For example, if the following weighted directed graph represents some people and the arrows represent debts between them (Alice owes Bob $20 and Charlie $5, Bob owes Charlie $10, etc.):

It is clear that this graph should be simplified to the following graph:

There's no sense in $10 making its way from Alice to Bob and then from Bob to Charlie if Alice could just give it to Charlie directly.

The goal, then, in the general case is to take a debt graph and simplify it (i.e. produce a new graph with the same nodes but different edges) such that

  1. No node has edges pointing both in and out of it (no useless money changing hands)
  2. All nodes have the same "flow" through them as they did in the original graph (it is identical in terms of where the money ends up).

By "flow", I mean the value of all inputs minus all outputs (is there a technical term for this? I am no graph theory expert). So in the example above, the flow values for each node are:

  • Bob: +10
  • Alice: -25
  • Charlie: +15

You can see that the first and second graphs have the same flow through each node, so this is a good solution. There are some other easy cases, for example, any cycle can be simplified by removing the lowest valued edge and subtracting its value from all other edges.

This:

should be simplified into this:

I can't imagine that no one has studied this problem; I just don't know what terms to search for to find info on it (again, not a graph theory expert). I've been looking for several hours to no avail, so my question is this: what is an algorithm that will produce a simplification (new graph) according to the conditions specified above for any weighted directed graph?

回答1:

Simple algorithm

You can find in O(n) how much money who is expecting to get or pay. So you could simply create two lists, one for debit and the other for credit, and then balance the head of the two lists until they are empty. From your first example:

  • Initial state: Debit: (A: 25), Credit: (B: 15, C: 10)
  • First transaction, A:15 -> B: Debit: (A: 10), Credit: (C: 10)
  • Second transaction, A:10 -> C: Debit: (), Credit: ()

The transactions define the edges of your graph. For n persons involved, there will be at most n-1 transactions=edges. In the beginning, the total length of both lists is n. In each step, at least one of the lists (debit/credit) gets shorter by one, and in the last both lists disappear at once.

The issue is that, in general, this graph doesn't have to be similar to the original graph, which, as I get your intention, is a requirement. (Is it? There are cases where the optimal solution consists of adding new edges. Imagine A owing B and B owing C the same amount of money, A should pay C directly but this edge is not in the graph of debts.)

Less transactions

If the goal is just to construct an equivalent graph, you could search the creditor and debitor lists (as in the section above) for exact matches, or for cases where the sum of credit matches the debit of one person (or the other way round). Look for bin packing. For other cases you will have no other choice than splitting the flows, but even the simple algorithm above produces a graph which has one fewer edge than there are persons involved -- at most.

EDIT: Thanks to j_random_hacker for pointing out that a solution with less than n-1 edges is possible iff there is a group of persons whose total debts matches the credit of another group of persons: Then, the problem can be split into two subproblems with a total cost of n-2 edges for the transaction graph. Unfortunately, the subset sum problem is NP-hard.

A flow problem?

Perhaps this also can be transformed to a min-cost flow problem. If you want just to simplify your original graph, you construct a flow on it, the edge capacities are the original amounts of debit/credit. The debitors serve as inflow nodes (through a connector node which serves all debitors with edges of capacity that equals their total debt), the creditors are used as outflow nodes (with a similar connector node).

If you want to minimize the number of transactions, you will prefer keeping the "big" transactions and reducing the "small" ones. Hence, the cost of each edge could be modeled as the inverse of the flow on that edge.



回答2:

Here is an academic paper which investigates this problem in great detail. There is also some sample code for the different algorithms in Section 8 towards the end.

Verhoeff, T. (2004). Settling multiple debts efficiently : an invitation to computing science. Informatics in Education, 3(1), 105-126.



回答3:

I've actually encountered this problem in exactly the same situation as you :)

I think krlmlr's various solution don't quite solve the problem exactly. I'll have a think about how to solve it exactly (in the minimum-edges sense), but in the meantime, a practical alternative solution to your problem is to invent a new person, Steve:

  1. Steve is not actually a person. Steve is just a bucket, with a piece of paper attached to it.
  2. Everyone calculates the net amount that they owe (or are owed, if negative), and writes it on the piece of paper, alongside their name.
  3. Anyone whose net position is that they owe money gives that net amount of money to Steve when they can, and crosses off their name.
  4. Everyone whose net position is that they are owed money takes that money from Steve when they see Steve has it, and crosses off their name.

If a person who owes money can't pay all of it at once, they can just give Steve what they can currently afford, and take this amount off the total-owing figure against their name. Likewise if you are owed more money than Steve currently has on hand, you can take all of the money he currently has, and take that amount off the total-owed against your name.

If everyone agrees at the start to pay Steve only the full amount, then every net-ower makes exactly one deposit, and every net-owed person make exactly one withdrawal (although this may require multiple checks on Steve to see whether he currently has sufficient cash on hand). The good thing about Steve is that he's always around, and is never too busy to sort out finances. Unfortunately he's very gullible, so Alice, Bob and Charlie need to already trust one another not to take advantage of him.