The Problem
I was recently asked to calculate the money owed amongst a group of people who went on a trip together and came upon an interesting problem: given that you know the amounts that each person owes another, what is a general algorithm to consolidate the debts between people so that only the minimum number of payments needs to be made? Take this as an example:
- Mike owes John 100
- John owes Rachel 200
- Mike owes Rachel 400
We can remove a payment between Mike and John by reformulating the debts like this:
- Mike owes John 0
- John owes Rachel 100
- Mike owes Rachel 500
I did the math by hand since it was easy enough, but then the programmer in me was itching to figure out a general algorithm to do it for an arbitrarily large group. This seems like a graph algorithm to me, so I'll reformulate this as a graph:
Viewed as a Graph
- The vertices are the people in the group
- The edges are directed and weighted by the amount owed. For example, an edge from Mike to Rachel with weight 500 means that Mike owes Rachel 500.
- Constraint: the net sum of weights for each node must remain unchanged.
- The goal is to find a graph with the minimum number of edges that still satisfy the constraint.
It does not suffice to just figure out the receivers and givers. While I think this strategy is on the right track it also does not ensure an algorithm to find the least possible amount of payments.
For Example,
While it is obvious that this can be done in 3 pays (A and C to D, B to E). I can't think of an efficient algorithm that will satisfy this for all problem sets.
Better Example,
If we took the greedy approach of having person D pay to F we would wind up with a sub-optimal solution as opposed to the optimal(A&D to E, B&C to F).
This problem has a lot of similarity with the Bin Packing Problem which has been proven to be NP-hard. The only difference being that we have multiple bins of varying sizes and the condition that the total space in all bins is equal to the total size of all items. This leads me to believe that the problem is likely NP-hard but with the added constraints it may be possible to solve in polynomially time.
So, I implemented this for a spreadsheet to keep track of my roommates' debt to each other. I know this is really old, but I referenced it in my solution, and it's high on Google when searching for the subject, so I wanted to post and see if anyone has any input.
My solution uses the "central bank" or "netting center" concept people mentioned here. Before running this algorithm I calculate the net profit each person is owed, which is the sum of all their credits, minus the sum of all their debts. The complexity of calculating this is dependent on the number of transactions, not the number of individuals involved.
Fundamentally the point of the algorithm is to have every individual be paid or pay out the correct amount regardless of what individual they are transferring money to. I would ideally like to do this in the fewest number of payments, however it's difficult to prove this to be the case. Note all debits and credits will sum to zero.
I was very, very verbose for part of this code. In part to communicate what I was doing, and in part to solidify the logic I am using as I go forward. Apologies if it's unreadable.
Note:
and:
also:
The sum always being 0 after someone's debt is completely settled is the basis of the recursion logic. Someone is paid/has paid, the problem gets smaller.
If this algorithm is running for n people with non-zero debts (discard people who broke even before running the algorithm) this algorithm will give at most n-1 payments to settle the debt. It's unclear if it is always an ideal payment scheme (I've yet to find a counter example). I may try to prove if the number of transactions < n-1 then a debt and a credit must be exactly equal, which this algorithm accounts for I believe.
I'm extremely interested in any errors anyone sees in this. I haven't done development in a while, never mind algorithmics, and people will be paying each other based off this. I had fun and this is an interesting, meaty question, I hope some of you are still around.
As one @Edd Barret stated, this can be solved approximately using linear programming. I have written a blog post describing this approach, along with a tiny R package that implements it.
if A, B and C each owe $1 to each of D, E and F, the "list" or "central bank" solution creates five transactions (e.g. A,B,C -$3-> D, D -$3-> E,F) whereas the naive solution results in nine transactions. However, if A owes only to D, B only to E and C only to F the central bank solution creates still five transactions (A,B,C -$1-> D, D -$1-> E,F) whereas the best solution needs only three (A -$1-> D, B -$1-> E, C -$1-> F). This shows that the "list" or "central bank" solution is not optimal in general.
The following greedy algorithm can be used to create better solutions to the problem, but they are not always optimal. Let "debt[i,j]" denote the amount of money person i owes to person j; initially this array is initialized according to the situation.
They key of this algorithm is the observation that if both A and B owe money to both C and D, instead of the four transactions required for direct payments, B can pay the net debt to A who can take care of paying back both A's and B's loans.
To see how this algorithm works, consider the case where A, B and C each own $1 to each of D, E, F:
But in the case where A owes D, B owes E and C owes F, the algorithm falls through immediately to the payment loop, resulting in the optimal number of transactions (only three) instead of five transactions which would result from the "central bank" approach.
An example of non-optimality is where A owes to D and E, B owes to E and F and C owes to F and D (assume $1 for every debt). The algorithm fails to consolidate the loans because no two payers share two common payees. This could be fixed by changing ">= 2" on the second line to ">= 1", but then the algorithm would most likely become very sensitive to the order in which the debts are collateralized.
In the world of the corporate treasury, this is known as payment or settlement netting.
Multinational corporates usually have many flows between their subsidiaries every month, often in different currencies. They can save considerable amounts by optimising the settlement of these flows. Typically a corporate will perform such an optimisation (a netting cycle) once a month. When there are multiple currencies, there are three sources of savings:
There are two ways to actually calculate the optimised settlement.
Bilateral netting is the solution well described by @AndrewShepherd on this page. However, in a cross-border implementation, this approach can have legal and administrative problems implications since different borders are being crossed each month.
Multilateral netting solves the network by adding a new subsidiary called the netting centre and re-routes all amounts through it. Compare the before and after diagrams below:
Before netting
After netting
Although this adds one more flow than is necessary (compared to bi-lateral netting), the advantages are:
(At it's basic level, the calculation is simple, but there can be many legal and administrative complications so corporates frequently develop or purchase a netting system from a software vendor or service provider.)
My opinion: You're making this overly complicated.
Think of it as a "pool" of money, and lose the relationships altogether:
Instead of:
The algorithm only has to think:
Netting this:
Separate this into a list of "givers" and "receivers". Each giver on the list will go through the list of receivers, giving each receiver what they need until the giver has payed up. When a receiver receives everything they need, they go off the list.
Later Edit
As other posters have observed, this simplifies the problem. However, there might be an optimal ordering of the "givers" and "receivers" lists, but we haven't yet identified a straightforward way to determine this ordering.