A friend asked me a question today about an assignment problem. I found a quite straightforward solution, but I feel that it can be made simpler and faster. Your help would be appreciated.
The problem: Assuming that I have N people, I need to assign them into M buildings, each building can house K people. Not all people are willing to live with each other, so i have a matrix of N*N cells and a 1 that marks the people that are willing to live with each other. If a cell contains 1 it means that I and J can live together. Obviously the matrix is symmetrical around the main diagonal.
My solution is as follows (pseudo Code):
int[] Match(int[] people, int[][] pairs, int numBuildings, int buildingsSize)
{
int[] freePeople = findFreePeople(people);
if(freePeople) = 0
{
return people;
}
foreach(int person in freePeople)
{
for(int buildingIndex=0 to numBuildings)
{
if( CheckIfPersonFitsInBuilding(...) )
{
int[] tempPeople = people.Copy();
tempPeople[person] = buildingIndex;
int[] result = Match(tempPeople,pairs,numBuildings,buildingsSize);
if(result != null)
{
return result;
}
}
}
}
return null;
}
I just cover all the possible arrangements using recursion. I feel this could be optimized greatly, and I'm not talking about heuristics but a solution with far lesser complexity.
- Is there a formal well known problem that is similar to this?
- Is there a better algorithm?
I think that this might be related to the stable marriage problem, though I'm not sure.
A nice way to solve these problems is to use a constraint solver for finite domains.
For instance, using GNU-Prolog:
From a complexity point of view, this continues to be NP-complete. But the constraint solver may reorder the way the assignments are performed on the labeling step so that dead ends are found early resulting in an smaller search tree.
This problem is known to be NP-hard by a reduction from the NP-complete problem of decomposing a graph into cliques (the clique cover problem).
First, some terminology and discussion. A clique in a graph is a set of k different nodes such that each node is connected to each other node in the clique. An independent set in the graph is a set of k different nodes such that no two nodes are connected to one another. If you take any graph G and then introduce edges whenever an edge is missing and remove all edges that previously existed, you get the complement graph of the original graph. This means that the problem of finding a clique in an initial graph is equivalent to finding an independent set in the complement graph.
The reason this is interesting is that in the problem you're describing, you are given a graph of people where each edge indicates "these two people can't be housed together." Consequently, you can think of the problem you're describing as trying to find a way to break the graph up into k subgraphs, each of which is an independent set. If you construct the complement of this graph, you get the graph of all people who are okay being placed together. In that case, you want to break the group up into k groups that are all cliques.
It is known that the following problem is NP-complete:
We can reduce this problem to your problem in polynomial time, showing that your problem is NP-hard. The construction is simple - given the initial graph G and number k, first construct the complement of G. This means that if you can break apart the complement into k independent sets, then the original graph G can be broken apart into k cliques (because of the duality between cliques and independent sets). Now, take this new graph and convert it into a set of people, one per node, where two people cannot be placed into the same room as one another if their nodes were connected in the original graph. Now, there is a way to distribute these people into k houses iff the complement of G can be broken down into k independent sets iff G can be broken down into k cliques. Consequently, the known NP-complete problem of clique cover can be reduced to your problem in polynomial time, so your problem is NP-hard. To ensure that any independent set can be fit into a house, just set the maximum capacity of each room to n, the number of nodes in the graph. This allows any independent set to be housed in any room.
Since your problem is NP-hard, there cannot be a polynomial-time solution to it unless P = NP. Consequently, as best as anyone knows, there is no polynomial time algorithm for it and your backtracking recursion very well might be the optimal solution.
Hope this helps!
templatetypedef gave a very nice proof of why the problem is NP-hard and has no (known) polynomial time solution.
However as with many NP-hard problems, that does not mean you cannot solve it efficiently in practice or that you cannot improve upon a brute force solution.
In particular, you should look into constraint satisfaction problems. They are a more general class of problems that describe precisely what you are trying to solve: Given a set of constraints, what are the values that satisfy all the constraints?
The book AIMA has a very nice chapter on how to solve these types of problems. I suggest you read up on that as there is a lot of good information there and it's very accessible as the book was designed for the undergraduate level. I'll give you some of the big ideas here:
The major questions are:
Here are two heuristics for this:
For the MRV heuristic, break ties by choosing the student that has the most constraints on the other students. The idea behind these heuristics is that you want to pick search paths that are most likely to succeed (LCV). But given a particular search path, you want to fail as early as possible to reduce the amount of time spent on that path (MRV).
Also, instead of naive recursion with basic forward checking, you should use a more advanced search algorithm like AC-3 that looks farther ahead.
Seeing as constraint satisfaction problems are so common in many software engineering applications, there are already a ton of open libraries out there that solve can solve them efficiently. Of course, this is given that the problem you're trying to solve is for a real-world application and not a homework assignment of sorts.