I am solving a scheduling task in SWI Prolog using the CLPFD library. Since it is the first time I solve something more serious than was the sendmory I probably need some good advices from more experienced users. Let me briefly describe the domain/the task.
Domain
I have a "calendar" for a month. Everyday there are 2 for the whole day, 2 for the whole night (long 12h service). There are also, only Mon-Fri 10 more workers for 8 hours (short service).
The domain constraints are, obviously:
- There are no consecutive services (no day after night and vice versa, no short day service after night)
- On worker can serve up to 2 consecutive night services in a row
- Each worker has a limited amount of hours for a month
- There are 19 workers available
My approach is as following:
Variables
For every field in the calendar I have a variable defined:
DxD_y
wherex
is number of the day andy
is 1 or 2 for the long day serviceDxN_y
wherex
is number of the day andy
is 1 or 2 for the long night serviceDxA_y
wherex
is number of the day andy
is 0 .. 9 for the short day serviceSUM_x
wherex
is a worker number (1..19) denoting sum of hours for a worker
Each of the D
variables has a domain 1..19
. To simplify it for now, SUM_X #=< 200
for each X
.
Constraints
all_distinct()
for each variable for the same day - each worker can serve only one service/dayglobal_cardinality()
to count number of occurrences for each number 1..19 for list with short services and long services - this defines variablesLSUM_X
andSSUM_X
- number of occurrences of workerX
inL
ong orS
hort servicesSUM_X #= 12*LSUM_X + 8*SSUM_X
for each workerDxN_y #\= Dx+1D_z
- to avoid long day service after a night one- bunch of similar constraints like the one above to cover all the domain constraints
DxNy #= Dx+1Ny #==> DxNy #\= Dx+2Ny
- to avoid three consecutive night services, there are constraints each combination ofx
andy
Notes
All the variables and constraints are directly stated in the pl script. I do not use prolog predicates to generate constraint - because I have a model in a .NET application (frontend) and I can easily generate all the stuff from the .NET code into a prolog code.
I think my approach is overall good. Running the scheduler on some smaller example works well (7 days, 4 long services, 1 short service, 8 workers). Also I was able to get some valid results on the full blown case - 30 days, 19 workers, 4 long and 10 short services per day.
However, I am not completely satisfied with the current status. Let me explain why.
Questions
- I read some articles about modelling scheduling problems and some of them uses a bit different approach - introducing only boolean variables for each combination of my variable (calendar field) and worker to flag if the worker is assigned to a particular calendar field. Is this a better approach?
- If you calculate the overall amount-of-work limits and overall hour in calendar you find out that the workers are not 100% utilized. But the solver creates the solution most likely in this way:
utilize the first worker for 100% and then grab the next one
. So the SUMs in the solution appears like[200,200,200...200,160,140,80,50,0,]
. I would be glad if the workers will be more or less equally utilized. Is there some simple/efficient way how to achieve that? I considered defining somewhat like define the differences between workers and minimize it, but it sound very complex for me and I am afraid I would take ages to compute that. I uselabeling([random_variable(29)], Vars)
, but it only reorders the variables, so there are still these gaps, only in different order. Probably I want thelabeling
procedure will take the values in some other order thanup
ordown
(in some pseudo-random way). - How should I order the constraints? I think the order of constraints matters with respect to efficiency of the labeling.
- How to debug/optimize performance of labeling? I hoped solving this type of task will take some seconds or maximally a couple of minutes in case very tight conditions for sums. For example labeling with the the
bisect
option took ages.
I can provide some more code examples if needed.
That's a lot of questions, let me try to address some.
This is typically done when a MILP (Mixed Integer Linear Programming) solver is used, where higher-level concepts (such as
alldifferent
etc) have to be expressed as linear inequalities. Such formulations then usually require lots of boolean variables. Constraint Programming is more flexible here and offers more modelling choices, but unfortunately there is no simple answer, it depends on the problem. Your choice of variables influences both how hard it is to express your problem constraints, and how efficiently it solves.You already mention the idea of minimizing differences, and this is how such a balancing requirement would usually be implemented. It does not need to be complicated. If originally we have this unbalanced first solution:
then simply minimizing the maximum of list elements will already give you (in combination with the sum-constraint) a balanced solution:
You could also minimize the difference between minimum and maximum:
or even the sum of squares. [Sorry, my examples are for ECLiPSe rather than SWI/clpfd, but should show the general idea.]
You should not worry about this. While it might have some influence, it is too unpredictable and depends too much on implementation details to allow for any general recommendations. This is really the job of the solver implementer.
For realistic problems, you will often need (a) a problem-specific labeling heuristic, and (b) some variety of incomplete search. Visualization of the search tree or the search progress can help with tailoring heuristics. You can find some discussion of these issues in chapter 6 of this online course.