Constraint Programming: Scheduling with multiple w

2019-05-09 18:58发布

I'm new to constraint programming. I imagine this is an easy problem but I can't wrap my head around it. Here's the problem:

  • We have multiple machines (N), each with a limited resource (let's say memory, and it can be the same for all machines.)
  • We have T tasks, each with a duration and each requiring some amount of the resource.
  • A machine can work on multiple tasks at the same time as long as its resource isn't exceeded.
  • A task cannot be split among machines and it has to be done in one shot (i.e. no pausing).

How do we assign the tasks to the machines to minimize the end-time or the number of machines used?

It seems like I should be able to achieve this with the cumulative predicate but it seems like it's limited to scheduling one set of tasks to one worker with a limited global resource rather than a variable number of workers.

I'm just learning CP & MiniZinc. Any ideas on how to generalize cumulative? Alternatively, is there an existing MiniZinc model I can understand that does something like this (or close enough?)

Thanks,

PS: I don't have any concrete data since this is a hypothetical/learning exercise for the most part. Imagine you have 10 machines and 10 tasks with various durations (in hours): 2,4,6,5,2,1,4,6,3,2,12 with memory requirements (GBs): 1,2,4,2,1,8,12,4,1,10. Each machine has 32 GBs of ram.

2条回答
Viruses.
2楼-- · 2019-05-09 19:27

This is an old question, but here is a CP Optimizer model for this problem (in Python). In this version, I minimize a lexicographical objective : first minimize the makespan (optimal value is 12), then given this makespan, minimize the number of used machines (here, one can execute all the tasks on a single machine and still finish at 12).

DUR = [2,4,6,5,2,1,4,6,3,12]
MEM = [1,2,4,2,1,8,12,4,1,10]
CAP = 32
TASKS    = range(len(DUR))
MACHINES = range(10)

from docplex.cp.model import *
model = CpoModel()

# Decision variables: tasks and alloc
task  = [interval_var(size=DUR[i]) for i in TASKS]
alloc = [ [interval_var(optional=True) for j in MACHINES] for i in TASKS]

# Objective terms
makespan  = max(end_of(task[i]) for i in TASKS)
nmachines = sum(max(presence_of(alloc[i][j]) for i in TASKS) for j in MACHINES)

# Objective: minimize makespan, then number of machine used
model.add(minimize_static_lex([makespan, nmachines])) 

# Allocation of tasks to machines
model.add([alternative(task[i], [alloc[i][j] for j in MACHINES]) for i in TASKS])

# Machine capacity
model.add([sum(pulse(alloc[i][j],MEM[i]) for i in TASKS) <= CAP  for j in MACHINES])

# Resolution
sol = model.solve(trace_log=True)

# Display solution
for i in TASKS:
    for j in MACHINES:
        s = sol.get_var_solution(alloc[i][j])
        if s.is_present():
            print('Task ' + str(i) + ' scheduled on machine ' + str(j) + ' on [' + str(s.get_start()) + ',' +  str(s.get_end()) + ')')

And the result is:

! ----------------------------------------------------------------------------
! Minimization problem - 110 variables, 20 constraints
! Initial process time : 0.00s (0.00s extraction + 0.00s propagation)
!  . Log search space  : 66.4 (before), 66.4 (after)
!  . Memory usage      : 897.0 kB (before), 897.0 kB (after)
! Using parallel search with 8 workers.
! ----------------------------------------------------------------------------
!          Best Branches  Non-fixed    W       Branch decision
                    0        110                 -
+ New bound is 12; 0
*            12      111  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 7
*            12      131  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 6
*            12      151  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 5
*            12      171  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 4
*            12      191  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 3
*            12      211  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 2
*            12      231  0.01s        1      (gap is 100.0% @ crit. 2 of 2)
  New objective is 12; 1
! ----------------------------------------------------------------------------
! Search completed, 7 solutions found.
! Best objective         : 12; 1 (optimal)
! Best bound             : 12; 1
! ----------------------------------------------------------------------------
! Number of branches     : 1318
! Number of fails        : 40
! Total memory usage     : 6.7 MB (6.6 MB CP Optimizer + 0.1 MB Concert)
! Time spent in solve    : 0.00s (0.00s engine + 0.00s extraction)
! Search speed (br. / s) : 131800.0
! ----------------------------------------------------------------------------

Task 0 scheduled on machine 4 on [4,6)
Task 1 scheduled on machine 4 on [4,8)
Task 2 scheduled on machine 4 on [1,7)
Task 3 scheduled on machine 4 on [0,5)
Task 4 scheduled on machine 4 on [4,6)
Task 5 scheduled on machine 4 on [0,1)
Task 6 scheduled on machine 4 on [0,4)
Task 7 scheduled on machine 4 on [1,7)
Task 8 scheduled on machine 4 on [4,7)
Task 9 scheduled on machine 4 on [0,12)
查看更多
forever°为你锁心
3楼-- · 2019-05-09 19:49

Here's a model that seems to be correct. However, it don't use "cumulative" at all since I wanted to visualize as much as possible (see below).

The main idea is that - for each time step, 1..max_step - each machine must have only tasks <= 32 GB. The foreach loop checks - for each machine - that the sum of memory of each task that is active at that time on that machine is below 32GB.

The output section shows the solution in different ways. See comments below.

The model is a slightly edited version of http://hakank.org/minizinc/scheduling_with_multiple_workers.mzn

Update: I should also mention that this model allows for different size of RAM on the machines, e.g. some machines have 64GB and some 32GB. This is demonstrated - but commented - in the model on my site. Since the model use value_precede_chain/2 - which ensures that the machines are used in order - it's recommended that the machines are ordered of decreasing size of RAM (so the bigger machines are used first).

(Also, I've modeled the problem in Picat: http://hakank.org/picat/scheduling_with_multiple_workers.pi )

include "globals.mzn"; 
int: num_tasks = 10; 
int: num_machines = 10;

array[1..num_tasks] of int: duration = [2,4,6,5,2,1,4,6,3,12]; % duration of tasks

array[1..num_tasks] of int: memory   = [1,2,4,2,1,8,12,4,1,10];  % RAM requirements (GB)
int: max_time = 30; % max allowed time

% RAM for each machine (GB)
array[1..num_machines] of int: machines_memory = [32 | i in  1..num_machines];


% decision variables
array[1..num_tasks] of var 1..max_time: start_time; % start time for each task
array[1..num_tasks] of var 1..max_time: end_time;   % end time for each task
array[1..num_tasks] of var 1..num_machines: machine; % which machine to use
array[1..num_machines,1..max_time] of var 0..max(machines_memory): machine_used_ram;

var 1..num_machines: machines_used = max(machine);
var 1..max_time: last_time = max(end_time);

% solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete)  minimize last_time;
solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete) minimize machines_used;

constraint
  forall(t in 1..num_tasks) (
    end_time[t] = start_time[t] + duration[t] -1
  ) 
  % /\ cumulative(start_time,duration,[1 | i in  1..num_tasks],machines_used)
  /\
  forall(m in 1..num_machines) (
     % check all the times when a machine is used
     forall(tt in 1..max_time) (
        machine_used_ram[m,tt] = sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) /\ 
        machine_used_ram[m,tt] <= machines_memory[m]

        % sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) <= machines_memory[m]
     )

  )

  %  ensure that machine m is used before machine m+1 (for machine_used)
  /\ value_precede_chain([i | i in 1..num_machines],machine)
;

output [
  "start_time: \(start_time)\n",
  "durations : \(duration)\n",
  "end_time  : \(end_time)\n",
  "memory    : \(memory)\n",
  "last_time : \(last_time)\n",
  "machine   : \(machine)\n",
  "machines_used: \(machines_used)\n",
]
++
[ "Machine memory per time:\n    "]
++
[ show_int(3,tt) | tt in 1..max_time ]
++
[
  if tt = 1 then "\n" ++ "M" ++ show_int(2, m) ++ ": "  else " " endif ++
 show_int(2,machine_used_ram[m,tt])
  | m in 1..num_machines, tt in 1..max_time
]
++ ["\n\nTime / task: machine(task's memory)\n  Task "] ++
[
  show_int(7,t)
  | t in 1..num_tasks
]
++ 
[
  if t = 1 then "\nTime " ++ show_int(2,tt) ++ " " else " " endif ++
    if tt in fix(start_time[t])..fix(end_time[t]) then
      show_int(2,fix(machine[t])) ++ "(" ++ show_int(2,memory[t]) ++    ")"
    else 
       "      " 
    endif 
   | tt in 1..fix(last_time), t in 1..num_tasks
 ] 
;

The model has two "modes": one to minimize the time ("minimize last_time") and one to minimize the number of machine used ("minimize machines_used").

The result of minimizing the time is:

start_time: [11, 8, 3, 8, 11, 8, 9, 7, 8, 1]
durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
end_time  : [12, 11, 8, 12, 12, 8, 12, 12, 10, 12]
memory    : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
last_time : 12
machine   : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
machines_used: 1
Machine memory per time:
       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
M 1: 10 10 14 14 14 14 18 31 31 31 32 30  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 2:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 3:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 4:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 5:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 6:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 7:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 8:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M 9:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M10:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

 Time / task: machine(task's memory)
  Task       1      2      3      4      5      6      7      8      9     10
 Time  1                                                                 1(10)
 Time  2                                                                 1(10)
 Time  3                1( 4)                                            1(10)
 Time  4                1( 4)                                            1(10)
 Time  5                1( 4)                                            1(10)
 Time  6                1( 4)                                            1(10)
 Time  7                1( 4)                              1( 4)         1(10)
 Time  8         1( 2)  1( 4)  1( 2)         1( 8)         1( 4)  1( 1)  1(10)
 Time  9         1( 2)         1( 2)                1(12)  1( 4)  1( 1)  1(10)
 Time 10         1( 2)         1( 2)                1(12)  1( 4)  1( 1)  1(10)
 Time 11  1( 1)  1( 2)         1( 2)  1( 1)         1(12)  1( 4)         1(10)
 Time 12  1( 1)                1( 2)  1( 1)         1(12)  1( 4)         1(10)
 ----------
 ==========

The first part "Machine memory per time" shows how loaded each machine (1..10) is per time step (1..30). The second part "Time / task: machine(task's memory)" shows for each time step (rows) and tasks (columns) which machine that is used and the memory of the task in the form "machine(memory of the machine)".

The second way of using the model, to minimize the number of used machines, give this result (edited to save space). I.e. one machine is enough for handling all the tasks during the allowed time (1..22 time steps).

 start_time: [19, 11, 3, 9, 20, 22, 13, 7, 17, 1]
 durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
 end_time  : [20, 14, 8, 13, 21, 22, 16, 12, 19, 12]
 memory    : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
 last_time : 22
 machine   : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
 machines_used: 1
 Machine memory per time:
       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
 M 1: 10 10 14 14 14 14 18 18 16 16 18 18 16 14 12 12  1  1  2  2  1  8  0  0  0  0  0  0  0  0
 M 2:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 ....
 Time / task: machine(task's memory)
   Task       1      2      3      4      5      6      7      8      9     10
 Time  1                                                                 1(10)
 Time  2                                                                 1(10)
 Time  3                1( 4)                                            1(10)
 Time  4                1( 4)                                            1(10)
 .....
 ----------
 ==========
查看更多
登录 后发表回答