Minimizing time in transit

[Updates at bottom (including solution source code)]

I have a challenging business problem that a computer can help solve.

Along a mountainous region flows a long winding river with strong currents. Along certain parts of the river are plots of environmentally sensitive land suitable for growing a particular type of rare fruit that is in very high demand. Once field laborers harvest the fruit, the clock starts ticking to get the fruit to a processing plant. It's very costly to try and send the fruits upstream or over land or air. By far the most cost effective mechanism to ship them to the plant is downstream in containers powered solely by the river's constant current. We have the capacity to build 10 processing plants and need to locate these along the river to minimize the total time the fruits spend in transit. The fruits can take however long before reaching the nearest downstream plant but that time directly hurts the price at which they can be sold. Effectively, we want to minimize the sum of the distances to the nearest respective downstream plant. A plant can be located as little as 0 meters downstream from a fruit access point.

The question is: In order to maximize profits, how far up the river should we build the 10 processing plants if we have found 32 fruit growing regions, where the regions' distances upstream from the base of the river are (in meters): 10, 40, 90, 160, 250, 360, 490, ... (n^2)*10 ... 9000, 9610, 10320?

[It is hoped that all work going towards solving this problem and towards creating similar problems and usage scenarios can help raise awareness about and generate popular resistance towards the damaging and stifling nature of software/business method patents (to whatever degree those patents might be believed to be legal within a locality).]

UPDATES

Update1: Forgot to add: I believe this question is a special case of this one.

Update2: One algorithm I wrote gives an answer in a fraction of a second, and I believe is rather good (but it's not yet stable across sample values). I'll give more details later, but the short is as follows. Place the plants at equal spacings. Cycle over all the inner plants where at each plant you recalculate its position by testing every location between its two neighbors until the problem is solved within that space (greedy algorithm). So you optimize plant 2 holding 1 and 3 fixed. Then plant 3 holding 2 and 4 fixed... When you reach the end, you cycle back and repeat until you go a full cycle where every processing plant's recalculated position stops varying.. also at the end of each cycle, you try to move processing plants that are crowded next to each other and are all near each others' fruit dumps into a region that has fruit dumps far away. There are many ways to vary the details and hence the exact answer produced. I have other candidate algorithms, but all have glitches. [I'll post code later.] Just as Mike Dunlavey mentioned below, we likely just want "good enough".

To give an idea of what might be a "good enough" result:

10010 total length of travel from 32 locations to plants at 
{10,490,1210,1960,2890,4000,5290,6760,8410,9610}

Update3: mhum gave the correct exact solution first but did not (yet) post a program or algorithm, so I wrote one up that yields the same values.

/************************************************************
This program can be compiled and run (eg, on Linux):
$ gcc -std=c99 processing-plants.c -o processing-plants
$ ./processing-plants
************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//a: Data set of values. Add extra large number at the end

int a[]={
10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240,99999
};

//numofa: size of data set

int numofa=sizeof(a)/sizeof(int);

//a2: will hold (pt to) unique data from a and in sorted order.

int *a2;

//max: size of a2

int max;

//num_fixed_loc: at 10 gives the solution for 10 plants

int num_fixed_loc;

//xx: holds index values of a2 from the lowest error winner of each cycle memoized. accessed via memoized offset value. Winner is based off lowest error sum from left boundary upto right ending boundary.
//FIX: to be dynamically sized.

int xx[1000000];

//xx_last: how much of xx has been used up

int xx_last=0;

//SavedBundle: data type to "hold" memoized values needed (total traval distance and plant locations) 

typedef struct _SavedBundle {
    long e;
    int xx_offset;
} SavedBundle;

//sb: (pts to) lookup table of all calculated values memoized

SavedBundle *sb;  //holds winning values being memoized

//Sort in increasing order.

int sortfunc (const void *a, const void *b) {
    return (*(int *)a - *(int *)b);
}

/****************************
Most interesting code in here
****************************/

long full_memh(int l, int n) {
    long e;
    long e_min=-1;
    int ti;

    if (sb[l*max+n].e) {
        return sb[l*max+n].e;  //convenience passing
    }
    for (int i=l+1; i<max-1; i++) {
        e=0;
        //sum first part
        for (int j=l+1; j<i; j++) {
            e+=a2[j]-a2[l];
        }
        //sum second part
        if (n!=1) //general case, recursively
            e+=full_memh(i, n-1);
        else      //base case, iteratively
            for (int j=i+1; j<max-1; j++) {
                e+=a2[j]-a2[i];
            }
        if (e_min==-1) {
            e_min=e;
            ti=i;
        }
        if (e<e_min) {
            e_min=e;
            ti=i;
        }
    }
    sb[l*max+n].e=e_min;
    sb[l*max+n].xx_offset=xx_last;
    xx[xx_last]=ti;      //later add a test or a realloc, etc, if approp
    for (int i=0; i<n-1; i++) {
        xx[xx_last+(i+1)]=xx[sb[ti*max+(n-1)].xx_offset+i];
    }
    xx_last+=n;
    return e_min;
}

/*************************************************************
Call to calculate and print results for given number of plants
*************************************************************/

int full_memoization(int num_fixed_loc) {
    char *str;
    long errorsum;  //for convenience

    //Call recursive workhorse
    errorsum=full_memh(0, num_fixed_loc-2);
    //Now print
    str=(char *) malloc(num_fixed_loc*20+100);
    sprintf (str,"\n%4d %6d {%d,",num_fixed_loc-1,errorsum,a2[0]);
    for (int i=0; i<num_fixed_loc-2; i++)
        sprintf (str+strlen(str),"%d%c",a2[ xx[ sb[0*max+(num_fixed_loc-2)].xx_offset+i ] ], (i<num_fixed_loc-3)?',':'}');
    printf ("%s",str);
    return 0;
}

/**************************************************
Initialize and call for plant numbers of many sizes
**************************************************/

int main (int x, char **y) {
    int t;
    int i2;

    qsort(a,numofa,sizeof(int),sortfunc);
    t=1;
    for (int i=1; i<numofa; i++)
        if (a[i]!=a[i-1])
            t++;
    max=t;
    i2=1;
    a2=(int *)malloc(sizeof(int)*t);
    a2[0]=a[0];
    for (int i=1; i<numofa; i++)
        if (a[i]!=a[i-1]) {
            a2[i2++]=a[i];
        }
    sb = (SavedBundle *)calloc(sizeof(SavedBundle),max*max);
    for (int i=3; i<=max; i++) {
        full_memoization(i);
    }
    free(sb);
    return 0;
}

标签： algorithm optimization shortest-path

2条回答

甜甜的少女心

2楼-- · 2019-05-25 05:51

Let me give you a simple example of a Metropolis-Hastings algorithm. Suppose you have a state vector x, and a goodness-of-fit function P(x), which can be any function you care to write.

Suppose you have a random distribution Q that you can use to modify the vector, such as x' = x + N(0, 1) * sigma, where N is a simple normal distribution about 0, and sigma is a standard deviation of your choosing.

p = P(x);
for (/* a lot of iterations */){
  // add x to a sample array
  // get the next sample
  x' = x + N(0,1) * sigma;
  p' = P(x');
  // if it is better, accept it
  if (p' > p){
    x = x';
    p = p';
  }
  // if it is not better
  else {
    // maybe accept it anyway
    if (Uniform(0,1) < (p' / p)){
      x = x';
      p = p';
    }
  }
}

Usually it is done with a burn-in time of maybe 1000 cycles, after which you start collecting samples. After another maybe 10,000 cycles, the average of the samples is what you take as an answer.

It requires diagnostics and tuning. Typically the samples are plotted, and what you are looking for is a "fuzzy caterpilar" plot that is stable (doesn't move around much) and has a high acceptance rate (very fuzzy). The main parameter you can play with is sigma. If sigma is too small, the plot will be fuzzy but it will wander around. If it is too large, the plot will not be fuzzy - it will have horizontal segments. Often the starting vector x is chosen at random, and often multiple starting vectors are chosen, to see if they end up in the same place.

It is not necessary to vary all components of the state vector x at the same time. You can cycle through them, varying one at a time, or some such method.

Also, if you don't need the diagnostic plot, it may not be necessary to save the samples, but just calculate the average and variance on the fly.

In the applications I'm familiar with, P(x) is a measure of probability, and it is typically in log-space, so it can vary from 0 to negative infinity. Then to do the "maybe accept" step it is (exp(logp' - logp))

0人赞添加讨论(0) 举报

啃猪蹄的小仙女

3楼-- · 2019-05-25 05:52

Unless I've made an error, here are exact solutions (obtained through a dynamic programming approach):

N  Dist  Sites
2  60950 {10,4840}
3  40910 {10,2890,6760}
4  30270 {10,2250,4840,7840}
5  23650 {10,1690,3610,5760,8410}
6  19170 {10,1210,2560,4410,6250,8410}
7  15840 {10,1000,2250,3610,5290,7290,9000}
8  13330 {10,810,1960,3240,4410,5760,7290,9000}
9  11460 {10,810,1690,2890,4000,5290,6760,8410,9610}
10 9850  {10,640,1440,2250,3240,4410,5760,7290,8410,9610}
11 8460  {10,640,1440,2250,3240,4410,5290,6250,7290,8410,9610}
12 7350  {10,490,1210,1960,2890,3610,4410,5290,6250,7290,8410,9610}
13 6470  {10,490,1000,1690,2250,2890,3610,4410,5290,6250,7290,8410,9610}
14 5800  {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,10240}
15 5190  {10,360,810,1440,1960,2560,3240,4000,4840,5760,6760,7840,9000,9610,10240}
16 4610  {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,8410,9000,9610,10240}
17 4060  {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,7290,7840,8410,9000,9610,10240}
18 3550  {10,360,810,1210,1690,2250,2890,3610,4410,5290,6250,6760,7290,7840,8410,9000,9610,10240}
19 3080  {10,360,810,1210,1690,2250,2890,3610,4410,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
20 2640  {10,250,640,1000,1440,1960,2560,3240,4000,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
21 2230  {10,250,640,1000,1440,1960,2560,3240,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
22 1860  {10,250,640,1000,1440,1960,2560,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
23 1520  {10,250,490,810,1210,1690,2250,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
24 1210  {10,250,490,810,1210,1690,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
25 940   {10,250,490,810,1210,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
26 710   {10,160,360,640,1000,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
27 500   {10,160,360,640,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
28 330   {10,160,360,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
29 200   {10,160,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
30 100   {10,90,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
31 30    {10,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}
32 0     {10,40,90,160,250,360,490,640,810,1000,1210,1440,1690,1960,2250,2560,2890,3240,3610,4000,4410,4840,5290,5760,6250,6760,7290,7840,8410,9000,9610,10240}

0人赞添加讨论(0) 举报

Minimizing time in transit

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间