A* Admissible Heuristic for die rolling on grid

2020-02-16 09:20发布

问题:

I need some help finding a good heuristic for the following problem:

You are given an R-by-C grid and a six-sided die. Let start and end be two distinct cells on this grid. Find a path from start to end such that the sum of the faces of the die looking up, as the die is turning along the path, is minimal.

The starting orientation of the die is the following (the "2" is facing south):

The way I modeled this problem is by considering the value of the die's face as the cost of an edge in a graph. The graph's vertices are of the form (row, col, die) (i.e, a position in the grid and the current state/orientation of the die). The reason a vertex is not simply (row, col) is because you can end up on the same cell with multiple configurations/orientations of the die.

I used A* to find the solution to the problem; the answers given are correct, but it is not efficient enough. I've determined that the problem is the heuristic I'm using. Currently I'm using Manhattan distance, which is obviously admissible. If I multiply the heuristic with a constant, it's no longer admissible: it runs much faster but it doesn't always find the right answer.

I need some help in finding a better heuristic than Manhattan distance.

回答1:

Here's my algorithm applied to Paul's example of a 300x300 grid, starting from (23,25) and ending at (282, 199). It finds the minimum path and sum (1515, which is 2 points less than Paul's result of 1517) in 0.52 seconds. A version with look-up tables instead of calculating the small sections took 0.13 seconds.

Haskell code:

import Data.List (minimumBy)
import Data.Ord (comparing)
import Control.Monad (guard)

rollDie die@[left,right,top,bottom,front,back] move
  | move == "U" = [left,right,front,back,bottom,top]
  | move == "D" = [left,right,back,front,top,bottom]
  | move == "L" = [top,bottom,right,left,front,back]
  | move == "R" = [bottom,top,left,right,front,back]

dieTop die = die!!2

--dieStartingOrientation = [4,3,1,6,2,5] --left,right,top,bottom,front,back

rows = 300
columns = 300

paths (startRow,startColumn) (endRow,endColumn) dieStartingOrientation = 
  solve (dieTop dieStartingOrientation,[]) [(startRow,startColumn)] dieStartingOrientation where
    leftBorder = max 0 (min startColumn endColumn)
    rightBorder = min columns (max startColumn endColumn)
    topBorder = endRow
    bottomBorder = startRow
    solve result@(cost,moves) ((i,j):pathTail) die =
      if (i,j) == (endRow,endColumn)
         then [(result,die)]
         else do
           ((i',j'),move) <- ((i+1,j),"U"):next
           guard (i' <= topBorder && i' >= bottomBorder && j' <= rightBorder && j' >= leftBorder)
           solve (cost + dieTop (rollDie die move),move:moves) ((i',j'):(i,j):pathTail) (rollDie die move) 
        where next | null pathTail            = [((i,j+1),"R"),((i,j-1),"L")]
                   | head pathTail == (i,j-1) = [((i,j+1),"R")]
                   | head pathTail == (i,j+1) = [((i,j-1),"L")]
                   | otherwise                = [((i,j+1),"R"),((i,j-1),"L")]

--300x300 grid starting at (23, 25) and ending at (282,199)

applicationNum = 
  let (r,c) = (282-22, 199-24)
      numRowReductions = floor (r/4) - 1
      numColumnReductions = floor (c/4) - 1
      minimalR = r - 4 * fromInteger numRowReductions
      minimalC = c - 4 * fromInteger numColumnReductions
  in (fst . fst . minimumBy (comparing fst) $ paths (1,1) (minimalR,minimalC) [4,3,1,6,2,5])
     + 14*numRowReductions + 14*numColumnReductions

applicationPath = [firstLeg] ++ secondLeg ++ thirdLeg
               ++ [((0,["R"]),[])] ++ [minimumBy (comparing fst) $ paths (1,1) (2,4) die2] 
    where
      (r,c) = (282-22, 199-24) --(260,175)
      numRowReductions = floor (r/4) - 1
      numColumnReductions = floor (c/4) - 1
      minimalR = r - 4 * fromInteger numRowReductions
      minimalC = c - 4 * fromInteger numColumnReductions
      firstLeg = minimumBy (comparing fst) $ paths (1,1) (minimalR,minimalC) [4,3,1,6,2,5]
      die0 = snd firstLeg
      secondLeg = tail . foldr mfs0 [((0,["R"]),die0)] $ [1..numColumnReductions - 1]
      die1 = snd . last $ secondLeg
      thirdLeg = tail . foldr mfs1  [((0,[]),die1)] $ [1..numRowReductions - 3 * div (numColumnReductions - 1) 4 - 1]
      die2 = rollDie (snd . last $ thirdLeg) "R"
      mfs0 a b = b ++ [((0,["R"]),[])] ++ [minimumBy (comparing fst) $ paths (1,1) (4,4) (rollDie (snd . last $ b) "R")]
      mfs1 a b = b ++ [((0,["U"]),[])] ++ [minimumBy (comparing fst) $ paths (1,1) (4,1) (rollDie (snd . last $ b) "U")]

Output:

*Main> applicationNum
1515

*Main> applicationPath
[((31,["R","R","R","R","U","U","R","U","R"]),[5,2,1,6,4,3])
,((0,["R"]),[]),((25,["R","R","R","U","U","U"]),[3,4,1,6,5,2])
,((0,["R"]),[]),((24,["R","U","R","R","U","U"]),[5,2,1,6,4,3])
................((17,["R","R","R","U"]),[5,2,1,6,4,3])]
(0.52 secs, 32093988 bytes)

List of "R" and "U":

*Main> let listRL = concatMap (\((a,b),c) -> b) applicationPath
*Main> listRL
["R","R","R","R","U","U","R","U","R","R","R","R","R","U","U","U","R","R","U","R"
..."U","R","R","R","R","U"]

Sum of the path using the starting die and list of "R" and "U":

*Main> let sumPath path = foldr (\move (cost,die) -> (cost + dieTop (rollDie die move), rollDie die move)) (1,[4,3,1,6,2,5]) path
*Main> sumPath listRL
(1515,[5,2,1,6,4,3])

Calculation of (r,c) from (1,1) using the list of "R" and "U" (since we start at (1,1,), (r,c) gets adjusted to (282-22, 199-24):

*Main> let rc path = foldr (\move (r,c) -> if move == "R" then (r,c+1) else (r+1,c)) (1,1) path
*Main> rc listRL 
(260,175)


Algorithm/Solution

Continuing the research below, it seems that the minimal face-sum path (MFS) 
can be reduced, mod 4, by either rows or columns like so:

MFS (1,1) (r,c) == MFS (1,1) (r-4,c) + 14, for r > 7
                == MFS (1,1) (r,c-4) + 14, for c > 7

This makes finding the number for the minimal path straightforward:

MFS (1,1) (r,c) = 
  let numRowReductions = floor (r/4) - 1
      numColumnReductions = floor (c/4) - 1
      minimalR = r - 4 * numRowReductions
      minimalC = c - 4 * numColumnReductions
  in MFS (1,1) (minimalR,minimalC) + 14*numRowReductions + 14*numColumnReductions

minimalR and minimalC are always less than eight, which means we can easily
pre-calculate the minimal-face-sums for these and use that table to quickly
output the overall solution.

But how do we find the path?
From my testing, it seems to work out similarly:

MFS (1,1) (1,anything) = trivial
MFS (1,1) (anything,1) = trivial
MFS (1,1) (r,c), for r,c < 5 = calculate solution in your favorite way
MFS (1,1) (r,c), for either or both r,c > 4 = 
  MFS (1,1) (minimalR,minimalC) -> roll -> 
  MFS (1,1) (min 4 r-1, min 4 c-1) -> roll -> 
  ...sections must be arranged so the last one includes 
     four rotations for one axis and at least one for the other.
     keeping one row or column the same till the end seems to work.
     (For Paul's example above, after the initial MFS box, I moved in 
     fours along the x-axis, rolling 4x4 boxes to the right, which 
     means the y-axis advanced in threes and then a section in fours 
     going up, until the last box of 2x4. I suspect, but haven't checked, 
     that the sections must divide at least one axis only in fours for 
     this to work)...
  MFS (1,1) (either (if r > 4 then 4 else min 2 r, 4) 
             or     (4, if c > 4 then 4 else min 2 c))
  => (r,c) is now reached

For example,

  MFS (1,1) (5,13) = MFS (1,1) (1,5) -> roll right -> 
                     MFS (1,1) (1,4) -> roll right -> MFS (1,1) (5,4)

  MFS (1,1) (2,13) = MFS (1,1) (1,5) -> roll right -> 
                     MFS (1,1) (1,4) -> roll right -> MFS (1,1) (2,4)


Properties of Dice Observed in Empirical Testing

For target points farther than (1,1) to (2,3), for example (1,1) to (3,4)
or (1,1) to (4,6), the minimum path top-face-sum (MFS) is equal if you 
reverse the target (r,c). In other words: 

1. MFS (1,1) (r,c) == MFS (1,1) (c,r), for r,c > 2

Not only that.

2. MFS (1,1) (r,c) == MFS (1,1) (r',c'), for r,c,r',c' > 2 and r + c == r' + c'
   e.g., MFS (1,1) (4,5) == MFS (1,1) (5,4) == MFS (1,1) (3,6) == MFS (1,1) (6,3)

But here's were it gets interesting:

The MFS for any target box (meaning from startPoint to endPoint) that
can be reduced to a symmetrical combination of (r,c) (r,c) or (r,c) (c,r), for 
r,c > 2, can be expressed as the sum of the MFS of the two smaller symmetrical 
parts, if the die-roll (the change in orientation) between the two parts is 
accounted for. In other words, if this is true, we can breakdown the calculation
into smaller parts, which is much much faster.

For example: 
 Target-box (1,1) to (7,6) can be expressed as: 
 (1,1) (4,3) -> roll right -> (1,1) (4,3) with a different starting orientation

 Check it, baby: 
 MFS  (1,1) (7,6) = MFS (1,1) (4,3) + MFS (1,1) (4,3) 
 (when accounting for the change in starting orientation, rolling right in 
 between)

 Eq. 2., implies that MFS (1,1) to (7,6) == MFS (1,1) (5,8)
 and MFS (1,1) (5,8) can be expressed as (1,1) (3,4) -> roll right -> (1,1) (3,4)

 Check it again: 
 MFS (1,1) (7,6) = MFS (1,1) (5,8) = MFS (1,1) (3,4) + MFS (1,1) (3,4)
 (when accounting for the change in starting orientation, rolling right in
 between)

Not only that.

The symmetrical parts can apparently be combined in any way:

3. MFS (1,1) (r,c) -> roll-right -> MFS (1,1) (r,c) equals
   MFS (1,1) (r,c) -> roll-right -> MFS (1,1) (c,r) equals
   MFS (1,1) (r,c) -> roll-up -> MFS (1,1) (r,c) equals
   MFS (1,1) (r,c) -> roll-up -> MFS (1,1) (c,r) equals
   MFS (1,1) (2*r-1, 2*c) equals
   MFS (1,1) (2*r, 2*c-1), for r,c > 2


回答2:

Well, I'll add my comment here, since it is more optimal than the current highest-voted answer by @larsmans - but, I am convinced there must be something better (hence the bounty).


If I multiply the heuristic with a constant, it's no longer admissible

The best I can come up with is (manhattenDistance/3)*6 + (manhattenDistance%3), where / is integer division and % is mod. This works because in any 3 moves with no back-tracking, all three digits will be unique, so the lowest sum we could have is 1+2+3 = 6 (The %3 simply adds any extra, non-multiple-of-three moves).


[Edit] As @GrantS pointed out in the comments above, my heuristic can be improved very slightly by adding an additional 1 when manhattenDistance%3 == 2. This is easy to do without a conditional: (manhattenDistance/3)*6 + (manhattenDistance%3)*3/2



回答3:

Main Edit 3: Proof that the optimal admissible heuristic should be based on 3.5m

The average cost of travelling along the board has to approach 3.5m over the long run where m is the Manhattan distance. Therefore the best admissible heuristic should be 3.5m plus or minus some small constant.

The reason for this is that whenever you move in a direction, x, say, from face x1, the next move in the same direction, to face x2 has to satisfy x1 + x2 = 7. This is because any moves in the perpendicular direction leave the orientation of face x2 the same. Think about rotating a die left to right -- the front and back faces stay the same no matter how many rotations you do. Conversely if you rotate a die front to back, the left and right faces stay the same.

It's easiest to see this with some examples (all starting in the configuration pictured in the question)

   6
2453
1

here you can see that we start with y1=1, and however many times we move in the x-direction afterwards, the next move in the y-direction has to be y2=6, so y1+y2=7. (Also in the x-direction, there is a simple pairing of 2+5 = 7 and 4+3 = 7).

Another example is

  35
 26
14

In this example we start with x1=1, and however many times we move in the y-direction afterwards, the next move in the x-direction has to be x2=6. (Also, we see pairings of 4+3=7 in the y-direction, 2+5=7 in the x-direction. And we know in this case the next move in the x-direction has to be 4, and the next move in the y-direction has to be 1.)

This all assumes it's never worth backtracking, but hopefully this can be taken as read.

The original post below just fills in some details of how the estimate of 3.5m should be adjusted to take account of the ability for it to be beaten over the short term.

As a side-note, as I just commented on the OP, A* search might not be required at all. It ought to make sense to simply choose a path made out of 4-long horizontal pieces and 4-long vertical pieces, say, which are optimal. And then make up the remainder with a search or a lookup table based on orientation and x-y offset. (But the question asks for an admissible heuristic so I'm going to leave my answer.)

Main Edit 2: summarize original empirical work, taking account of comments below

In the long term, as explained above, your average cost per move is 3.5. This can also can be seen empirically in the exploration of the data below.

This gives a naive estimate of 3.5m where m is the Manhattan distance. However this is an over-estimate, because in the short term it is possible to do better than the average. A good hypothesis for this is to explore how we can avoid using any faces larger than 3.

  • If we start with face 1, we can use faces 2 and 3 on our first 2 moves, going 2 moves better than 3.5m predicts.
  • If we start with face 2, we can use faces 1 and 3 on our first 2 moves, going 3 moves better than 3.5m predicts.
  • If we start with face 3, we can use faces 1 and 2 on our first 2 moves, going 4 moves better than 3.5m predicts.
  • If we start with face 4,5, or 6, we can use faces 1, 2 and 3 on our first 3 moves, going 4.5 moves better than 3.5m predicts.

This hypothesis can be confirmed empirically by simply running the script below for every starting possibility of the die, as suggested by BlueRaja - Danny Pflughoeft. So a simple admissible statistic is 3.5m - k, where k = max(f+1, 4.5), and f is the starting face. But this is a bit klunky, giving negative numbers for small values of m. It's easy to write a programmatic version that takes account of whether you have just 1 or 2 or 3 moves to go, see below

    static double Adm(int x, int y, int face /* start face */, out int m)
    {
        double adm = 0;
        m = Math.Abs(x) + Math.Abs(y);
        if (m >= 1) {
            if (face == 1)
                adm += 2;
            else
                adm += 1;
            m--;
        }
        if (m >= 1) {
            if (face <= 2)
                adm += 3;
            else
                adm += 2;
            m--;
        }
        if (m >= 1 && face >=4) {
            // 4,5,6: we can still use a 3 without backtracking
            adm += 3;
            m--;
        }

        adm += 3.5 * m;

        return adm;
    }

Running this across a search space with |x|,|y| <= 100, this function underestimates the actual cost by between 0 and 6, with a median of 0.5 or 1.5 depending on the start face.

Main Edit 1: original post

My basic thought was that it would be good to explore the data. So I had a go at Dijkstra's algorithm to see what the space of solutions looks like. What I found is supportive of what's been said already. Some factor times the Manhattan distance is appropriate, but there may be some justification for a higher factor than 1.5. This is nicely indicated by the shape of a contour plot of cost against deviation from initial x y position.

Here's a wire frame plot -- to be honest this is just for eye candy.

What's interesting is that if you add another column to your data for the manhattan distance (man) and regress the cost (v) against the manhattan distance in R, you get the following

Coefficients:
          Estimate Std. Error  t value Pr(>|t|)    
(Intercept) -0.6408087  0.0113650   -56.38   <2e-16
df$man       3.4991861  0.0001047 33421.66   <2e-16

I.e. it's telling you that for every move you make horizontally or vertically, your cost is 3.4991861, or v close to 3.5. That just happens to be the average of 1 to 6, so my intuition is that the data is telling us that on average, it's most efficient to use all the faces of the die equally over a long distance. Over short distances you can be more optimal.

I tried 3.5man - k as an estimate, with k = 2.5. This seemed to work ok. When I subtracted the actual cost from this I got -0.5 as the highest value.

> summary(df$est - df$v)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -6.500  -2.500  -2.000  -1.777  -1.000  -0.500 

However A* search has to work for all configurations including those after the start where the die is not in the original configuration, so the constant k can't be as low as 2.5 in general. It either needs to be raised, e.g. to 4, or be dependent on the configuration of the die, as suggested in another answer.

It's quite possible that I've made some horrible mistake in all of this so I've put the code below. Like I said, I think the approach of generating the data and investigating it is sound even if my results aren't.

Here are some lines of the result file first.

17,-100,410

17,-99,406

17,-98,403

17,-97,399

17,-96,396

C# code

class Die
{
    int top;
    int bottom;
    int front;
    int back;
    int left;
    int right;

    public int Top { get { return top; } }
    public int Bottom { get { return bottom; } }
    public int Front { get { return front; } }
    public int Back { get { return back; } }
    public int Left { get { return left; } }
    public int Right { get { return right; } }

    public Die(int top, int bot, int fro, int bac, int lef, int rig)
    {
        this.top = top;
        bottom = bot;
        front = fro;
        back = bac;
        left = lef;
        right = rig;
    }

    public Die RotateLeft()
    {
        return new Die(
               top: right,
               rig: bottom,
               bot: left,
               lef: top,
               fro: front,
               bac: back
            );
    }

    public Die RotateRight()
    {
        return new Die(
               rig: top,
               top: left,
               lef: bottom,
               bot: right,
               fro: front,
               bac: back
            );
    }

    public Die RotateUp()
    {
        return new Die(
                top: front,
                fro: bottom,
                bot: back,
                bac: top,
                lef: left,
                rig: right
             );
    }

    public Die RotateDown()
    {
        return new Die(
                fro: top,
                top: back,
                bac: bottom,
                bot: front,
                lef: left,
                rig: right
            );
    }
}



class DieXY

{

    public Die Die { get; set; }
    public int X { get; set; }
    public int Y { get; set; }

    public DieXY(Die die, int x, int y) { Die = die; X = x; Y = y; }

    public override int GetHashCode() { return Die.Top + Die.Bottom*6 + Die.Front*6^2 + Die.Back*6^3 + Die.Left*6^4 + Die.Right*6^5 + X*6^6 + Y*6^8; }
    public override bool Equals(object obj)
    {
        DieXY die = (DieXY)obj;
        return die != null
            && die.Die.Top == Die.Top && die.Die.Bottom == Die.Bottom
            && die.Die.Front == Die.Front && die.Die.Back == Die.Back
            && die.Die.Left == Die.Left && die.Die.Right == Die.Right 
            && die.X == X && die.Y == Y;
    }
}

class Program
{
    static void Main(string[] args)
    {
        Dictionary<DieXY, int> dict = new Dictionary<DieXY, int>();
        int n = 100;
        int sofar = -1;
        DieXY root = new DieXY(new Die(1, 6, 2, 5, 4, 3), 0, 0);
        Queue<Tuple<DieXY, int>> queue = new Queue<Tuple<DieXY, int>>();
        queue.Enqueue(new Tuple<DieXY,int>(root,0));
        while (queue.Count > 0)
        {
            Tuple<DieXY, int> curr = queue.Dequeue();
            DieXY dieXY = curr.Item1;
            Die die = dieXY.Die;
            int x = dieXY.X;
            int y = dieXY.Y;
            if (Math.Max(x,y) > sofar)
            {
                sofar = Math.Max(x, y);
                Console.WriteLine("{0}", sofar);
            }
            int score = curr.Item2;
            if (Math.Abs(x) <= n && Math.Abs(y) <= n)
            {
                int existingScore = 0;
                if (!dict.TryGetValue(dieXY, out existingScore)
                    || score < existingScore)
                {
                    dict[dieXY] = score;
                    Die newDie = null;
                    newDie = die.RotateLeft();
                    queue.Enqueue(new Tuple<DieXY, int>(new DieXY(newDie, x - 1, y), score + newDie.Top));
                    newDie = die.RotateRight();
                    queue.Enqueue(new Tuple<DieXY, int>(new DieXY(newDie, x + 1, y), score + newDie.Top));
                    newDie = die.RotateUp();
                    queue.Enqueue(new Tuple<DieXY, int>(new DieXY(newDie, x, y + 1), score + newDie.Top));
                    newDie = die.RotateDown();
                    queue.Enqueue(new Tuple<DieXY, int>(new DieXY(newDie, x, y - 1), score + newDie.Top));
                }
            }
        }

        int[,] scores = new int[2*n+1,2*n+1];
        for (int aX = 0; aX < 2 * n + 1; aX++)
            for (int aY = 0; aY < 2 * n + 1; aY++)
                scores[aX, aY] = int.MaxValue;
        foreach (KeyValuePair<DieXY, int> curr in dict)
        {
            int aX = curr.Key.X + n;
            int aY = curr.Key.Y + n;
            if (curr.Value < scores[aX, aY])
            {
                scores[aX, aY] = curr.Value;
            }
        }

        using (System.IO.StreamWriter file = new System.IO.StreamWriter("out.csv"))
        {
            file.WriteLine("x,y,v");
            for (int aX = 0; aX < 2*n+1; aX++)
            {
                int x = aX - n;
                for (int aY = 0; aY < 2 * n + 1; aY++)
                {
                    int y = aY - n;
                    file.WriteLine("{0},{1},{2}", x, y, scores[aX, aY]);
                }
            }
        }

        Console.WriteLine("Written file");
        Console.ReadKey();
    }
}

R code below

library(lattice)
df = read.csv("out.csv")
df=transform(df, man=abs(x)+abs(y))

v50=df[abs(df$x)<=50 & abs(df$y)<=50,]
with(v50, wireframe(v ~ x*y))
with(v50, contourplot(v ~ x*y))

summary(lm(df$v ~ df$man))

df$est = df$man * 3.5 - 2.5
summary(df$est - df$v)


回答4:

If I multiply the heuristic with a constant, it's no longer admissible

It can be if you get rid of some corner cases. Let d be the Manhattan distance, and observe that the die can never have its 1 face up in two subsequent steps of the path. It follows that, if you're not already at the goal:

  • the first step has cost at least 1;
  • if 1 is face up, it's at least 2 (and the same holds for 6);
  • the rest of the path is at least as expensive as a series of 1-2 alternations, which has cost 1.5 × (d - 1).

So an admissible heuristic is

if d == 0 then
    h := 0
else if die == 1 or die == 6 then
    h := 2 + 1.5 × (d - 1)
else
    h := 1 + 1.5 × (d - 1)


回答5:

Idea:

If you have to move in a straight line, the best you can do is to end your moves with 1 and 2, for all other moves you can't do better than 3.5*distance.

Heuristic:

With ManhattanDistance = x + y the following heuristic could be used:

Heuristic = xH + yH;

where

xH = calculateStraightLineHeuristic(x)
yH = calculateStraightLineHeuristic(y)

and the function calculateStraightLineHeuristic(z) is defined as follows:

calculateStraightLineHeuristic(z)
    if (z = 1)
        return zH = 1
    elseif (z = 2)
        return zH = 2+1
    else
        return zH = (z-2)*3.5+2+1
end