Sokoban solver "scribbles" by Brian Damgaard about the YASS solver



"Scribbles" about Sokoban solver programming by Brian Damgaard, in particular about the YASS solver.

(Most of the text is ripped from my private email-conversations about Sokoban solver programming. The notes are very unstructured, but even so, there are hopefully a few valuable pieces of information buried here.)


YASS overall structure

Search algorithm

A* search with depth-first search extensions.

Heuristic score

The pure A* search uses a simple lower bound based on each box's distance to the nearest goal. This is in contrast to the bipartite matching in the "Rolling Stone" solver, which I believe is flawed anyway. The packing order search uses a more complex score calculation. Each phase of the packing order search tries to bring boxes to a certain group of goals or parking squares. The score for the starting position is "infinity", and then the score decreases for each completed phase.
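
To make the simple lower bound concrete, here is a minimal sketch, not YASS's actual code: all identifiers are made up, and how the distance table is precomputed (e.g. a breadth-first search backwards from every goal square) is only an assumption for the example.

  program LowerBoundSketch;
  {$mode objfpc}

  const
    MaxSquares = 400;
    Infinity   = High(Integer) div 2;

  var
    { filled in once per level, e.g. by a breadth-first search backwards from }
    { every goal square (that part is omitted here)                           }
    DistanceToNearestGoal: array[0..MaxSquares - 1] of Integer;

  function SimpleLowerBound(const BoxSquares: array of Integer): Integer;
  var
    i: Integer;
  begin
    Result := 0;
    for i := Low(BoxSquares) to High(BoxSquares) do
      if DistanceToNearestGoal[BoxSquares[i]] >= Infinity then
        Exit(Infinity)                 { a box that can never reach any goal }
      else
        Inc(Result, DistanceToNearestGoal[BoxSquares[i]]);
  end;

  var
    s: Integer;
  begin
    for s := 0 to MaxSquares - 1 do DistanceToNearestGoal[s] := Infinity;
    DistanceToNearestGoal[10] := 3;    { made-up distances for three box squares }
    DistanceToNearestGoal[11] := 0;
    DistanceToNearestGoal[12] := 5;
    WriteLn(SimpleLowerBound([10, 11, 12]));   { prints 8 }
  end.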

Transposition table

Bucket hashtable (as opposed to a direct hashtable) where each bucket is a doubly-linked list of members (for fast node recycling, even though it's not currently used).

Hash keys

Zobrist keys, 8 bytes. The program silently assumes that an 8-byte Zobrist key is always unique. In theory that is, of course, nonsense, but in practice it's good enough.
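
As an illustration of the 8-byte keys, here is a small Zobrist-hashing sketch. It is not YASS's code: the table sizes, the helper names, and the way the normalized player square enters the key are all assumptions made for the example.

  program ZobristSketch;
  {$mode objfpc}

  const
    MaxSquares = 400;

  var
    { one random 64-bit key per square for a box, and one per square for the }
    { normalized player position                                             }
    BoxKey, PlayerKey: array[0..MaxSquares - 1] of QWord;

  function Random64: QWord;
  begin
    { build a 64-bit key from three 31-bit random numbers }
    Result := (QWord(Random($7FFFFFFF)) shl 33) xor
              (QWord(Random($7FFFFFFF)) shl 13) xor
               QWord(Random($7FFFFFFF));
  end;

  procedure InitKeys;
  var
    Square: Integer;
  begin
    for Square := 0 to MaxSquares - 1 do
    begin
      BoxKey[Square]    := Random64;
      PlayerKey[Square] := Random64;
    end;
  end;

  function PositionKey(const BoxSquares: array of Integer;
                       NormalizedPlayerSquare: Integer): QWord;
  var
    i: Integer;
  begin
    Result := PlayerKey[NormalizedPlayerSquare];
    for i := Low(BoxSquares) to High(BoxSquares) do
      Result := Result xor BoxKey[BoxSquares[i]];   { XOR in each box square }
  end;

  begin
    Randomize;
    InitKeys;
    { after a push, the key can be updated incrementally:       }
    {   Key := Key xor BoxKey[FromSquare] xor BoxKey[ToSquare]; }
    WriteLn(PositionKey([10, 11, 12], 5));
  end.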

Open queue

Uses an O(1) bucket-sort. The score is the simple lower bound, so this is as efficient as sorting can possibly be. Each bucket is a doubly-linked list of members (for fast node recycling, even though it's not currently used).
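
A minimal sketch of such a bucket-based open queue is shown below (illustrative only; the limits and names are made up, and the node recycling mentioned above is not shown). Insertion is O(1) because the score is the bucket index, and "pop best" only scans forward to the first non-empty bucket.

  program OpenQueueSketch;
  {$mode objfpc}

  const
    MaxScore = 1000;

  type
    PNode = ^TNode;
    TNode = record
      Score: Integer;
      Prev, Next: PNode;        { doubly-linked, so removal is O(1) }
    end;

  var
    Buckets: array[0..MaxScore] of PNode;   { head of each bucket's list }
    BestScore: Integer = MaxScore;

  procedure Push(Node: PNode);
  begin
    Node^.Prev := nil;
    Node^.Next := Buckets[Node^.Score];
    if Node^.Next <> nil then Node^.Next^.Prev := Node;
    Buckets[Node^.Score] := Node;
    if Node^.Score < BestScore then BestScore := Node^.Score;
  end;

  function PopBest: PNode;
  begin
    while (BestScore <= MaxScore) and (Buckets[BestScore] = nil) do
      Inc(BestScore);                       { advance to first non-empty bucket }
    if BestScore > MaxScore then Exit(nil); { the queue is empty }
    Result := Buckets[BestScore];
    Buckets[BestScore] := Result^.Next;
    if Result^.Next <> nil then Result^.Next^.Prev := nil;
  end;

  var
    a, b: PNode;
  begin
    New(a); a^.Score := 7; Push(a);
    New(b); b^.Score := 3; Push(b);
    WriteLn(PopBest^.Score);                { prints 3 }
    Dispose(b); Dispose(a);
  end.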

Goal room packing

Instead of a "goal room packing" method, the program uses a more general "packing order calculation" (POC), where boxes can be parked at intermediate squares before they are brought home to their final destination squares.

In theory, the POC isn't limited to goal-room themed levels only, but the current version only attempts to find a packing order for goal-room themed levels.

A packing order is calculated before the search starts. The current version has no mechanism for changing the order dynamically during the search.

The most remarkable result produced by the POC is that YASS can solve the "XSokoban #50" level in less than a second, and even with as little as a 1 MiB transposition table. This level requires pushing almost all the boxes through the goal area and parking them on the right side of the board before they can be pushed to their final destination squares.

Note that "XSokoban #50" is a slightly simplified version of "Original #50". The former can be solved by parking each box once, whereas the latter requires that at least one box is parked twice before it is pushed to its final destination square. Searching for such a "parking several times" packing order would in the general case be too costly, so the POC module doesn't try to do that. That explains why YASS can solve "XSokoban #50" but not "Original #50".

Pre-calculated deadlock positions

I believe YASS really shines in this area. It's vastly more advanced than the primitive fixed-size patterns (small 4*5 sections) in Rolling Stone. The deadlock detection in YASS is related to (but independently developed and slightly more advanced than) the capacity-counting deadlock detection mechanism in Edelkamp's work.

A deadlock situation is represented by the set of box-squares that constitutes the deadlock. Additionally, there is a countdown for when all the squares in the set are filled, in which case the deadlock is triggered. In other words, the deadlock mechanism is based upon "capacity-counting".

Testing and updating are extremely fast. For instance, updating after each push is only O(m+n), where m and n are the number of deadlock situations involving the "from-square" and the "to-square" of the move.
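
The text above translates into a very small data layout. The sketch below borrows its identifiers from the DoPush fragment quoted later in this section; the exact trigger convention (a set is flagged when its capacity drops below zero) is my assumption, not a statement about YASS's code.

  program CapacityCountSketch;
  {$mode objfpc}

  const
    MaxSquares       = 400;
    MaxDeadlockSets  = 1000;
    MaxSetsPerSquare = 16;

  var
    Capacity        : array[1..MaxDeadlockSets] of Integer;
    SquareSetCount  : array[0..MaxSquares - 1] of Integer;
    SquareSetNumbers: array[0..MaxSquares - 1, 1..MaxSetsPerSquare] of Integer;

  { updates the capacities for a push and returns True when one of the }
  { deadlock sets is triggered (assumed rule: capacity below zero)      }
  function DoPushAndCheck(FromSquare, ToSquare: Integer): Boolean;
  var
    i, SetNo: Integer;
  begin
    Result := False;
    for i := 1 to SquareSetCount[FromSquare] do                { leaving these sets }
      Inc(Capacity[SquareSetNumbers[FromSquare, i]]);
    for i := 1 to SquareSetCount[ToSquare] do                  { entering these sets }
    begin
      SetNo := SquareSetNumbers[ToSquare, i];
      Dec(Capacity[SetNo]);
      if Capacity[SetNo] < 0 then Result := True;
    end;
  end;

  begin
    { toy setup: square 7 belongs to deadlock set 1, which has no spare capacity }
    SquareSetCount[7]      := 1;
    SquareSetNumbers[7, 1] := 1;
    Capacity[1]            := 0;
    WriteLn(DoPushAndCheck(3, 7));   { prints TRUE }
  end.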

This mechanism only allows deadlocks independent of the player position, but to squeeze the last drop of juice out of that mechanism, the program calculates an extra group of deadlocks where the direction of the last push leading to the deadlock is enough to say whether the player is inside or outside the boxes that constitute the deadlock.

If you run the console-program version of YASS with the "-log" parameter, you'll get a list of the deadlocks. The simplest legend would have been to show each of the member squares as "%", but in order to make it easy to verify that the program has calculated the countdown correctly for the starting position, each square is shown with the normal XSB legend, with the meaning that squares marked ".", "*", and "$" are members of the deadlock set. Here is an example.

First the level:

"Claire", by Lee J Haywood

#######
#.@_#_#
#$*_$_#
#___$_#
#_..__#
#__*__#
#######

A deadlocked corner, using "%" to depict the squares:

#######
#_%_#_#
#%%___#
#_____#
#_____#
#_____#
#######

What the log-file output looks like for the same deadlock:

Deadlock Set 3, Capacity: 0, Squares: ...

#######
#_@_#_#
#$*___#
#_____#
#_____#
#_____#
#######

Here is a more advanced deadlock example:

Deadlock Set 10, Capacity: 2, Squares: 21 30 39 47, Center: 29=[2,3]; player must be  outside the set and push the last box: up left

#######
#_@_###
#_*___#
#_%___#
#%____#
#_____#
#######

Notice how the constraint that the push leading to the deadlock must be "up" or "left" ensures that the position is only considered a deadlock when the player is on the outside.

Overall, I'm very pleased with this deadlock mechanism. You'll notice that a deadlock can be arbitrarily complex, with squares scattered all over the board. They don't need to be confined to a small area as in the simple approach in Rolling Stone, where the program also needs to rotate and transform the possibly offending deadlocks for each move.

It's also amazing that the update and "downdate" for doing a push and taking it back are only 2-3 lines of code each. You can verify that by looking at "DoPush" and "UndoPush". Here are the lines in "DoPush":

     with DeadlockSets do begin
       for i:=1 to SquareSetCount[FromSquare] do  Inc(Capacity[SquareSetNumbers[FromSquare,i]]); {leaving  these deadlock-sets}
       for i:=1 to SquareSetCount[ToSquare  ] do Dec(Capacity[SquareSetNumbers[ToSquare   ,i]]); {entering these deadlock-sets}
       end;

Tunnel Squares

Tunnel squares are precalculated based on Matthias Meger's observations and flagged directly on the board for efficient use by the move generator, similar to the way simple deadlocked squares are marked on the board, preventing boxes from going there.

A tunnel square is "slippery", meaning a box should never rest on a tunnel square; the box must immediately move ahead in its current direction. When a game position is about to be expanded and the last move put a box into a tunnel, the move generator can ignore all other boxes and only generate the single push (or pull in reverse mode) which moves the box ahead in its current direction (see the sketch at the end of this section). This reduces not only the number of moves to try on the board, but also the number of positions stored in the transposition table, because never letting boxes rest on tunnel squares has a normalizing effect on the board positions.

Whether a square is a tunnel square depends on:

  • the wall pattern on the board
  • the direction of the last push/pull

In the following figure, each pattern depicts the situation after a box has been pushed upwards from "Square" to "NextSquare". When "NextSquare" isn't a goal square, it's an upwards tunnel square. The gate squares mentioned in the figure are the squares which split the board into separate rooms if a box is placed on the square, with no other boxes on the board.

#$#    #$_    _$#    #_$_#    $ = "NextSquare"
#@#    #@#    #@#     #@#     @ = "Square" (Player pos. after push)

If "NextSquare" is a gate square, then "Square" 
doesn't need to be blocked by walls on both sides.

If "Square" is a gate square and has walls on 
both sides, then "NextSquare" is a tunnel square:

_$_    $ = "NextSquare"
#@#    @ = "Square", and "Square" is a gate square
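
Here is the promised sketch of how precalculated, per-direction tunnel flags could be used during node expansion. The names are hypothetical, and legality checks as well as the reverse-mode pull case are omitted: if the last push left its box on a tunnel square for that direction, only the continuation push is generated.

  program TunnelSketch;
  {$mode objfpc}

  type
    TDirection = (dUp, dDown, dLeft, dRight);
    TMove      = record
      BoxSquare: Integer;
      Direction: TDirection;
    end;

  const
    MaxSquares = 400;
    BoardWidth = 20;                     { made-up board width }

  var
    { filled in once per level from the wall pattern, as described above }
    IsTunnelSquare: array[0..MaxSquares - 1, TDirection] of Boolean;

  function NextSquare(Square: Integer; Direction: TDirection): Integer;
  begin
    case Direction of
      dUp   : Result := Square - BoardWidth;
      dDown : Result := Square + BoardWidth;
      dLeft : Result := Square - 1;
      dRight: Result := Square + 1;
    end;
  end;

  procedure GenerateMoves(const LastPush: TMove);
  begin
    if IsTunnelSquare[LastPush.BoxSquare, LastPush.Direction] then
      { the box is "slippery": the only push worth trying is straight ahead }
      WriteLn('only push: box at ', LastPush.BoxSquare, ' to ',
              NextSquare(LastPush.BoxSquare, LastPush.Direction))
    else
      WriteLn('normal move generation for all boxes');
  end;

  var
    Push: TMove;
  begin
    IsTunnelSquare[105, dUp] := True;    { toy data }
    Push.BoxSquare := 105;
    Push.Direction := dUp;
    GenerateMoves(Push);
  end.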

PI-Corral pruning

You can read a brief introduction to PI-Corrals on this Sokoban Wiki:

http://sokobano.de/wiki/index.php?title=Solver

PI-Corrals are a fantastic thing, algorithmically speaking. PI-CORRAL PRUNING IS AS IMPORTANT FOR SOKOBAN AS ALPHA-BETA PRUNING IS FOR 2-PLAYER GAMES like Chess, Checkers, and Othello!

When the annals of Sokoban programming are written, Matthias Meger should be credited for this amazing discovery, and I'd like to be credited for being the first programmer to turn it into an algorithm taking connected corrals into account, which is a requirement for making the idea useful in practice.

Corral pruning

I've had a correspondence with Matthias Meger for some time regarding Sokoban solvers and deadlock detections, and recently, he had a stroke of genius and discovered a completely new pruning technique for a solver! For once, I'm not exaggerating. It's so seldom one can say that a new algorithm is discovered but this is really what happened here, and like so many good ideas, it's so simple that it's immediately understandable by sight.

In fact, many human Sokoban solvers use a variant of this technique during the solving process without realizing the full implications of the method. There are no documented examples of programmers having seen it and turned it into an algorithm, and yet the results are nothing less than staggering.

There is always an overlap when different pruning techniques are used in a solver, but conservatively speaking, this new technique is by itself able to prune at least 20% of the search tree according to my experiments, and empirically the savings always outweigh the extra costs.

Solving Sokoban designs is a hard task for programs as well as humans, and this improvement in itself seldom makes it possible to solve more designs, but such a substantial reduction is definitely more than welcome and cannot be ignored.

So what is this new method? Well, first we need to establish some terms.

A "fenced-in area" is an area on the board that the player cannot reach in the current position. I've coined the phrase a "corral" for this type of areas. An example:

A corral

##### 
#__$_
#___$@
#__$_
###__

An "I-corral" is a corral where all boxes in the outer fence only can move inwards in their first move. An example:

Here is an I-corral which also happens to be a "PI-corral", a term we'll introduce in a moment:

##### 
#___$
#__$@
#__$_
###__

Note how the corral in the first example wasn't an I-corral because the middle box could be pushed to a cell outside the corral.

A "PI-corral" is the important sub-class of the I-corrals where the player can reach all the outer corral boxes and perform all their legal inwards moves. The example above was a PI-corral but the concept is best illustrated by a counter-example:

An I-corral which isn't a PI-corral because of the outside blocking box:

##### 
#___$
#__$@
#__$$
###__

This example isn't a PI-corral because the player cannot push the 3rd box inwards; there is an outside box that blocks the access.

Now, why are PI-corrals important? Here is where Matthias Meger had his new insight.

Unless all boxes in the corral are sitting on goal-cells, the player must sooner or later do something about the corral; that is, the player must push at least one of the boxes in the fence.

The key-insight is this: Pushing a box inwards, inside the corral, never affects the player's access to boxes outside the corral. Therefore, when we choose which box to push next, we might as well push one of the corral-boxes because we have to do that sooner or later anyway.

In the next push, we still have full access to all boxes outside the corral so we can pick one of them, if we wish to. However, if there still is a PI-corral on the board, we continue only selecting the corral-boxes for move-generation.

So the bottom line is that for each position we meet during the solving process, we first check for PI-corrals, and if we find one, we never generate pushes for boxes outside the PI-corral.

In practice, it's rather seldom that a board contains simple PI-corrals like the one depicted above. Even when there is an I-corral on the board, the player seldom has access to all the boxes (from all the required directions). Often, the player's access to the I-corral boxes is blocked by neighbouring corrals like in this (small) example:

##### 
#___$
#__$@
#__$$
###_$
__#_$
__#$

This is where I came into the picture and had the pleasure to contribute to the implementation of this brilliant insight. When the program detects the top-left corral labeled "1" in the following figure, I made an algorithm that efficiently adds neighboring corrals so the key insight about the player's access to fence-boxes still can be put to use:

##### 
#111$
#11$@
#11$$
###2$
__#2$
__#$

When the solver is equipped with such a "combined-PI-corral" detection, the pruning effect is remarkable, much more than I originally expected, and empirically the savings always outweigh the extra calculation.

What I find astounding about this method is that its effects are two-fold:

  • It limits the branching factor
  • It finds many deadlock situations

The fact that it finds deadlock situations is a side-effect of successor-pruning. If there is a (smallish) deadlock situation on the board consisting of a PI-corral, then pruning leads to a position where the legal moves dry out, in effect finding a deadlock like in this (very small) example:

####
#_$
#_$
#$

My program uses precalculated deadlocks to handle simple deadlocks like this one, but the corral-pruning is a valuable contribution to the program, and experiments show that the corral-pruning can almost hold a candle to the precalculated deadlocks, if it is asked to do so:


Statistics - SokEvo/107 - Forwards search - no packingorder - AMD 2600+

A. A* search + transposition table: 11974023 positions 51938064 pushes 65 seconds

B. A* search + transposition table + corrals: 4632203 positions 11918161 pushes 21 seconds

C. A* search + transposition table + precalculated deadlocks: 2030736 positions 7011203 pushes 11 seconds

D. A* search + transposition table + precalculated deadlocks + corrals: 1231376 positions 3299797 pushes 7 seconds


"B" is the experiment with corrals only, without precalculated deadlocks, and its 21 seconds (on my amd 2600+ pc) is in the same league as the "C" experiment with precalculated deadlocks but no corral-pruning.

"D" with all currently implemented pruning techniques reduces the total solving time to 7 seconds, showing that the different pruning techniques work well together.

PI-Corral examples

... I wouldn't say that PI-Corral takes the better moves and prunes the worse moves. To me, it makes more sense to say the complete opposite. PI-Corral pruning heads down the alleys where there is a chance of proving that there is a deadlock. It does so by pruning all the better moves that the program doesn't see any problems with.

Here are a few more explanations.

Since the function that calculates the player's reachable squares (CPRS) is such a vital part of a Sokoban solver, and since it also has a crucial role to play in the PI-Corral algorithm, it must work in a very special way:

  • CPRS must, of course, flag the squares the player can reach. It does so by setting a timestamp, which increases (by at least 2) on each call.
  • CPRS must also flag the boxes the player has access to. This speeds up the node-expansion anyway, but it is crucial for the PI-Corral pruning because it must know the boxes on the border of a corral in order to work. In my implementation, the bordering boxes are marked by the current timestamp + 1.

So to recapitulate:

  • reachable floors: timestamp
  • reachable boxes: timestamp + 1
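
To make the timestamp scheme concrete, here is a sketch of a CPRS-style calculation with made-up identifiers and simplified board handling. The global timestamp is raised by 2 per call, reachable floors get the new timestamp, and boxes bordering the reachable area get timestamp + 1, so no array ever has to be cleared between calls.

  program CPRSSketch;
  {$mode objfpc}

  const
    BoardWidth = 20;
    BoardSize  = 400;

  var
    IsWall, HasBox: array[0..BoardSize - 1] of Boolean;
    TimeStampOf   : array[0..BoardSize - 1] of Cardinal;
    TimeStamp     : Cardinal = 0;

  procedure CalculatePlayerReachableSquares(PlayerSquare: Integer);
  const
    Offset: array[0..3] of Integer = (-BoardWidth, BoardWidth, -1, 1);
  var
    Queue: array[0..BoardSize - 1] of Integer;
    Head, Tail, Square, Neighbour, d: Integer;
  begin
    Inc(TimeStamp, 2);                   { floors: TimeStamp, boxes: TimeStamp + 1 }
    Head := 0; Tail := 0;
    TimeStampOf[PlayerSquare] := TimeStamp;
    Queue[Tail] := PlayerSquare; Inc(Tail);
    while Head < Tail do
    begin
      Square := Queue[Head]; Inc(Head);
      for d := 0 to 3 do
      begin
        Neighbour := Square + Offset[d];
        if IsWall[Neighbour] or (TimeStampOf[Neighbour] >= TimeStamp) then Continue;
        if HasBox[Neighbour] then
          TimeStampOf[Neighbour] := TimeStamp + 1     { a box the player can reach }
        else
        begin
          TimeStampOf[Neighbour] := TimeStamp;        { a reachable floor square }
          Queue[Tail] := Neighbour; Inc(Tail);
        end;
      end;
    end;
  end;

  var
    s: Integer;
  begin
    { a 20x20 board with walls along the border and no boxes }
    for s := 0 to BoardSize - 1 do
      IsWall[s] := (s < BoardWidth) or (s >= BoardSize - BoardWidth) or
                   (s mod BoardWidth = 0) or (s mod BoardWidth = BoardWidth - 1);
    CalculatePlayerReachableSquares(BoardWidth + 1);
    { "reachable floor" now means TimeStampOf[s] = TimeStamp, and "reachable box"  }
    { means TimeStampOf[s] = TimeStamp + 1; the normalized (e.g. top-left) player  }
    { square is simply the lowest reachable floor index                            }
    WriteLn('timestamp after the call: ', TimeStamp);
  end.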

Typically, there is more than one corral on the board, and often there are adjacent corrals. In other words, there can be several "pockets" on the board that the player cannot reach.

If PI-Corral pruning only could handle single corrals, there would only be a very limited number of cases where it was applicable, like this one:

Figure 1

##########
#._*__@
#__$__
#_$___
#*____
#_____

If there was one more "pocket", the program could not prune anything if it was restricted to single-pocket analysis.

Figure 2

##########
#._*__@
#__$__
#_$$__
#*_$__
#_$$__
#_____

Here there is an extra (single-cell) pocket. Incidentally, this is not a deadlock, but that's not the point. The point is that the PI-Corral algorithm proves that in this position it suffices to generate moves for the boxes in the combined corral, and all other boxes on the board don't need to be considered.

Here is how the program comes to that conclusion. Scenario: The program is about to expand this position. It calls CPRS to calculate the player's reachable squares and to find the reachable boxes as well.

One of the reachable boxes is the one on the topline, and even with the limited "vision" that CPRS provides, the program "knows" that it cannot reach the other side of that box because the timestamps are different for the floors to the right and to the left of the box on the topline. In other words, there is a "pocket" on the other side of the box in the topline that the player cannot reach.

Therefore, the program calls the PI-Corral function in order to analyse the pocket. I'm afraid the pseudo-code doesn't state the very first thing that the PI-Corral analyser must do, namely finding the inner squares and the bordering boxes.

This is accomplished by using CPRS again, but this time it's fed the inner square to the left of the box in the topline, *not* the player's position that was used for the normal calculation of the player's reachable squares.

The PI-Corral analysis is recursive, and it will detect the neighbouring 1-cell pocket and add it to what it calls a "combined" corral.

If the PI-Corral function fails, the returned "lowwater" timestamp hasn't been changed, meaning that the search must explore all the boxes reachable for the player (which were calculated by the first call to CPRS for this position).

If, on the other hand, the PI-Corral function succeeds like it does in Figure 1 and 2, the returned "lowwater" timestamp has been raised, indicating that some of the originally reachable boxes don't need expansion. It's only the boxes participating in the (possibly combined) corral that need to be expanded.

The interesting question is which positions the PI-Corral analysis cannot deal with. Unfortunately, there is a large class of positions, where there is a corral that potentially is a deadlock, but where the PI-Corral analysis cannot narrow the search down towards that path.

This class of "near misses" is all the positions where an outside box precludes the immediate pushing of one of the corral-boxes. Here is an example, which is very typical:

Figure 4

##########
#._*__@
#__$$_
#_$$__
#*_$__
#_$$__
#_____

The only difference from the example above is the added box to the right of the corral, in the third row. Now it isn't possible to push ALL the corral-boxes inside in the next push.

Maybe this is the place to recapitulate that the "I" in PI-Corral refers to the restriction that the boxes in the corral must be pushed INTO or INSIDE the corral only. In other words, the following corral isn't a PI-Corral because one of the boxes can be pushed to a square outside the pocket:

Figure 5

#######
#__$__@
#___$__
#__$___
####

Most of the code in the PI-Corral algorithm deals with detection of whether the boxes can escape, like in figure 5, or if the boxes can only move inside the corral in their next push (which is not the same as saying that they cannot escape in a subsequent push).

One obvious and interesting topic for research is what to do with scenarios like the one depicted in Figure 4. Even though it's not a candidate for PI-Corral pruning, the program may have spent a considerable amount of time to come to that conclusion.

If the program at that time has gathered enough information to know that this almost is a candidate for PI-Corral pruning, it may be worthwhile to perform some sort of deadlock detection analysis, checking if the corral-boxes are trapped. I haven't made any experiments in that area myself yet, but it surely is an interesting topic. (Later, a related "hindsight deadlocks" mechanism has successfully been added to the program. See "Text 9 Dynamic deadlocks - a hindsight approach".)

Also, there are some interesting possibilities for caching such information, in effect building a dynamic database with "yes" and "no" answers about the deadlock status for recurring box-constellations that the program runs into several times during the search.

PI-Corral pruning pseudo-code

Some material about PI-Corral pruning...

  • Statistics for a test of the 107 small SokEvo levels
  • Pseudo-code for the algorithm

The pseudo-code is slightly dated and probably inaccurate, so it should not be implemented 1:1, but reading it is a better way to get acquainted with the mechanism than reading the raw Pascal code in YASS.

The algorithm is complex and can only be understood by drawing a lot of small sketches while reading the code, but once understood and properly implemented, the mechanism makes a huge difference in the number of generated pushes, so it's worthwhile and rewarding to implement it.


Statistics - SokEvo/107 - Forwards search - no packingorder

A. A* search + transposition table: 11974023 positions 51938064 pushes 65 seconds

B. A* search + transposition table + corrals: 4632203 positions 11918161 pushes 21 seconds

C. A* search + transposition table + precalculated deadlocks: 2030736 positions 7011203 pushes 11 seconds

D. A* search + transposition table + precalculated deadlocks + corrals: 1231376 positions 3299797 pushes 7 seconds

It's astonishing that a rather general mechanism like corral-pruning (B. 21 seconds) is nearly as effective as a highly specialized mechanism like the precalculated deadlocks (C. 11 seconds).


PI-Corral Pseudo-code - may be slightly inaccurate

Don't put too much weight on the depicted return-value scheme at the end of the code. It's merely a sketch, and I'm not sure it's absolutely correct.

..CorralPruning(
....in: RecursionDepth__, (0 for base-call)
....in: FloorSquare__,
....io: LowWaterTimeStamp__, 
....out: HasAllBoxesOnGoals__,
....out: IsACombinedCorral__)
..
..
..starting from 'FloorSquare__', calculate corral floors and boxes
..(floors are marked with 'timestamp', boxes are marked with 'timestamp+1')
..
..if the timestamp wrapped around then
....fail
..else
....T = current box timestamp
....HasAllBoxesOnGoals__ = TRUE
....IsACombinedCorral__ = FALSE
....
....for each box B do
......if box B (still) is a member of this (single) corral then
........if the box isn't on a goal-square then 
..........HasAllBoxesOnGoals__ = FALSE
........
........for each direction do
..........N = neighbouring square in this direction
..........O = neighbouring square in the opposite direction
..........if N is a floor outside the current (single) corral 
.............AND
.............O is an empty floor where it's legal to place a box then
.............
.............if O is inside the combined corral then
...............if player can reach N then
.................if O is inside this (single) corral then
...................ok
.................else (remember that this corral depends on the combined corral)
...................IsACombinedCorral__ = TRUE
...................ok
...............else
.................if N is inside the combined corral (floor or box) then
...................IsACombinedCorral__ = TRUE
.................else 
...................if N is an empty floor (no box) then
.....................if CorralPruning(RecursionDepth__+ 1, N, LowWaterTimeStamp__, HABOG,  IsACC) then
........................if IsACC then
..........................HasAllBoxesOnGoals__ = HasAllBoxesOnGoals__ and HABOG
..........................IsACombinedCorral__ = TRUE
........................else
..........................LowWaterTimeStamp__ = newest box-timestamp
..........................HasAllBoxesOnGoals__ = FALSE
..........................IsACombinedCorral__ = FALSE
..........................RETURN from function with Result = True
.....................else
........................fail
...................else
.....................add the outer box at N and all of its connected outer boxes to the  corral
.............
.............else (meaning O is outside the combined corral)
...............if N is an empty floor (no box) then
.................if player can reach N then
...................fail
.................else
...................if N is inside the combined corral (timestamp > LowWaterTimeStamp__)  then
.....................IsACombinedCorral__ = TRUE
...................else
.....................if CorralPruning(RecursionDepth__+ 1, N, LowWaterTimeStamp__, HABOG,  IsACC) then
........................if IsACC then
..........................HasAllBoxesOnGoals__ = HasAllBoxesOnGoals__ and HABOG
..........................IsACombinedCorral__ = TRUE
........................else
..........................LowWaterTimeStamp__ = newest box-timestamp
..........................HasAllBoxesOnGoals__ = FALSE
..........................IsACombinedCorral__ = FALSE
..........................RETURN from function with Result = True
.....................else
.......................fail
...............else (meaning N contains a box)
.................if box at N is outside the combined corral then
...................add the outer box at N and all of its connected outer boxes to the  corral
.................else IsACombinedCorral__ = TRUE
....
....(at this point, the corral can only be opened by using inwards pushes)
....
....if RecursionDepth__ = 0 then
......if HasAllBoxesOnGoals__ then
........fail
......else
........if LowWaterTimeStamp__ < T then
...........LowWaterTimeStamp__ = T (this (possibly combined) corral is the pruning  candidate)
....else
......if HasAllBoxesOnGoals__ then
........IsACombinedCorral__ = TRUE  (otherwise the corral falsely returns to the toplevel)
......else
........if not IsACombinedCorral__
..........AND
..........LowWaterTimeStamp__ < T then
..........LowWaterTimeStamp__ = T (this corral is the pruning candidate)
....
....RETURN with Result = TRUE
..
..end CorralPruning

Usage:

..if CorralPruning(0,CorralFloorSquare, io: LowWaterTimeStamp,_,_) succeeds then 
....boxes with timestamps >= LowWaterTimeStamp belong to the (combined) corral

Putting a game-state on the board

> I was looking at your transposition table again today and noticed that
> you [in YASS] don’t store the box positions in the position record.
> I have just seen your routine “SetPosition” which
> explains this. What an interesting way of doing things!

Yes, I'm also quite pleased with that piece of code myself. The strange thing is that I've never seen exactly this type of code before. Most textbook examples ("toy programs") tend to have such small domains that they can afford to store the complete game-state along with each node in the transposition table. But that won't do for "real-world" tasks like Sokoban.

In search-algorithms based upon depth-first, there is no problem in getting from one position to another because the transition is handled by the recursion:

..DepthFirstSearch
....if this is a terminal node then
.......return a score for this node
....else
.......for each legal move M do
..........do move M
..........DepthFirstSearch
..........undo move M

The "do move" and "undo move" handles the transition from one game state to another, and the nodes are discovered and expanded at the same time, so here it's very easy to handle the transitions.

The situation is quite different when depth-first-search isn't feasible. When nodes are discovered at one time, but expanded later, it takes some work to put a given position on the board.

The easiest solution is, of course, that each node carries the complete game-state, but it's seldom practical to do it that way, because it costs far too much memory for all but the smallest search domains.

Therefore, in practice, there really aren't any other solutions to the problem than the one I've implemented in "SetPosition". (Btw: the name may be bad; English is clearly not my first language, so suggestions for a better name are more than welcome.)

I've never seen this code in other programs, but it's not exactly rocket-science, so there must be a lot of programs out there doing the exact same thing:

..SetPosition(newPosition)
....backtrack:
......from: the current position
......to: the common ancestor of the current position and the new position
....move forward:
......from: the common ancestor
......to: the new position
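
Here is a sketch of the same idea under simplifying assumptions: each node stores its parent, its depth, and the push that created it, and the board is moved from the current node to another node by undoing pushes back to their common ancestor and then redoing pushes forward. All names are mine, the board itself is only simulated with WriteLn, and the pointer-reversal trick mentioned below is replaced by a recursive replay.

  program SetPositionSketch;
  {$mode objfpc}

  type
    PNode = ^TNode;
    TNode = record
      Parent: PNode;
      Depth : Integer;
      FromSquare, ToSquare: Integer;   { the push that created this node }
    end;

  procedure DoPush(FromSquare, ToSquare: Integer);
  begin
    WriteLn('push   ', FromSquare, ' -> ', ToSquare);  { would move a box on the board }
  end;

  procedure UndoPush(FromSquare, ToSquare: Integer);
  begin
    WriteLn('unpush ', FromSquare, ' -> ', ToSquare);  { would move the box back }
  end;

  { replays the pushes on the path from the ancestor down to the node }
  procedure RedoPathFrom(Ancestor, Node: PNode);
  begin
    if Node = Ancestor then Exit;
    RedoPathFrom(Ancestor, Node^.Parent);
    DoPush(Node^.FromSquare, Node^.ToSquare);
  end;

  procedure SetPosition(var Current: PNode; NewPosition: PNode);
  var
    a, b: PNode;
  begin
    a := Current; b := NewPosition;
    { walk both paths upwards until they meet at the common ancestor, }
    { undoing the pushes on the current position's side as we go      }
    while a <> b do
      if a^.Depth >= b^.Depth then
      begin
        UndoPush(a^.FromSquare, a^.ToSquare);
        a := a^.Parent;
      end
      else
        b := b^.Parent;
    RedoPathFrom(a, NewPosition);
    Current := NewPosition;
  end;

  var
    Root, NodeA, NodeB, Current: PNode;
  begin
    New(Root);  Root^.Parent := nil;   Root^.Depth := 0;
    Root^.FromSquare := 0; Root^.ToSquare := 0;
    New(NodeA); NodeA^.Parent := Root; NodeA^.Depth := 1;
    NodeA^.FromSquare := 10; NodeA^.ToSquare := 11;
    New(NodeB); NodeB^.Parent := Root; NodeB^.Depth := 1;
    NodeB^.FromSquare := 20; NodeB^.ToSquare := 21;
    Current := NodeA;
    SetPosition(Current, NodeB);   { prints: unpush 10 -> 11, then push 20 -> 21 }
    Dispose(NodeB); Dispose(NodeA); Dispose(Root);
  end.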

What surprised me is that it seems to work reasonably fast in practice, despite the necessary bookkeeping, and even despite the pointer-reversal I use (the parent field) in order to avoid wasting 4 bytes on a forward link.

My guess is that the tree tends to be explored with some degree of locality, and therefore backtracking seldom moves more than a few steps backwards before progressing to the new position.

Maybe another important contribution to the speed is that since nodes probably are scattered all over memory, pointer-reversal may actually be essentially free. The nodes must be loaded into the cache anyway, so spending a little time on pointer-reversal doesn't matter, given that it takes place in memory that is already loaded.

Calculating player's reachable squares + packing-order search

Goal-oriented search

> So the tree would have the normal A* fringe but with bits poking out!
> What I did along these lines was build a function which could create longer moves,
> moving a box all the way to a goal square in fact. These moves are just
> put on the open list like all the other single step moves but will be at the
> front since their score must be superior (or at least no worse than) other moves.
> [...]
> So there’s a lot of processing going on there. It must be possible to do this more
> efficiently (e.g. without having to fully calculate man reachable squares every time).
> [....]
> So the program would be running normal A* but then having a peek to see if
> a box (or multiple boxes) can be moved all the way to the end. If so,
> it is moved there and things continue for a little while and then another
> check etc. I haven't implemented this kind of control so have no idea how
> it would work.

What you've described here is essentially the mechanism I've implemented in YASS for its rudimentary packing-order search (with 1 or 2 twists that make it a great success), so I can warmly recommend that you work further along this line of thought.

The key observation is that it can be implemented so it reduces the amount of work rather than increasing it, i.e., there are fewer calls to the function that calculates the player's reachable squares. (I'll call the function CPRS() in the following.)

It's actually a good story, and I'd like to start telling it from a different angle. Most of it is, of course, well-known material, but even so, I've tried to write it as a coherent story that doesn't require too much previous exposure to Sokoban solver programming.


Combining depth-first search with the open-list in A*

> Yes depth first search is so appealing with the way it can move
> forward and back up because everything is linked.
> You just keep a few arrays of data around and this is all
> very fast and takes up a very small and constant amount of memory.

That's right! The good news is that a Sokoban solver can combine the more efficient depth-first search (DFS) with the open-list based A* search, and that's what I do in YASS with great success (I think; I base this on logical reasoning more than empirical measurements).

If we look at DFS for Sokoban, one of the first things we notice is that in order to generate the legal successors for a node, we need to know the player's reachable squares in the given position. So a first rough sketch of DFS may look like this:

..DFS1(Node)
....call "calculate player's reachable squares", that is, CPRS()
....for each legal move M
......do move M
......call DFS1(Successor)
......undo move M

Since CPRS() must use timestamps in order to avoid initializing the calculation on each invocation, it's already given that its output is stored in a global variable, not in a local area on the stack belonging to the current invocation of DFS.

It's also clear that if we want to generate the moves one at a time (like in the sketch DFS1), then the player's reachable squares must survive the recursive call to DFS1. This means we need a stack of global CPRS-arrays, one array for each recursive call to DFS1. So a more fleshed-out sketch looks like DFS2:

..DFS2(Node, Depth)
....call CPRS(Depth), meaning CPRS() delivers the reachable squares in array[Depth]
....for each legal move M
......do move M
......call DFS2(Successor, Depth + 1)
......undo move M

Each time we do a move in the search, we must first check if this position has been visited before. That is what the transposition table is for.

Sokoban has the very unfortunate property that after a move, we cannot directly look that position up in the transposition table. Before we can do that, we need to calculate a normalized (typically top-left) player-position, which in effect amounts to calling CPRS() an extra time. This is a costly operation, but there is nothing to do about that. The next sketch of DFS shows how it can be done:

..DFS3(Node, Depth)
....call CPRS(Depth), meaning CPRS() delivers the reachable squares in array[Depth]
....for each legal move M
......do move M
......
......call CPRS(Depth + 1), which as a side-effect calculates normalized player-position
......if boxes and normalized player-position represents a new position then
.........add the new position to the transposition table
.........call DFS3(Successor, Depth + 1)
......
......undo move M

This sketch reveals an obvious improvement that minimizes the damage from the "get normalized player position" requirement: we have already calculated the player's reachable squares that we need for generating the successor positions in the recursive call to DFS3.

Therefore, we change DFS so it doesn't start by calling CPRS. Instead, it becomes a precondition that CPRS already has taken place for the given depth before DFS is called. This gives us the sketch DFS4:

..DFS4(Node, Depth)
....precondition: CPRS() has been called for "Depth"
....for each legal move M
......do move M
......
......call CPRS(Depth + 1), which as a side-effect calculates normalized player-position
......if boxes and normalized player-position represents a new position then
.........add the new position to the transposition table
.........call DFS4(Successor, Depth + 1)
......
......undo move M

That's a big relief! Instead of calling CPRS() twice - the most expensive commodity in the Sokoban solver - for each position, we are back at one call per position.

But this is for a depth-first search. For an A* search, nodes are not generated and expanded at the same time like in DFS, so in A*, the base algorithm forces us to call CPRS() twice for each position:

..A*1 (very rough sketch)
....put start position on the priority queue Q
....while Q isn't empty
......Node = most promising node on Q
......remove Node from Q
......
......set up the board so it matches the position represented by Node
......
......expand the node this way:
........call CPRS(0), which delivers the player's reachable squares in array[0]
........for each legal move M
..........do move M
..........
..........call CPRS(1); side-effect: calculates normalized player-position
..........if boxes and normalized player-position represents a new position then
.............add the new position to the transposition table
.............add the new position to the priority queue, using the proper score
..........
..........undo move M

We notice that the node expansion part of the code is identical to DFS3, i.e., the un-optimized DFS that calls CPRS() twice.

Another observation is that in Sokoban, pushing a box will often leave the (simple) score unchanged. Making a push increases the push count by 1, and if the box moved closer to a goal, the lower bound decreases by 1, so in effect the score is unchanged.

For this group of pushes, there is no need to go through all the trouble of putting the positions on the priority queue. We already know that their score entitles them to be at the front of the queue together with other nodes having the same score as their parent positions.

So instead of putting these positions on the A* priority queue we can expand them recursively exactly like DFS4, saving an invocation of CPRS for that position. This gives us A*2:

..A*2 (very rough sketch)
....put start position on the priority queue Q
....while Q isn't empty
......Node = most promising node on Q
......remove Node from Q
......
......set up the board so it matches the position represented by Node
......
......call CPRS(0), which delivers the player's reachable squares in array[0]
......call expandNode(Node, 0)
..end
..
..expandNode(Node, Depth)
....precondition: CPRS() has been called for "Depth"
....for each legal move M
......do move M
......
......call CPRS(Depth + 1), which as a side-effect calculates normalized player-position
......if boxes and normalized player-position represents a new position then
.........add the new position to the transposition table
.........
.........if the new position has the same score as its parent then
............call expandNode(Successor, Depth + 1)
.........else
............add the new position to the priority queue, using the proper score
.........
......undo move M

Here the node expansion takes the score into consideration when it decides what to do with a new position: If the new score is the same as the parent score, then the new position is expanded immediately by a recursive call to "expandNode()", saving a costly call to CPRS()! If the new score is worse than the parent score, then the new position is put on the priority queue, just as normal for an A* search.

YASS has always done it that way. I can't remember how often this saved call to CPRS() kicks in, but there is no doubt that it's an elegant and extremely valuable contribution to the search, in particular because it doesn't cost anything, neither in running time nor in programming time or program complexity. It's a win-win situation.

It's the same mechanism which drives the packing order search in YASS.

Re-using the calculated player's reachable squares

> I spent a small amount of time writing a routine to decide whether the
> players reachable squares could be just updated with the current move
> in question or whether the full
> calculation needs to be done.
> I’m not seeing a straightforward answer as yet.

I'm happy to tell that there is a straightforward answer to that question! I found it some years ago by drawing the 16 situations that can arise after pushing a box. Matthias Meger then topped it after having seen my drawings, and he boiled it down to a wonderful and extremely elegant algorithm:

..AFTER a push:
....if the player could *not* reach the new box-position BEFORE the push
.......OR
.......the box AFTER the push is surrounded by walls or boxes 
.......to the left and right (seen from the pushing direction)
.......THEN
.......it's possible to update the existing set of player's reachable squares
.......instead of calculating all the squares from scratch
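
The rule translates almost one-to-one into a boolean test. Here is a sketch with made-up identifiers; "WasReachableBeforePush" stands for the timestamp test against the player's reachable squares as they were before the push, and "IsWallOrBox" describes the board after the push.

  program ReuseCheckSketch;
  {$mode objfpc}

  const
    BoardWidth = 20;
    BoardSize  = 400;

  type
    TDirection = (dUp, dDown, dLeft, dRight);

  var
    IsWallOrBox           : array[0..BoardSize - 1] of Boolean;
    WasReachableBeforePush: array[0..BoardSize - 1] of Boolean;

  { the squares to the left and right of a square, seen from the pushing direction }
  procedure SideSquares(Square: Integer; Direction: TDirection; out Left, Right: Integer);
  begin
    if Direction in [dUp, dDown] then
    begin
      Left := Square - 1;          Right := Square + 1;
    end
    else
    begin
      Left := Square - BoardWidth; Right := Square + BoardWidth;
    end;
  end;

  function CanReuseReachableSquares(NewBoxSquare: Integer; Direction: TDirection): Boolean;
  var
    l, r: Integer;
  begin
    SideSquares(NewBoxSquare, Direction, l, r);
    Result := (not WasReachableBeforePush[NewBoxSquare])    { first condition  }
              or (IsWallOrBox[l] and IsWallOrBox[r]);       { second condition }
  end;

  begin
    { toy data: a box pushed upwards to square 105, with walls/boxes on both sides }
    IsWallOrBox[104] := True;
    IsWallOrBox[106] := True;
    WasReachableBeforePush[105] := True;
    WriteLn(CanReuseReachableSquares(105, dUp));   { prints TRUE }
  end.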

This is a truly remarkable result, I think, at least theoretically. The downside is that it hasn't been possible to cash in on this insight.

The problem is that using the old player's reachable squares (from before the push) as basis for calculating the new ones after the push requires copying the contents of the old array to the new array, and in practice, this memory transfer turns out to be just as costly as recalculating the squares from scratch.

The statistics emitted by the console-program version of YASS show how often re-use would have been possible ("Re-use player's reachable squares?"), and as far as I recall, it's as much as 50% of the positions in a typical Sokoban game, so it's really sad that there hasn't been any progress in cashing in on this insight.

Below is an excerpt of the mail where I presented the idea and the drawings to Matthias Meger, based upon which he boiled it down to the algorithm presented above.


When a box is pushed 1 square, there are 4 neighbouring squares around it:

1$2 the box is pushed downwards
3_4

Each of the 4 neighbours can either be

  • empty squares, or
  • non-empty squares = walls or boxes

This means there are 16 different states for the immediate surroundings of the box:

4 squares x 2 states (empty/non-empty) = 16 different states.

The trick is that we can observe for each of these states, how the push opens and closes paths to the empty squares on the board.

When we enumerate all 16 states, it turns out (to my big surprise) that there are many of them, where it's not necessary to do a full calculation of the player's reachable squares. Instead, it often suffices to continue the calculation, based on the reachable squares before the box is pushed.

There are also a few states where it's not even necessary to continue the calculation. The existing calculation will do, except, of course, that the 2 squares with the player and the box must be updated. I've named that situation COPY in the listing below, but in practice, there is no need to implement it literally. It's easier and good enough simply to use CONTINUE in that situation too.

Here is the complete listing of the 16 states:

Legend:
|             : unspecified, that is, "don't care"
_             : empty square
@             : player
$             : box
X             : non-empty square, i.e., a box or a wall
?,%           : squares that need further analysis
1..4          : |@|   the 4 squares around the box; left-side = 1,3, right side = 2,4;
                1$2   when the numbers  are referred to in the table, 
                3_4   they are always empty squares.
CONTINUE      : start with current player's reachable square and continue calculation
COPY          : re-use current player's reachable squares, with small modifications
FULL          : full calculation required for finding the player's reachable squares
isReachable   : if TRUE, then the square is reachable before the push

The figures show the position BEFORE the box is pushed 1 square downwards

|@|      00: ____?
_$_      if isReachable(?) then FULL else CONTINUE
___
|?|

|@|      01: X___
X$_      if isReachable(3) then FULL else CONTINUE
3__


|@|      02: _X__
_$X      if isReachable(4) then FULL else CONTINUE
__4


|@|      03: XX__
X$X      if isReachable(3) then FULL else COPY
3__


|@|      04: __X_?
_$_      if isReachable(?) then FULL else CONTINUE
X__
|?|


|@|      05: X_X_?
X$_      if isReachable(?) then FULL else CONTINUE
X__
|?|


|@|      06: _XX_?
_$X      if isReachable(?) then FULL else CONTINUE
X__
|?|


|@|      07: XXX_?
X$X      if isReachable(?) then FULL else COPY
X__
|?|


|@|      08: ___X?
_$_      if isReachable(?) then FULL else CONTINUE
__X
|?|


|@|      09: X__X?
X$_      if isReachable(?) then FULL else CONTINUE
__X
|?|


|@|      10: _X_X?
_$X      if isReachable(?) then FULL else CONTINUE
__X
|?|


|@|      11: XX_X?
X$X      if isReachable(?) then FULL else COPY
__X
|?|


|@|      12: __XX
_$_      CONTINUE
X_X


|@|      13: X_XX
X$_      CONTINUE
X_X


|@|      14: _XXX
_$X      CONTINUE
X_X


|@|      15: XXXX
X$X      COPY
X_X

YASS 2.66 "war story"

Here in the weekend I had a very pleasant "peak moment" in my life as a programmer; one of those events that at best happens a few times in a life-time where a program suddenly makes a giant leap forward, as when a child for the first time begins to walk or to talk.

By practically changing one line of code, my YASS Sokoban solver program jumped from solving 27 to 43 (and later 46) of the levels in the 90-level XSokoban set! (The current YASS version solves 75 XSokoban levels.)

The XSokoban/90 set has been the standard test for Sokoban solvers since Junghanns wrote a thesis about his "Rolling Stone" Sokoban solver. The set comprises 50 original levels by Thinking Rabbit from the first release of Sokoban back in the 1980s and 40 extra levels by various authors.

On the Sokoban Wiki, there is a comparison between some of the available solvers, e.g., BoxSearch, JSoko, Takaken and YASS:

http://sokobano.de/wiki/index.php?title=Solver_Statistics

YASS is not the best solver around, but overall I'm quite satisfied with its results.

All the XSokoban levels are goal-room themed levels, so the packing order machinery is the most important part of a solver that aspires to rank among the best.

The Takaken program has by far the best packing order mechanism and it solves the astounding number of 86 XSokoban levels. For comparison, Junghanns' thesis program "Rolling Stone" solved ~55 levels after several years in the making.

On the Wiki, BoxSearch is currently credited for 42 XSokoban solutions, which also is a very good result. I'd say that 40+ is highly respectable for a solver program, and 60+ is "master class". Currently, YASS solves 75 XSokoban levels.

YASS prior to version 2.66, on the other hand, only scored 27 XSokoban solutions in the test on the Wiki. That was just enough to avoid total embarrassment, but it was definitely nothing to be proud of.

Last week I took up solver programming again. I had always known that there were some levels in the XSokoban collection with such a simple packing order that YASS's failure could not be the result of packing order problems, so I simply assumed it failed on them because of deadlock situations.

I was working on improvements in the deadlock detection mechanism, and after a few days last week I had achieved a moderate success in that area. During the analysis of what was going on, I suddenly saw that YASS failed on some of the simple goal-room themed levels - not because of deadlocks - but because the program shot itself in the foot with a misguided position evaluation function!

The evaluation function controls in which order the positions are selected for expansion, and it must ensure that the most promising positions always are analysed first.

The evaluation function I had implemented sounded deceptively correct. For goal-room themed levels, a high priority is given to moves that "bring the box home", that is, moves that bring a box all the way to its final position from wherever the box happens to be on the board.

In practice, the program simply counted such a move as a single push, no matter what the distance was between the current position of the box and the final goal. In effect, this gives such moves a bonus (a discount) in the evaluation function, so these moves are analysed and expanded first.

An example:

A box moves along the path from A to G: ABCDEFG, where G is the final goal cell. If getting the box all the way from A to G is a good thing, then the next best thing must be to get it to F. Therefore, the move A-G gets the highest bonus, and the move A-F gets a smaller bonus, but even so, the bonus for the move A-F is almost as big because it almost brings the box to the final destination.

I had fallen into exactly that logical trap! I suppose it all sounds deceptively correct when it's presented as I have done here, but in reality it's fatally flawed! It's only the full move A-G that should get a bonus. The other moves must not receive any form of bonus, because that can lead to a "stampede" towards inferior positions.

A simple illustration of this is to say that the move ABCDEFG fills the first goal in a small goal-room area on the board. As an example, the score is 100 before making the move:

ABCDEFG
########
$__....# Score: 100
########

After the ABCDEFG move, the position is:

ABCDEFG
########
___...*# Score: 100
########

The score is still 100 because the bonus outweighs the cost of the 6 pushes. The position after the move ABCDEF gets a slightly smaller bonus, so that position has a score of, say, 110 (smaller scores are better):

ABCDEF
########
___..*.# Score: 110
########

Now let's turn to the situation after making the full move ABCDEFG. Clearly, this is the most promising candidate, so it's selected for expansion first. Often, the situation elsewhere on the board is such that the next moves won't improve the score. The boxes cannot move straight towards the goals, so the score gets worse before it can get better.

So after expanding the successors of the ABCDEFG move, it will often happen that their scores are numerically worse than the score of the shorter "near-miss" ABCDEF move, and therefore that position is suddenly at the top of the list and is selected for expansion.

This can easily lead to positions where boxes "stampede" towards the inferior ABCDEF position like this:

ABCDEF
########
___.**.# 
########

and further:

ABCDEF
########
___***.# 
########

... simply because they numerically have a better score than all the successors of the full ABCDEFG position, and that was what happened in YASS prior to version 2.66.

This particular example is not entirely correct because it's a simple tunnel, so all successors of ABCDEFG will be just as good as successors of the shorter ABCDEF, but the simple example is good enough to illustrate this "stampede" mechanism, which is there in the general case for non-tunnel goal rooms.

So by simply changing a single line in the evaluation function (paired with the contributions from improvements in the deadlock detection and the new tunnel mechanism by Matthias Meger), YASS suddenly could solve 46 instead of 27 XSokoban levels!

Solving 46 XSokoban levels is very respectable, I think, and it's worth noting that this result is achieved with only the most primitive goal room ordering mechanism, before the program got its more sophisticated packing order calculation, which - with good help from further improvements of the deadlock detection - brought the score up to 75 XSokoban solutions.

Dynamic deadlocks - foresight approach

The ideal situation for a Sokoban solver would be never to expand a position which has a deadlock in it. Unfortunately, this is not possible because answering the question "is this a deadlocked position?" requires the exact same search as the answer was supposed to avoid.

One of the things a solver can do to limit the number of nodes expanded along a deadlocked path is to analyse the "corrals", that is, the "pockets" on the board before committing to a more thorough search of the game-graph below the position selected for investigation.

When a position is selected for expansion, the program finds all the corrals on the board and checks whether they are deadlocks. The deadlock/not-deadlock status for each corral is stored in a corral-database, so the program only needs to calculate the status once for each corral. Even so, it's a costly operation, but because avoidance of deadlock situations is at a premium, it pays off, at least asymptotically.

To speed up this corral analysis, the program only calculates the deadlock status for corrals with 1 or 2 rooms, and with a limited number of boxes. Furthermore, the corral-database doesn't register the deadlock/non-deadlock status for the corral. Instead, the corral-database only registers whether the corral has been calculated or not.

That way, an exact corral identification is unnecessary, so lookups only require a simple hash-key and not the exact box squares. False positives and negatives only affect the decision about calculating the deadlock status; they don't affect the answer to the "is this corral a deadlock?" question.
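
A sketch of such a "has this corral been analysed before?" table is shown below (sizes and names are illustrative; YASS's actual hashing is not shown). Only the key's presence is stored, never the deadlock answer, which is exactly why an occasional hash collision is harmless here.

  program CorralSeenSketch;
  {$mode objfpc}

  const
    TableSize = 1 shl 16;     { power of two, so "and" can mask the index }

  var
    SeenKeys: array[0..TableSize - 1] of QWord;   { 0 = empty slot }

  { returns True the first time a key is seen; the caller then runs the costly }
  { corral deadlock analysis exactly once per (hash of a) corral.              }
  { (a completely full table is not handled in this sketch)                    }
  function FirstTimeSeen(Key: QWord): Boolean;
  var
    Index: Integer;
  begin
    if Key = 0 then Key := 1;                     { reserve 0 for "empty" }
    Index := Integer(Key and (TableSize - 1));
    while (SeenKeys[Index] <> 0) and (SeenKeys[Index] <> Key) do
      Index := (Index + 1) and (TableSize - 1);   { linear probing }
    Result := SeenKeys[Index] = 0;
    SeenKeys[Index] := Key;
  end;

  begin
    WriteLn(FirstTimeSeen(123456789));   { TRUE  - analyse this corral }
    WriteLn(FirstTimeSeen(123456789));   { FALSE - already analysed    }
  end.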

Analysing all corrals on the board for being deadlocks before a game state is expanded is a foresight approach. It's costly, so it's a welcome contribution to the dynamic deadlock analysis that it can be supplemented with a hindsight approach, which is quite cheap.

Dynamic deadlocks - hindsight approach

Deadlock detection in YASS is based upon deadlock situations where each deadlock situation is represented by a capacity, and each time a box moves around on the board, the capacities for all deadlocks are updated. This entails that the cost of the deadlock detection for a push is proportional to the number of deadlocks that the "from-square" and the "to-square" are involved in. Practice has shown that this cost is very reasonable up to at least 1000 deadlocks per level, but as could be expected it doesn't scale well with more deadlocks than that.

At first, YASS only had its precalculated deadlock situations, which are calculated when the level is loaded before the search starts. Precalculating deadlocks works fine, but it cannot find all the deadlock situations which occur during the search. The hindsight approach was the first of the dynamic deadlock calculation mechanisms added to the program.

For dynamic deadlock analysis, it comes in extremely handy, that YASS has implemented Matthias Meger's ingenious PI-corral idea. Briefly explained:

  • Corrals are "fences" on the board built by a set of boxes.
  • I-corrals are fences where all participating boxes only can be pushed inwards in the next push.
  • PI-corrals are I-corrals where the player has access to all the outer boxes in the corral, so all their inwards pushes are legal moves in the current position. In other words, no outside box on the board blocks the player's access to the corral boxes.

When there is a PI-corral on the board, the solver can prune all other box-pushes. The search narrows down to the corral boxes only. This follows from the facts:

1) The corral has to be opened sooner or later anyway (provided not all boxes are located at goal squares, of course).

2) Pushing a box inwards, into the corral, doesn't interfere with the ability to push outside boxes later. (This entails that PI-corral pruning even has the extremely pleasant property that it doesn't interfere with the push-optimality of a solution.)

The PI-corral calculation offers an elegant and relatively easy way to add new deadlock situations to the capacity-counted deadlocks in YASS.

When the program selects a position for expansion, it will sometimes be the case that there are no legal successor pushes. In other words, the position is a deadlock, and moreover, it means that the deadlock detection has been inadequate because otherwise, the position would already have been labelled as deadlocked and the position would not have been selected for expansion.

Running into such a dead-end gives the program a chance to learn something, and moreover, it's almost cost-free. The program expanded the position because it didn't know better, and it has spent the time expanding it, so any knowledge that can be harvested now is a very welcome bonus for all the trouble.

It has always puzzled me why the Sokoban literature and the publicly available programs don't seem to exploit this seemingly valuable cost-free piece of information. A closer examination shows, however, that the information isn't as valuable as it may look at first sight.

The problem is that just because there are no successors to a position, it doesn't really mean there is a deadlock. It's more correct to say that it's a dead-end, and not necessarily a deadlock.

For instance, it's a dead-end if no successor positions were generated - not because there weren't any legal pushes - but because better paths to the resulting positions already have been found.

After filtering this type of dead-ends out, one might think that at least dead-ended positions with a PI-corral must be deadlocks, but this isn't the case either. A PI-corral only means that the legal pushes can be narrowed down to the corral boxes. An example, showing only an excerpt of a board:

__########
__$______#
___$_____#
__########

It can easily happen that the search algorithm reports that the position reached by opening the corral - pushing the bottom box inwards - is a deadlock:

__########
__$______#
___@$____#
__########

The reason is that due to transpositions, this position may already be known to be a deadlock, not because of the 2 depicted boxes, but because of the situation elsewhere on the board.

So the bottom line is that a PI-corral isn't necessarily a deadlock just because no legal successors were generated during expansion of the position. The PI-corral may be a deadlock, but the program cannot know that without analysing it, and what we're aiming for at the moment is cost-free (or more correctly low-cost) deadlock detection.

There is one small class of PI-corrals left where we can be sure there is a deadlock without further analysis: when there are no legal pushes at all, according to the rules of the game and according to the already calculated deadlock situations.

In practice, this small class of PI-corrals turns out to be a most welcome addition to the precalculated deadlock situations in YASS.

To sum up, here is what is going on:

  • The PI-corral detection narrows the search down to a set of fence boxes, if there is a corral of this type on the board.
  • If there are no legal pushes, the position is a dead-end. When the position contains a PI-corral, the program knows that the dead-end is caused by the boxes which are members of the corral, and this set of boxes is (almost) guaranteed to constitute a deadlock situation that is worthwhile to add to the database.

So the program can be almost certain that it's not wasting its time if it tries to add a new deadlock situation to the database based on the PI-corral boxes.

In practice, it has turned out to be necessary to limit these dynamically created deadlock sets so that only candidates with at most 5-6 boxes are added. Otherwise, the database blows up with too many special-case positions rather than general deadlock situations that occur in a larger number of positions.
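
As a rough illustration of when a dead-end becomes a deadlock candidate, consider the following Python sketch. The helper names (has_legal_pushes, find_pi_corral) and the exact checks are assumptions made for this text, not YASS's code; the candidate is still handed over to the normal deadlock validation afterwards.

def harvest_candidate(position, max_boxes=6):
    """Return the set of corral boxes to hand to the deadlock validator,
    or None if this dead-end should not be harvested."""
    # Only genuine dead-ends qualify: no legal pushes at all, according to
    # the rules of the game and the already known deadlock situations.
    if has_legal_pushes(position):            # assumed helper
        return None
    corral = find_pi_corral(position)         # assumed helper
    if corral is None:
        return None                           # dead-end, but the cause is unclear
    if len(corral.boxes) > max_boxes:
        return None                           # too special-cased to be worth storing
    return corral.boxes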

The "Original 9" level is a good example, showing how it works and how much it means to YASS having these dynamically calculated hindsight deadlocks in the program. Previously, the program couldn't solve the level, and the best position it found was this one:

Original 9 by Thinking Rabbit, deadlocked position

__________#######
__________#__.**#
______#####__.**#
______#______._*#
______#__##__.+*#
______##_##__.**#
_____###_########
_____#___$_######
######____$_#####
##___#___$$_#___#
##_________$__$_#
######______#####
_____#______#____
_____########____

It's easy to see what has happened here. The packing order mechanism has brought most of the boxes home. The deadlock in the bottom-right part of the board is not a PI-corral, unfortunately. It's only an I-corral because there is an outside box (the left of the 2 "$$" boxes), and therefore, the search didn't focus on the corral boxes but allowed the packing order search to go ahead and bring all the other boxes to their final positions.

None of this changes by having the new dynamic deadlocks in the program. The program will still be fooled by the deadlock and generate all the pushes leading to the depicted position, but the important new difference is what happens next. After the next push, where the program pushes the outside box away, this position occurs on the board:

__________#######
__________#__.**#
______#####__.**#
______#______._*#
______#__##__..*#
______##_##__.**#
_____###_########
_____#___$_######
######____$_#####
##___#___@$_#___#
##_______$_$__$_#
######______#####
_____#______#____
_____########____

Both with and without dynamic deadlocks, the program now discovers the PI-corral, and consequently, that this is a dead-end and a deadlock.

Without dynamic hindsight deadlocks, that discovery had no impact on the search. All the packing-order pushes along the path have almost as good a score as the first depicted position, so they would be expanded next without the program realizing that they all contain a deadlock; in effect, the program wasted all its time searching futile paths.

With the new hindsight deadlock mechanism, this corral is now added to the deadlock-database, and no positions are expanded if they are deadlocked according to this new deadlock set. The result is that YASS solves the level in a few milliseconds.

It's important to notice that adding a new dynamic deadlock set has no immediate consequences for all the stored positions. It would be an overwhelming task for the program to investigate all stored positions to see if they are deadlocked according to the newly added deadlock set. Fortunately, stopping the search and performing such a full recalculation isn't necessary.

All that is needed is a small modification of the code which puts a position on the board when it's selected for expansion. Previously, the program could be sure that a position never was a deadlock according to the deadlock-database, so when it performed the moves along the path to the currently selected position, it updated the capacities of the deadlock situations without further ado. A very simplified sketch is like this:

..for each push along the path to the currently selected position do
....update deadlock sets capacities based on the "from" and "to" square

The new situation is that a position selected for expansion may very well be deadlocked. The program must ensure that the position isn't really expanded.

This can be accomplished by letting the code monitor the capacity updates:

..for each push along the path to the currently selected position do
....for each deadlock capacity that is updated do
......if this is a precalculated set then update as usual (such a set never deadlocks a position that made it into the queue)
......else
.........if it's a deadlock after the capacity has been updated then
............set a "deadlocked!" flag
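
In a slightly more concrete form, the monitored capacity update might look like the Python sketch below. The square bookkeeping and the field names are assumptions made for illustration; the real code in YASS differs (see also the ERRATA at the end).

from dataclasses import dataclass
from typing import List, Set, Tuple

# A push is (from-square, to-square) of the pushed box; illustrative only.
Push = Tuple[int, int]

@dataclass
class DeadlockSet:
    boxes: Set[int]      # squares of the deadlock pattern
    capacity: int        # free pattern squares left before the set triggers
    precalculated: bool  # True for the sets calculated before the search

def replay_push(push: Push, deadlock_sets: List[DeadlockSet]) -> bool:
    """Update the capacities for one push; return True if a dynamically
    created set was filled, i.e. the position being set up is deadlocked."""
    from_square, to_square = push
    deadlocked = False
    for dset in deadlock_sets:
        if from_square in dset.boxes:
            dset.capacity += 1     # a box left the pattern
        if to_square in dset.boxes:
            dset.capacity -= 1     # a box entered the pattern
            # Precalculated sets never deadlock a position that was allowed
            # into the OPEN queue, so only the dynamic sets are monitored.
            if (not dset.precalculated) and dset.capacity <= 0:
                deadlocked = True
    return deadlocked

def set_position(path: List[Push], deadlock_sets: List[DeadlockSet]) -> bool:
    """Replay the pushes along the path to the selected position; return
    True if any push triggered a dynamically created deadlock."""
    deadlocked = False
    for push in path:
        # Every push must be replayed even after a deadlock has been seen,
        # so that all capacity counters stay consistent with the board.
        if replay_push(push, deadlock_sets):
            deadlocked = True
    return deadlocked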

The top-loop code for an A* algorithm typically looks like this:

..while the OPEN queue isn't empty
....retract the most promising node from the OPEN queue
....put it on the board
....expand the node, i.e., generate its successors

The only necessary modification here is adding a test to see if the position turned out to be a deadlock, in which case the position isn't expanded:

..while the OPEN queue isn't empty
....retract the most promising node from the OPEN queue
....clear the "deadlocked!" flag
....put the position on the board
....if the "deadlocked!" flag wasn't raised then
.......expand the node, i.e., generate its successors

So the runtime cost for having the new "hindsight dynamically created deadlocks" in the program is roughly one extra "if" per updated capacity per push when a position is selected for expansion. That's not bad, and it works very well in practice.

There is one little twist to this mechanism that is worth mentioning from a practical programming point of view.

The code that puts a position on the board (called "SetPosition" in YASS) walks up and down the paths among the positions stored in the transposition table.

When a new dynamically created deadlock is added to the database, there is no guarantee that the positions subsequently selected for expansion make "SetPosition" backtrack far enough to pass a push that triggers the "deadlocked!" flag, even if the positions really are deadlocked and the capacity of the violated deadlock set correctly indicates so. The program just doesn't know that, because it would, of course, be a terrible waste if the program had to consult each capacity to see whether it indicates a deadlock.

Example: A new dynamically created deadlock set is created after the pushes "abcdefgh", but in reality, the deadlock situation already occurred after "abcd". The program just didn't notice it before "abcdefgh".

Now let's say the next position selected for expansion (after having discovered the deadlock) is the position after the pushes "abcdexyz". To put that position on the board, the program backtracks from "abcdefgh" to the common root "abcde", and then proceeds with "xyz". This is a deadlock, but the flag wasn't triggered because "SetPosition" didn't backtrack to the point "abcd" where the flag would have been raised.

To ensure that the flag is raised, the program needs to raise yet another flag when a new dynamic deadlock is added to the database. The main A* loop uses this extra flag to trigger a full backtrack to the starting position before "SetPosition" proceeds towards the position selected for expansion:

..while the OPEN queue isn't empty
....retract the most promising node from the OPEN queue
....
....if the "backtrack!" flag is raised then
.......put the start position on the board
.......clear the "backtrack!" flag
....
....clear the "deadlocked!" flag (see ERRATA below)
....put the position on the board
....if the "deadlocked!" flag wasn't raised then
.......expand the node, i.e., generate its successors

The 3 new lines starting with "if the backtrack! flag..." complete the algorithm. In the example, the program backtracks all the way back to the starting position before it proceeds with all the pushes "abcdexyz" leading to the selected position; after "abcd", the "deadlocked!" flag is raised, and the main A* loop notices that the position should not be expanded.
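
Putting the pieces together, the modified main loop could be sketched in Python roughly as follows. This is only an illustration built on the assumptions above; the helper names are invented for this sketch, and as the ERRATA at the end explains, the real program uses a counter rather than a boolean flag.

def a_star_main_loop(open_queue, board, deadlock_sets):
    backtrack_needed = False
    while not open_queue.is_empty():                   # assumed interface
        node = open_queue.pop_most_promising()         # assumed interface
        if backtrack_needed:
            board.set_start_position()                 # full backtrack first
            backtrack_needed = False
        # Putting the position on the board replays the pushes along its
        # path, updating the capacities, and reports whether a dynamically
        # created deadlock set was filled on the way (cf. set_position above).
        if board.set_position(node, deadlock_sets):    # assumed interface
            continue                                   # deadlocked: don't expand
        successors = board.expand(node)                # generate successor pushes
        if not successors:
            corral = board.find_pi_corral(node)        # assumed helper
            if corral is not None:
                new_set = validate_deadlock_candidate(corral)   # assumed helper
                if new_set is not None:
                    # Harvest the dead-end as a new dynamic deadlock set and
                    # force a full backtrack before the next expansion.
                    deadlock_sets.append(new_set)
                    backtrack_needed = True
        for successor in successors:
            open_queue.add(successor)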

Until now I've presented it as if adding the "no-legal-pushes PI-corral situations" as new deadlocks is a "no-brainer" for the program that doesn't require any analysis.

This is not entirely true because of the practical limitations in the implementation of the capacity-counted deadlocks, most importantly that the deadlock situations must be 1) independent of the player position, or 2) limited in such a way that the direction of the last push is enough to infer that the player is outside the boxes.

Therefore, the program performs the same thorough investigation of dynamic deadlock candidates as of the precalculated ones, proving that the candidates really are deadlocks before they are added to the database. Since a dynamic candidate is only calculated once, and since it is a valid candidate in most cases, very little time is wasted on validating the dynamically created candidates.

A final twist is that it's possible to soften the "no-legal-pushes-at-all" constraint for the dynamic deadlock situations. Often, when there are no legal pushes in a deadlocked position, the position has been painting itself into a corner for several pushes. In that case, YASS adds all the positions along the doomed path as new deadlocks.
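
A small Python sketch of this softened rule, under the same illustrative assumptions as before: starting from the dead-end, the program could walk back along the path and offer each position to the deadlock validator, stopping as soon as a candidate fails.

def harvest_doomed_path(path_positions, deadlock_sets, max_boxes=6):
    """'path_positions' is assumed to run from the start position to the
    dead-end; typically only the tail of the path is doomed."""
    for position in reversed(path_positions):
        corral = position.corral_boxes                 # assumed attribute
        if corral is None or len(corral) > max_boxes:
            break
        new_set = validate_deadlock_candidate(corral)  # assumed helper, as above
        if new_set is None:
            break      # earlier positions along the path are not provably doomed
        deadlock_sets.append(new_set)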

Here is a small example from the level "Sophia" by Lee J Haywood. First the level:

"Sophia", by Lee J Haywood
#######
#     #
#@$.# #
#*$  .#
# $$  #
# . . #
#######

The next figure depicts a 4-box deadlock:

#######
#_____#
#%__#_#
#_%___#
#_%___#
#_%___#
#######

The box at the bottom started painting itself into a corner at the time it reached the neighboring square:

#######
#_____#
#%__#_#
#_%___#
#_%___#
#__%__#
#######

These 2 positions, together with the one in "Original 9", are good examples of the type of deadlocks the new hindsight mechanism adds to the database.

As a final note, YASS doesn't attempt to "razor" the dynamically created deadlock sets because it's believed that the time cost isn't worth it. An example where razoring could improve the quality of a deadlock situation is this one from the level "Abigail" by Lee J Haywood:

########
#____%_#
#____#_#
#____%_#
#____%%#
#_#__#_#
#_____%#
########

2 boxes could be razored away, so only 3 boxes remain in the deadlock:

########
#____%_#
#____#_#
#____%_#
#_____%#
#_#__#_#
#______#
########

The hindsight approach to dynamic deadlock detection is, of course, not a panacea. It may even be partially redundant when a solver also uses the foresight approach, performing a corral-deadlock analysis before it expands a position. Even so, the appeal of the hindsight mechanism is that the cost is so low during the search.

ERRATA:
Please note that the description of the bookkeeping in the A* main loop is flawed or inaccurate. I wrote the text from the top of my head without consulting the code, and it turns out that I had forgotten some crucial issues.

The code in the program is correct, and an implementation of the mechanism should be based upon the code and not this text. The key issue is that what I presented as a "deadlocked!" yes/no flag in reality is a "deadlockedCount" variable, counting how many pushes along the path to the current position have triggered a dynamic deadlock.
