DHolland S3

From Sokoban Wiki

Revision as of 09:48, 3 April 2010 by Briandamgaard (Talk | contribs)
Jump to: navigation, search

Section 3 Algorithms

========

In this section I attempt to make precise (programmable) some of the ideas in the previous sections.

Suppose that there are N goals. Let P be a Sokoban position.

Isolated square (definition)


A square in P is "isolated" if the eight adjacent squares (horizontally, vertically and diagonally) are all empty.

A-empty (definition)


A square is P is "A-empty" (access empty) if it is either empty (possibly containing a goal but no box) or contains the man. Squares that are not A-empty contain either boxes or walls (possibly boxes on goals).

Man access corral (definition)


A dictionary with keys squares S belonging to P and values -1 (if the man has no access to S) or 1 (if the man does have access to S). Can be implemented iteratively or recursively, starting with the man's square and moving to adjacent squares horizontally and vertically. More generally, external corrals could enter into the dictionary with values of integers above 1 (one for each corral). The A-empty squares in a corral are called the "corral access".

BD: Implementation-wise, it wastes time doing it this way, where the calculation is required to initialize all squares on the board with "-1" (or better with "0") before the flood-filling calculation of the reachable squares. Instead, it's better to use timestamps. This requires that the area reserved for the calculation is global. Each invocation of the calculation increases the timestamp value, and then all the reachable squares are flagged with that value, thereby making the reachable squares distinguishable from the rest. That way, the squares in the global area only need to be initialized to "0" before the first calculation and when the timestamp overflows


One-box maze (definition)


The maze M was defined in Section 2 (delete all the boxes from P to get M). For a square S of P which does not contain a wall, the one-box maze for S is the Sokoban position OBM(S) obtained by adding to M a box on square S.

Dead squares (definition and algorithm)


These are the A-empty squares from which a box can never be pushed to a goal. They are found by iterating over A-empty squares S, forming the one-box maze OBM(S), and computing the push access PA(S,PS) for each push square PS of S in OBM(S). If none of the PA(S,PS) contain goals, S is a dead square. This algorithm and concept is well-known but is reproduced for familiarity of the concepts in other algorithms below.

Frozen boxes


A box B such that: (a) no pushes of B are possible due to blocked push squares (at least one of the left and right squares is blocked by a box or wall, and at least one of the up and down push squares is blocked) and (b) Each box adjacent to B on a push square such that the opposite push square is not a wall satisfies (a) and (b) in place of B is "frozen". Evidently, frozen boxes can never be pushed again and a frozen box not on a goal is a deadlock. This is the familiar, recursive definition (apologies if I oversimplified it), and generalizing is problemmatic for boxes with pushes since pushes which currently lead to deadlock might not do so after other boxes have moved. Even so, here is a generalization. A box B is "corral deadlock frozen" if either it is frozen or every push P of B satisfies either: (c) After deleting all known frozen or corral deadlock frozen boxes, except B, P is a detectable deadlock, or (d) B is a boundary box of an external corral C. With all other boxes outside C removed, the push square to perform P is man-accessible and after P, the position is a detectable deadlock. Moreover each of the other boundary boxes of C satisfies (d) in place of B. Note that the definition is iterative as we start with frozen boxes and may uncover some corral deadlock frozen boxes with each iteration, so we may need to check each box several times, each time the number of deadlock frozen boxes increases. As well as extending corral deadlock detection one ply, it also applies to frozen corrals with all boxes on goals, which is not a deadlock, but may enable detection of deadlocks as other boxes are pushed adjacent to the corral.

The definition of frozen extends to reverse mode in an obvious way. Pull frozen arrangements often have boxes two pull squares apart and are often a feature (obstacle) in stacking problems. Corral deadlock detection also extends to reverse mode in an obvious way, with the man inside the corral unable to pull his way out. With goals moved to the start position one must exclude from deadlock arrangements with all boxes on goals exactly as for the forward mode. Further note regarding frozen and corral deadlocks that straight tunnel sections can be treated as one square long to generalize.

Reverse exclusion zone and push-reachable zone (definition and algorithm)


In trying to solve a puzzle in reverse by pulling boxes, we must avoid dead squares, since a position reached in reverse with a box on a dead square can never be reached from the start position by pushing boxes (since we exclude the detectable deadlock of pushing a box onto a dead square). In other words, allowing this would prevent the forward and reverse searches from being able to meet. We can generalize this idea to the "reverse exclusion zone". In addition to the dead squares, the reverse exclusion zone is defined to include all A-empty squares S (in the start position SP) such that S does not belong to the SP "push-reachable zone". The algorithm for determining the push-reachable zone PRZ for SP is as follows. Let PRZ be a null list of squares initially. Iterate over the squares B in SP containing boxes. Let OBM(B) be the one-box maze for B. Compute the push access PA(B,PS) for each push square PS of B in OBM(B). For each square S2 in PA(B,PS), if S2 is not in PRZ add S2 to PRZ. Terminate returning PRZ when we've finished all the iterations. A typical example of a square in the reverse exclusion zone which is not a dead square: consider a T-intersection whose single exit (not the paired exits in a line) turns into a bent tunnel which has another exit. Assume the first square S in the bent tunnel adjacent to the T-intersection is A-empty in the start position SP, and that the T-intersection and adjacent section of the bent tunnel (including the corner) contain no goals. Because of the T-intersection, it is impossible to push a box onto S (a wall is on the push square). Yet, in a suitable OBM it is possible to pull a box from the T-intersection onto S, and because of the bent tunnel's other exit also possible to pull a box from elsewhere onto the T-intersection and change the man position ready to pull it into the bent tunnel. So S belongs to the reverse exclusion zone without (necessarily) being a dead square.

Goal trap (definition and algorithm)


A trivial example of a goal trap is a goal in the corner of a room: once a box is pushed onto the goal it is frozen ("trapped"). The notion of goal trap generalizes this idea to the case where the box is not frozen, but cannot reach all of the push-reachable zone PRZ for the start position SP. To determine the goal traps, iterate over goals G in SP. Form the push access PA(G,PS) for each push square PS of G with a push from G in the one-box maze OBM(G). Merge these push-accesses for G into a list GT(G) (the goal trap). Whenever GT(G) is just PRZ it conveys no further information. However, if GT(G) contains fewer than N goals (say N2 goals), this means that any position with more than N2 boxes on squares in GT(G) is a deadlock, since only N2 of these boxes can ever be stacked (on the N2 goals in GT(G)); the remaining extra boxes cannot reach any of the remaining goals since they can only reach squares in GT(G). A typical and simple example would be a goal on the wall of a room where that wall contains no exits and meets directly with two corners of the room (so the whole wall is entirely horizontal or entirely vertical). It is okay to push one box onto this wall, as it could later be stacked on the goal. But it is no good to push two boxes onto this wall, since only one can be stacked on the wall's goal, and the other is then stranded in the goal trap and cannot reach any of the other goals in the puzzle. Thus, the numerical information conveyed by the goal traps is the number of goals they contain, and the implied restriction on how many boxes can be in the goal trap at any time without causing detectable deadlock. Note that corner goals have empty goal trap, conveying no information, but goal traps in general can exist with higher capacity but without containing all N goals.

The goal traps may also convey restrictions for the reverse search. If there are NB (more than zero) boxes inside a goal trap in the start position SP, the reverse search cannot be allowed to remove from the goal trap so many boxes that there are fewer than NB boxes in the trap, otherwise the forward and reverse searches cannot meet. As an example, with the goal on a wall as described above, if the start position has a box on the wall the reverse search cannot be allowed to remove all boxes from the wall.

Reversible regions, sources and sinks


The notion of a directed graph (digraph) of push-reversible regions, with numerical restrictions on the number of boxes within each region, generalizes goal traps. The push-reversible regions are got by restricting push access in one-box mazes to squares to those with a push route back to the starting square. This provides non-intersecting reversible regions which contain all A-empty squares other than dead squares. The digraph is obtained by having a vertex for each reversible region and an arrow from region A to region B if and only if there is a push-route in a one-box maze from (any square of, taken to be the first by reversibility) A to B. A "sink" in a digraph is a vertex with no outgoing arrows (a "source" has no incoming arrows). Note that all sink regions must contain goals to avoid being dead squares. Thus if a sink region contains M goals, any position with more than M boxes in that region is a deadlock, generalizing the forward restriction in goal traps. More generally, for each sink S, count up the goals G in regions which have a route following the arrows in the digraph (hence a push route) to S (including S). Count up the boxes B in these regions. If B is less than G (as could happen if boxes are pushed to a region with no push route to S) the position is a deadlock. It may not be a deadlock if B exceeds G, provided G is less than the total # of goals N, because the extra B-G boxes might be pushed to other goal regions with no push route to S. Similarly, for a source S, if it has more goals than boxes the position is deadlocked. Perhaps the reverse search can have generalized restrictions - not sure what they should be.

These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later.

Note for conventional searches and Sokoban programs


Reverse exclusion and goal traps (perhaps even static reversible regions) should be implemented for offline deadlock detection as an aid to the human solver; whether goal traps should be included in a conventional computer solver (reverse exclusion certainly should be included in the reverse search) depends on the cost-benefit analysis for the solver - a judgement of how much the overhead of checking every position against the goal trap restrictions will slow the search. In my scheme of a non-exhaustive search, all three concepts would definitely be included because even lengthy deadlock detection can be valuable in a limited search; with some memory overhead (storing box locations within traps and reversible regions for each position and updating this data when boxes move) goal trap restriction can be implemented at relatively low time cost during search, though reversible region restrictions would require an offline computation of push routes between all regions that support them so the restrictions could be implemented at relatively low time cost during search.

Depth Corral (definition)


A dictionary with keys squares S belonging to P and values non-negative integers indicating the access depth. Unassigned squares have value -1. Implemented in following algorithm. More generally, called an "access corral" if the values have some other meaning (cf. RFS below).

IADOL (Iterative Access Depth Ordered List) algorithm


Parameters: P, dead squares DS in P, Man access corral M in P, depth D, depth corral DC, DL (depth list of squares belonging to M), TL (total depth ordered list of squares). Implementation: Null list of squares SL. Iterate over each square S in DL. For each square that is adjacent horizontally or vertically to S, and belongs to M, does not belong to DS, and whose depth in DC is unassigned, add that square to SL and to (the end of) DL. If SL is still null, terminate returning TL, DC and D. Otherwise recurse with depth D+1 and DL replaced with SL. Usage and interpretation: Initially called with DC values all unassigned, D zero and DL a list of boxes (intended to be boxes on goals in a solved position which can be pulled from the man's corral). Each square in DL is given depth 0 (value 0 in DC). On termination D is the largest depth found and TL a list of squares in M in order of increasing depth. The depth is interpreted as distance in man moves from all the boxes in the initial DL. Idea: aim to pull the boxes out as far away from the solved position as we can, avoiding dead squares; start by depth ordering empty squares.

CA (corral access for circuits) (definition of circuit and algorithm to compute the # of circuits and circuit access) algorithm


This algorithm also computes connected occupied regions with access lists.

Compute access corrals in a dictionary D for a Sokoban position P. Let Q be a copy of P. Let NCR be the number of connected regions of occupied squares, initially zero. Let NC be the number of circuits, initially zero. Let NOL be the list of occuped squares, initially null. Iterate over squares S of P, skipping those that are A-empty or already in NOL. When an occupied square S is found, add one to NCR, add S to NOL and find the connected occupied squares (recursively or iteratively, spreading out horizontally, vertically and diagonally from the initial occupied square to adjacent occupied squares) not already in NOL, adding each square to NOL and marking the square in Q with the number NCR and adding these squares to a list CRL(NCR) (connected region list for region # NCR). Set NAC (number of access corrals) to zero and a list CRAC(NCR) to null (the connected region access corrals for region # NCR). Now iterate over CRL(NCR) looking for adjacent A-empty squares (horizontally and vertically only). Use D to identify their access corral number. If not in CRAC(NCR) add it. For completeness we can complete CRAC(NCR) but for the purpose of finding circuits (defined as those with len(CRAC(NCR))=1) we can terminate when len(CRAC(NCR))>1. For each circuit, we add one to NC and set CA(NC) (corral access for circuit # NC) to be the single access number in CRAC(NCR). By this means we identify the access circuits. We are usually interested in those for a fixed access number (usually the man-access corral) so we can iterate over NCR to find these.


RFS (Reverse free space) algorithm


Produces a (heuristically) decreasing accessibility-ordered, then lexicographically decreasing depth-ordered, arrangement of extra boxes BL in free space (no increase to the number of corrals).

Step 1


Let DL be a list of squares in P containing boxes on goals which can be pulled at least one square (without forward-relative detectable deadlock). Call IADOL with D=0, unassigned DC, TL a copy of DL. Returns a maximal depth D, increasing depth-ordered list TL and depth corral DC (which for each key S in TL has value the depth of S). Let Q be a copy of P which may vary. Compute the pull access squares for all the boxes in DL. replace TL with an equivalently-ordered sublist containing just those squares belonging to at least one of the pull accesses for the boxes in DL. Idea: aim to pull out the DL boxes as far as we can, but the squares we fix to pull them out to must be pull-accessible, to at least one box.

Step 2


Start with a list of A-empty squares L, and null modified total list MTL of squares (from TL). Iterate through boxes B in DL in a fixed order (assumed not to matter). Compute the pull access for B in Q (assuming Q is not P, otherwise we already did it). Iterate through squares S in TL in reverse order (starting at maximal depth). If S is not in B, skip to next S. Otherwise, if S is not isolated in Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, each of which is surrounded by empty space, whose squares belong to L. Idea: we've found a depth-ordered arrangement of isolated boxes in free space; as boxes are isolated we can't have increased the number of corrals, though we'll have increased the number of circuits by len(L). This arrangement is particularly accessible as the boxes are isolated.

Step 3


Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock (disregarding the fact that there may be more boxes than goals!) or add to the number of corrals rel. to P, skip to next S. Otherwise, if adding a box to S in Q would decrease the number of circuits rel. to the previous value of Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 2. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals or decrease the number of circuits. The arrangement is depth-ordered and only marginally less accessible than the one at Step 2; we may have adjacent boxes or boxes adjacent to walls.

Step 4


Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L plus the previous BL. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock or add to the number of corrals rel. to P, skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 3. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals. The arrangement is depth-ordered and possibly less accessible than the one at Step 2; we may have reduced the number of circuits (perhaps to zero). The algorithm terminates after this step, with the last L added to BL.

Step 5


Repeat Steps 1-4 with external corrals, if they exist, moving the man into each external corral in turn, and calculating free space relative to pullable boxes in each corral.

The next stage for reverse spreadout is to find an order of accessible pull routes pulling out the DL boxes to (some of) the assigned free space squares in RFS. In other words, a reverse variation moving the DL boxes into free space. There may be more or less free space squares from RFS than there are boxes in DL. If more, we bias towards the first len(DL) of the free squares (those with the highest depth and, heuristically, accessibility), holding the remaining free squares in reserve for shuffling manoeuvres. If less, we obviously can only pull out len(BL) boxes at best. However, in the process of doing this, we may open up access to boxes which may have more access to free space in the new arrangement. For example, there might be two corrals in a solved position, and let's suppose all the goals in one room. From one corral (assumed to be one for which the puzzle is reverse soluble) we may not have free space room in the corral to pull out all the N boxes. However, we may have free space enough to pull out enough boxes to make the puzzle have a single corral, which then increases the free space access to the squares which were initially in the second corral, allowing the puzzle to be solved.

RPR (Reverse pull routes, also known as reverse spreadout) algorithm


This is the main algorithm intended for use in the reverse search, for pulling boxes out to a "spread out" position with a single corral and for establishing a stacking order.

Step 1


Start with null list of pull routes V. null list SBL (static box list) and a solved position P (copied to Q which may vary) with pullable boxes PB and apply RFS to get an ordered free square arrangement BL within the given corral. In any (arbitrary) order iterate over the pullable boxes B in PB (note that PB will dynamically vary rather than being fixed; it may increase though never decrease in size). Look for the first square S in BL which belongs to the pull access for B. If there is no such square S, despite there being empty squares in BL, call algorithm RSB (below) to see if we can rearrange previous boxes in free space to find an S that works. If still no S works, skip to the next B, adding B to list SBL. Otherwise compute a pull route to S and add it to V.

Step 2


When we find a pull route to add to V, and change Q by moving the box B along this pull route to free space, we must check whether any new boxes are now pullable (in addition to those in PB) as a result of this move. If so, we add this list of new pullable boxes onto the front of PB (heuristically considering it more important to pull out boxes that were previously inaccessible). We check whether the number of corrals in Q has decreased compared to P. If so, we must first check if there is only one corral and all boxes not moved yet are not (push-)frozen on goals. If so, we terminate the algorithm (returning True for success, V and SBL.). Otherwise, we skip to the next B in Step 1.

Step 3


Eventually we reach the end of the (dynamically varying) PB. So every box has either been moved to a free square or, with no such move being found, had its square added to SBL. Since we didn't already terminate, we must have failed (either there is more than one corral or some boxes are (push-)frozen on goals). So we terminate the algorithm and return False for failure, along with V and SBL to see how far we got. In future, this output could be used to feed other algorithms which try to learn from the partial success achieved (measured in V) and get round the problems represented by SBL. For example, in the two-corral puzzle mentioned above, if free space in the initial corral is limited, it might be necessary to leave unmoved several boxes which could be pulled, and prioritise pulling out previously inaccessible boxes in some special order, so as to fill up the available free space in the first corral at just the moment when access to the second corral is granted with the last box pulled out. A tree search algorithm, scoring previously inaccessible boxes higher, might work - details later.

Note: more generally, we could signal different degrees of success and failure. For example, if we found a position with a single corral, but some of the boxes BL are still push-frozen on goals, (yet some progress has been made in pulling out boxes and/or reducing corrals) we could set this as the aim of the forward search, with the proviso that we may finally stack any box on the goals in BL at any point, but no other boxes are allowed to get frozen on goals not in BL. This generalises the notion of deadlock to take into account stacking order restrictions. The forward search then aims for a position with one corral the only frozen goals in which belong to BL. There would then be an intermediate stage matching up the forward and reverse search positions, either by stacking more boxes onto goals in BL or moving a box from a square in the forward search to one of the squares having boxes in the reverse search. See later for algorithmic details.

RSB (Reverse shuffle boxes) algorithm


We are given a position P, with given man corral, a free space arrangement BL, some of whose squares FL are already occupied by boxes, and a box B which currently can't be pulled to a square in BL despite there being empty squares in BL. Our aim is to rearrange the boxes in FL to squares in BL so that B can be pulled onto an empty square in BL. If so we return True for success and the variation rearranging the boxes and moving B.

Step 1


Iterate over squares S in FL (currently covered by boxes in P). Copy P to Q, and delete the box on S. Compute the pull access for B within the given man corral in Q. If it now contains an empty square in BL (including the newly-empty one S), make a pull route to the first accessible square in BL, call it V, and go to step 3.

Step 2


Skip to the next S. If we run through all S in FL, terminate returning False for failure and a null variation.

Step 3


Make a null list of pull routes W. Iterate over empty squares E in BL other than the destination S2 of V. Let R be a copy of P. Delete from R all the boxes in FL except for S2, where we add a box. If there is a pull route PR from S2 to E in R, make a list L of squares in FL which intersect with PR (here meaning just the squares the box is pulled over, ignoring the man moves), in order of increasing size of the variation to that square in PR, and go to step 4. Otherwise go to step 2.

Step 4


Let P2 be a copy of P. Iterate through squares SB in L in reverse order; for the first one, try to find a pull route to E in P2. If not, go to step 2. If successful, add this pull route to the front of W, and move the box in P2 from the start to the end of this pull route. Now replace E by SB and skip to the next SB in L in reverse order. If we were successful at every SB in L (otherwise must have returned to step 2) our W contains a sequence of pull routes shuffling the boxes in L to free space. Add the pull route V to the front of W. Notice that the position of P2 and P match in terms of V applied to P giving the (last) position P2. Terminate returning True for success and W.

Note: RPR is intended as a quick-and-dirty algorithm which will terminate after a few seconds with modest memory overhead for a medium (and even quite large) puzzle. It finds the first free space arrangement it can (using a distance heuristic, doesn't look for alternatives), moves boxes only by single-box pull routes direct from goal to free space (no intermediate stages and, except for shuffling, the boxes on free space don't move again) and pulls out boxes in an arbitrary and fixed order (doesn't look for alternative orders). It does some shuffling when no direct pull route is found, moving around the free space boxes it has already pulled out to make space for another box, but this shuffling is also limited to single-box pull routes between free space squares and an arbitrary fixed order of boxes and squares found from which ones intersect a (phantom) pull route with free space boxes deleted. It does try potentially several phantom pull routes, but sequentially rather than in a tree, and keeps only the first one for which shuffling was successful. RPR calls RFS and RSB. RFS also calls IADOL.

Such an algorithm will have extremely variable results. It is intended as the first stage of a reverse search, to identify "easy progress" towards the ideal reverse spreadout position with a single corral and no push-frozen boxes on goals. I envisage it being followed up by more lengthy searches to address its shortcomings. For example, a tree search which allows the SBL boxes to move by individual pulls one square at a time, (or not at all except via pull routes direct to free squares if this can reduce len(SBL); a subsequent search could allow boxes in SBL to move by individual pulls once we've minimized len(SBL)) other boxes to move only by single-box pull routes direct to the free space identified by RPR, (only to the first accessible square in the ordered list) and then not to move again with the exception of RPR shuffling. Such a tree search could terminate after a fixed overhead of time and memory. The idea is to make "easy progress" in between more exhaustive searches at the sticking points.

More precisely (sketch of algorithm RPRT, RPR with tree search): (a) RPR produces variations moving goal boxes to free space, possibly via shuffling, returning a list SBL of unmoved boxes. (b) tree search using RPR pull routes direct to free space or via shuffling scores a positive constant times the sum over boxes on free space multiplied by pull layer depth of the box moved there (discounting intermediate shuffling variations - we'd need to keep track of where each box came from by numbering boxes within layers) (pull layer depth is the # of layers of boxes initially pullable from the given man access which have to be iteratively deleted to give access to successive layers of pullable boxes.) SBL boxes which are push-frozen score minus pull layer depth times a positive constant. External corrals score a large negative constant. This may decrease the # of push-frozen boxes in SBL and the number of external corrals. (c) tree search as in (b) but remaining push-frozen SBL boxes (only) get single pulls. This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. (d) tree search as in (c) but all remaining SBL boxes (only) get single pulls (in case the non-frozen SBL boxes need to be manoeuvred out of the way with single pulls to permit more frozen SBL boxes to reach free space). This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. Note that the rationale behind this sequence is to limit exhaustive search (single pulls) by using the smallest # of push-frozen boxes in SBL obtained in the previous steps.

Note that if RPR combined with the tree search(es) is successful, it provides a stacking ordering (actually, with variations and shuffling, even more information than this). This ordering information can then be used to generalise deadlock detection in the forward search (see the previous section) and prune stacking moves that don't correspond to the given order. Even when RPR-plus-trees fails, it may provide a partial stacking order: the goals in SBL can be stacked in any order first, and then the remaining goals in the determined fixed order.

Forward search


Analogous algorithms to RPR and tree searches can be applied to the forward search. Take the pushable boxes with an accessible push square in SP to be the push analogue of PB. The initial aim of forward search is to open up the position to a single corral. The tree search score heuristic could value initially inaccessible boxes higher, as with the reverse search. A large fixed score penalty for each external corral would favour the aim. The way corrals are divided up into free space may be a little different since we would have the free space in a corral from the corral's pushable boxes accessible within the corral, but we'd also have free space from pushing in boundary boxes from a neighbouring corral (the latter combined for all neighbours is the usual reverse free space for the corral).

Relocation subtasks - reversible region digraphs


Assuming the reverse search succeeds in producing a single corral and a complete stacking order, and the forward search succeeds in producing a single corral, we next apply the dynamic reversible digraphs for each stacking order position which results in frozen (or possibly corral deadlock frozen) boxes on goals. We use the reverse variations to identify how many boxes are in each dynamic reversible region at each stage. We then want to match up the forward and reverse variations as far as these region numbers go. We can use free space, RPR and tree search, with fixed score penalties within each region with box number differing from the aim. We could try both forward and reverse relocation in stages matching up the most progress in either direction.

Relocation subtasks - matching forward and reverse positions


Assuming the dynamic reversible relocation is successful at the stage comparing reverse and forward single corral positions before (final) stacking takes place, the next stage is to exactly match the forward and reverse positions. This could be done with the same relocation algorithms, with box positions for the other search ("box aims") added to free space, and score bonuses for positions with boxes on box aims. Note: the rationale behind separating relocation into two stages is that it is easier to match any arrangement of a # of boxes within a region than to match exact positions, but once the regions have the right #s of boxes it may be possible to rearrange them mainly within regions for an exact match - again we're trying to limit the amount of exhaustive search (single pulls). It is also useful to see how far we got at each stage in case of only partial success - further heuristics and algorithms could be tried at each stage.

More on Relocation subtasks


I envisage relocation being carried out with no external corrals for ease of access. However, the possible existence of circuits could make relocation even easier. I don't have a precise alternative algorithm currently making use of circuits. Relocation can still be carried out with external corrals, but more access restrictions make it less likely to succeed.

One-way doors (definition)


A "static one-way door" (s1-door) in the maze M of a puzzle P consists of empty squares C, O, E such that: (a) O and E are man-adjacent to C at right angles (b) with a box added to M at C there is a push to O (c) with a box added to M at O there is a push to C (reversibility) (d) with a box added to M at E there is a push from C Note that (b), (c) and (d) require further empty squares either side of O (OP), C (CP) and E (EP) with all six of these squares distinct. (e) none of C, O, E, EP are dead squares in P (f) with a box added to M at C and at O, and the man at CP, the man access does not include OP or E. With the man moved to OP, the man access includes E (g) with a box added to M at C and at OP, and the man at CP, the man access includes O but not E (h) with a box added to M at C and at E, and the man at CP, the man access includes O and EP

Note that (h) is possibly redundant, implied by (a)-(g), but included for clarity. The case of a two by three rectangular room with perpendicular exits, suitably arranged, gives rise to a s1-door with E the wide side exit and OP the other exit. C stands for closed, O for open and E for exit. Perhaps there should be a further condition to indicate that trying to traverse the s1-door the wrong way gives deadlock. The problem in general is such a deadlock might not be detected - though it might in the case of 2x3 rectangular rooms.

One-way system (definition)


A "simple static one-way system" (s1-way) W is an ordered list of at least one s1-door(s) such that: (a) if s1-door D'=(C',O',E',CP',OP',EP') follows immediately after s1-door D=(C,O,E,CP,OP,EP) in W, with boxes added to M at O, O' and C and the man at CP, the man access includes OP' but not OP or E', and there is no free space in the man's corral. (b) if D is the first s1-door in W and D' the last then with boxes added to M at each of the s1-door O (open) squares and the man at OP, the man access includes C'. (c) if s1-door D'=(C',O',E',CP',OP',EP') follows immediately after s1-door D=(C,O,E,CP,OP,EP) in W, with boxes added to M at O and O' and the man at CP, the man access includes OP' but not OP or E'. Note that (c) is usually redundant, but included in case, for example, C is a T-intersection. Note that with boxes as in (b) there are len(W) - 1 external corrals inside W and the man can traverse W by pushing each box from O to C (open to closed) and back again in turn, leaving W in the same state. It is possible an s1-way forms a loop, in which case an extra condition (a) applies where we consider the first s1 door to follow the last, but in most cases there will be a well-defined start and end.

Provided the traversing variation is left available, one can for some purposes treat the boxes in W as if they weren't there, leaving a single corral, as regards man access in a puzzle including W with boxes as in (b). See "one-way access", below for a precise description. Note: a s1-way might have some of the O-boxes away from the O-squares (on the C squares for instance). To make things simple, so that W doesn't change state when traversed, we'd first traverse W pushing the O-boxes onto O-squares if this is easy - if not W is not correctly occupied for traversal, and if not deadlocked we'd first have to find variations to correctly occupy W. If W is completely empty, variations to fill it would normally be easy, pushing the last O-box in first, and then using the W traversal squares to push in the other O-boxes in reverse order, only provided the first OP square is accessible. But this requires that W is push-traversable in entirety, which may not be true in general. The definition only requires man-traversability and reversible pushes between the adjacent O and C squares. And W might be partially, excessively or incorrectly occupied before we get a chance to use it, and even in the static case with 2x3 rooms the variations to correctly occupy W may not be obvious, due to the possibility of creating a deadlock due to W-external boxes if trying to push a box out of W. Note: There may be more general one-way systems or traversable corrals utilising more than a single reversible push of one box to move between corrals. There are, for instance, "valves" which function something like s1-doors but require two boxes to move. There can also be traversable corrals which have a different state (due to boxes not moving back to their original positions) after traversal. There can be intertwined (overlapping) one-way systems. None of this complexity is being addressed here. A computer solution of medium to large puzzles featuring even standard 2x3 room 1-ways significantly, many of which will have too large a tree for conventional solvers, I consider would be a significant advance.

Double-box s1-door


If a position P has two boxes occupying an s1-door (such as one on O or C and one on E) it can't be used for one way traversal until one box (like the one on E) is exited. If the s1-door is push-traversable even with two boxes, by push-reversing one box and pushing out the other to free space, simple variations should normally achieve this converting it to an occupied s1-door. If the s1-door is immediately followed by a vacant (in P) s1-door so the two form an s1-way, using the second O-square as free space makes the s1-way occupied and traversable, thus increasing access (see one-way access, below) without increasing free space occupancy, so it would perhaps be worth doing this in RPR(T). One-way access (definition and algorithm)


If WL is a list of s1-doors (belonging to one or more s1-ways) with boxes on the O squares, make a dictionary D with keys OP squares and values EP squares. When calculating man access to push a box outside the s1-doors around a push route, when a square in the D keys (which can be more efficiently checked with another dictionary) is accessed, the value is used to add the EP square to man access - this would be accessible by traversing W. The extended access is "one-way access" (1-access) relative to WL. The access can be calculated twice, first without the WL "teleports" to avoid unnecessary use of WL. The relevant variations to traverse the s1-doors are also stored ready to insert into push routes when needed. Note that it's not important which s1-way we use to achieve access (use an optimiser to find the minimal choice).

Note: s1-doors and s1-ways obviously apply to the reverse mode also, with the order in W reversed and pulls replaced by pushes. Note: normally we'd only use the beginning OP to "teleport" the man to the end OP, but in some more general s1-ways than those with 3x2 rooms connected directly by straight tunnels there might be other boxes than the O ones within W so we might want to move them. It is also possible the definition could be extended to dynamic 1-ways (where some of the access obstacles are boxes rather than walls, though it could be problemmatic to work out the level of occupancy by boxes of the 1-way) where other boxes than the O ones within W being moved might change access restrictions in W and it might be possible to move such boxes between different s1-door corrals. Note: it might seem redundant to use one-way access from a position where W is not "end-blocked" (the first and last OP squares are both accessible). However, consider pushing a box outside W around a push route; the box might end-block W en route, and access to change push direction might then require traversing W. This could even occur with W only containing a single s1-door, and thus not contributing external corrals. In this special case using W actually increases access for push (or pull) routes in the single corral situation envisaged for RPR(T).

Assuming that trying to push out O-boxes from W in a puzzle P directly produces deadlock, a human solver can use the 1-way access to rearrange other boxes (temporarily end-blocking W and traversing W to regain it) until clearing W becomes possible. Thus W functions as an adjunct to free space in terms of computer solving, just as arrangements with doors or multiple non-deadlocked corrals may do. I think W traversals, along with rearranging other boxes and temporarily blocking access to the end of W generalises shuffling, though I haven't got a precise algorithm yet. More work is needed on the details of s1-ways before they can be added to SOLVE. At present they could only be used in the case of single s1-doors or push-traversable s1-ways which are easy to occupy or vacate as free space O-squares in RPR(T). Even then it isn't clear how to modify free space from RFS to take account of detected s1-ways; adding the detected O-squares to free space before applying RFS is one option, if we keep the true free space (within a single corral) separate. We'd have to look at the free space arrangements with and without occupied O-squares to be able to still apply the single-corral condition. Variations to occupy and vacate s1-ways where possible with methods similar to RPR(T) are needed. A restriction to reversibly push-traversable s1-ways would be a start so that only W-exiting deadlocks are a problem: which can be solved with a free space external to W and reversibly push-routable to an end of W. This can generalize to push-traversable s1-ways, with a free space outside W pushable to the first O-square and one pushable from the last O-square.

Dynamic 1-doors, 1-ways and 1-access


We can generalize s1-doors to the dynamic case (d1-doors: arbitrary positions and some access restrictions due to boxes other than O-boxes) by deleting (but keeping track of) boxes on the squares C,O,CP,OP,E and EP before applying the s1-door conditions to the resulting position rather than the maze. Similarly with d1-ways (without additional deletions). We still won't detect all clogged-up d1-ways with extra boxes blocking the access between one d1-door and the next (deleting all boxes outside the two sets of six defining d1-squares won't work in general as we could be deleting access barriers required in the definition, but it will detect some more d1-ways) or even some d1-doors if extra boxes block the access requirements. Where an underlying d1-way is detected only with surplus boxes deleted, it creates a significant problem: how to remove the surplus boxes. Also there is the dilemma of whether it is worthwhile to do so since that would be occupying free space which is currently unused, for the marginal benefit of 1-access. Perhaps a straight generalization without box deletion is best so we can apply the methods previously described (specifically, 1-access for push routes) in the static case. Even so, detection of d1-ways (potentially with four complex checks for each non-wall square in the puzzle) could be a significant overhead we don't want at nodes in the RPR tree searches, so we might want to restrict it to the start or solved positions, if using it at all. Also, boxes determining dynamic access restrictions required for a d1-way could get moved during SOLVE without the program "knowing" - though that just makes things easier so perhaps it can be safely ignored.

More on s1-ways and free space


How can we detect and use s1-ways as an adjunct to free space?

First consider how to identify all s1-ways in the maze M. Loop through A-empty squares S in M not belonging to the Reverse exclusion zone (assuming we're aiming to use s1-ways in reverse mode from the solved position for clarity). Make the four possible checks for orientations of s1-door with C square at S and O square an adjacent pull square to S. That finds all s1-doors, from the definition. Add boxes at the O and E squares of all s1-doors and compute all man access corrals, and determine if there is a free space square in each corral. Applying condition (c) of s1-way definition we can determine if each man access corral connects adjacent s1-doors in an s1-way or not, and the order they come in. Putting together this info we find the s1-ways in M. We also can store the access squares trapped in corrals between succeeding s1-doors in the s1-ways.

Now suppose we have a solved position FP with given man corral and compute all man access corrals in P. In calculating free space within each corral, we normally look at A-empty squares S within merged pull access where addition of a box at S (making position P) isn't a detectable deadlock (ignoring the fact that there is one more box than goals) and, when computing all man access corrals within P, the total # of corrals in FP and P is the same.

To add s1-ways, apply the same conditions. So O-squares only get added if they are individually free squares in the usual sense. The change comes when we look for arrangements of many free squares in RFS. So we start with a position P1 with several free space squares occupied by boxes and are looking to add another. Firstly, the s1-door O squares are put at the front of the queue for new squares, so several of the free space boxes in P1 may be O-squares. We compute all man access corrals in P1 (MA(P1)). But we then compute 1-way access WA in the given man corral, using the s1-door "teleports" belonging to free space where they are applicable (meaning, where we can reversibly pull the box on the O square to the C square within man access starting at C). First we require no more 1-way man access corrals (relative to the given corral) than there were in FP. Second, from MA(P1) we move the man to each external corral EC such that the A-empty squares in EC belong to WA in turn. We then calculate 1-way access from there; the next condition is that this includes an A-empty square from the man corral (so we found our way back to the start).

Note that boxes on s1-door O-squares on the boundary between the given corral and an external corral may have to be treated differently. It's not enough to have one such to merge at the level of one-way access; a trip around s1-ways is needed to lead back to the given corral. Such a trip could involve s1-doors within and on the boundary of multiple corrals. So perhaps we should first calculate 1-way access, using all the detected s1-doors whereever they are, from each external corral adjacent to the given corral, seeing which lead back to the given corral to determine which corrals we can "1-way merge". Hopefully RPR(T) will result in boundary boxes being pulled to O-squares in free space (perhaps from adjacent corrals accessed by an sequence of s1-doors), setting up such mergers. So RPR(T) would need to check 1-way access from the end of its pull routes, in case these end in an external corral due to s1 "teleports", and in case the pull route destination is an s1-door O-square.

Future development and problems


The biggest weakness with RPR is probably the arrangements of free squares found by RFS. Better arrangements, packing more boxes into free space, may well exist than are found, and for more difficult puzzles they may be essential to the solution. I have no workable generally-applicable alternatives at present. The next problem is that RPR relies on free squares in the first place; more difficult puzzles may require navigable positions with multiple corrals. A step towards dealing with this extra complexity is to identify and use static one-way systems - those inherent in the arrangement of rooms in the maze, as discussed earlier.

This brings me to another problem with the RFS free arrangements: in the last stage, where circuits are removed, there is an awful lot of choice of ways to do this (much of which might be relatively unimportant in terms of the number of boxes packed into free space). However, some of the choices (due to convenient ordering) are likely to facilitate shuffling better than others, since the free space squares which cut down the circuits are introducing steadily more access restrictions. Again I have no workable alternatives at present, other than the obvious: restrict the number of candidates for free squares to one within straight tunnel sections.

Another way to introduce extra corrals is to generalize the notion of free squares to include "doors" for each extra corral. When checking a square for the free square list, usually we skip it if it increases the number of corrals. But we could check if it creates just one more corral and the square the box was pulled from initially still belongs to the man access, and we have access to the opposite pull square from the last one used in the pull route. We could allow only one such "door" per extra corral, or allow an unlimited number (perhaps setting up significant problems in later matching the reverse and forward search). The idea is that we block off an area by closing the "door", but it is possible to open the door by pulling in the opposite direction later to make the arrangement accessible again. Perhaps this idea should be combined with reversibility : we require the door box to be pullable back to the goal square it started from (at the time of setting up the door, and regardless of which pull square we use to return it beyond it being an accessible one).

A third way is to introduce some exhaustive searching for small rooms where we calculate all the different arrangements. We'd have to check the arrangements were achievable somehow, perhaps by successively peeling off "doors" until we had a single corral, and then put the "best" achievable arrangement (most boxes) into RPR.

A possible way to improve free space arrangements is via "deformation" of corrals. Consider boxes on the boundary of the given man corral with a push square in a neighbouring external corral. If pushing such a box away from the man corral is possible without increasing the # of corrals (or a detectable deadlock) we can do this to open up more space in the man corral. We can do this for all suitable boxes on the boundary. Of course, this extra space comes at the cost of restricting free space in neighbouring corrals. Suppose we've applied RPRT in the forward direction and only have partial success in occupying free space and reducing external corrals. Wherever we had the most success in occupying free space, we try deforming outward (pushing) those corral(s) at the cost of neighbours with less occupied free space. I don't have a detailed algorithm but envisage adding deformation pushes as an intermediate between push routes to free squares and exhaustive search. Hopefully we can add to free space in the successful corral(s) and again hopefully that will increase the occupied free space.

Another problem with the the RPR(T) free space is it is calculated at one time for a single position (any solved position in reverse and SP in forward). As variations are applied occupying some of these a priori free space squares, the position will change dramatically with respect to the SBL boxes, some of which may have moved by a sequence of single pulls. This may open up new dynamic free space squares not calculated a priori. For example, in a classic puzzle with many goals packed tightly into a single goal area, pulling out a number of boxes might open previously occupied space in the goal area, which is unnoticed by RPR(T), but might be needed for a solution. We could simply tack on a new free space calculation at the end of RPR(T) and run it again at the end of the previous variation to utilise the goal area space. That may work in some cases but doesn't address the issue of dynamic free space squares opening up fleetingly at any stage of the reverse variation, which may need to be used in more complex puzzles. Trying to deal with this in general by adding all dynamic free space squares in with separate subtasks risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Note that the search is in the following fixed sequence (sketch of algorithm SOLVE): (a) open up to single corral with both forward and reverse variations. (b) apply reverse stacking order to find dynamic reversible region box # restrictions and relocate to match #s. (c) relocate to match forward and reverse variations into a solution. Obviously not all puzzles will divide up into this sequence neatly. Apart from one-way systems requiring external corrals, in practice the puzzle may require partial opening up, relocation, closing up, stacking and other variations in arbitrary sequence. This risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Another problem is unmatchable forward and reverse variations (which may be related to more subtle dynamic restrictions than I have described). For example, as a human solver and setter I've often found forward and reverse variations which match exactly, except either the man is in the wrong corral or just one box is displaced one square, but requires a pull rather than a push in forward mode to match. Sometimes a little more work to the end of either forward or reverse variation produces a solution but often a completely new approach is necessary, suggesting the forward and reverse variations may be "unmatchable". Perhaps some form of machine learning is required to deal with more difficult puzzles, or perhaps progress can be made with a relatively simple parallel search of multiple subtasks with some connecting heuristics.

Personal tools