DHolland S3
From Sokoban Wiki
Puzl bustr (Talk | contribs) (→Section 3 Algorithms: d1-door definition final revision. Example explained) |
Puzl bustr (Talk | contribs) (→Reversible regions, sources and sinks: added subsection on Flow deadlock detection) |
||
Line 45: | Line 45: | ||
These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later. | These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later. | ||
+ | |||
+ | ==Flow deadlock detection== | ||
+ | Filling a goal trap with more boxes than goals is an overflow deadlock. If we consider the area of the puzzle outside the goal trap, this area then has fewer boxes than goals so is an underflow deadlock. The (static and dynamic) reversible region digraphs let us generalise these flow deadlocks (as far is possible using static and dynamic OBM access restrictions) and detect them in the search as follows. | ||
+ | |||
+ | For each reversible (for short, flow) region give the digraph vertex a weight equal to the number of boxes (in a search position) minus the number of goals. We can ignore the vertices with zero weight: there will be a flow deadlock if and only if there is no way to share out positive weights amongst negative weight vertices with paths in the digraph from the positive to negative weight vertices following the arrows, so that every vertex gets new weight zero. | ||
+ | |||
+ | Trivially, there will be no flow deadlock if every vertex has zero weight. We can also detect trivial flow deadlocks using disconnectedness, but we need to set up a new digraph which condenses paths to single edges as follows. | ||
+ | |||
+ | The new digraph DB has a vertex for each vertex of the old digraph D with either positive or negative weight (inheriting the weight). The only edges in DB start at positive weight vertices P and end at negative weight vertices N. There is an edge in DB from p to n if and only if there is a path in D from p to n following the arrows. Equivalently, for computational purposes, if there is a push-route in the one-box maze from a square of P to a square of N (in the dynamic case we add the goal frozen boxes to the maze and then replace them with walls to get a "dynamic maze"). There is a flow deadlock in D if and only if there is no way to share out positive weights amongst adjacent negative weight vertices in DB so that all vertices have new weight zero. | ||
+ | |||
+ | If DB has a positive weight vertex with no outgoing edges, that is a trivial overflow deadlock (a goal trap). Similarly, if DB has a negative weight vertex with no incoming edges, that is a trivial underflow deadlock. These are simple cases when (the underlying graph of) DB is disconnected. A search should at least detect these trivial deadlocks. | ||
+ | |||
+ | Treating DB as a graph (mark an edge from W to V whenever there is an arrow from V to W) we can determine the connected components (start with a vertex and add adjacent vertices, then those adjacent to those, and so on). This lets us determine more trivial flow deadlocks if, for any connected component, the sum of all weights is non-zero (as could happen for a soluble puzzle through goal freezing). More generally, DB has a flow deadlock if and only if at least one of its connected components does. | ||
+ | |||
+ | Other trivial flow deadlocks can be detected if a vertex of positive weight P has sum of negative adjacent weights Q where P+Q>0 (overflow) or the analogue for a vertex of negative weight (underflow). However in general it is possible for a flow deadlock in DB to exist without being trivial. | ||
+ | |||
+ | Look at this example puzzle. | ||
+ | |||
+ | <pre> | ||
+ | ##### | ||
+ | # @ ###### | ||
+ | # $ # | ||
+ | ## ##### # ##### | ||
+ | ##### # ## ### # | ||
+ | # #$ # # # ## | ||
+ | # $ $ .. $ # # | ||
+ | ## ##### ## ### .# | ||
+ | # #### # ## ## # | ||
+ | # . # #### | ||
+ | ############## . # | ||
+ | ##### | ||
+ | </pre> | ||
+ | It is easily solved. However, by freeze-stacking we can reach this position: | ||
+ | <pre> | ||
+ | ##### | ||
+ | # @ ###### | ||
+ | # $ # | ||
+ | ## ##### # ##### | ||
+ | ##### # ## ### # | ||
+ | # # # # # ## | ||
+ | # $ ** $ # # | ||
+ | ## ##### ## ### .# | ||
+ | # #### # ## ## # | ||
+ | # . # #### | ||
+ | ############## . # | ||
+ | ##### | ||
+ | </pre> | ||
+ | This is a nontrivial flow deadlock. The box to the right can reach any empty goal, in particular the rightmost two, but whichever of the rightmost two it stacks on the other one can't then be reached by a box (underflow). Contrariwise with the leftmost empty goal and the leftmost two boxes (overflow). DB in this case has six vertices: three of weight one (regions containing the boxes not on goals) and three of weight minus one (regions containing the empty goals). DB is connected, bipartite, all weights sum to zero, every box can reach some goal and every goal can be reached by some box. | ||
+ | |||
+ | Let us return to the general case. Excluding the trivial case when DB has no vertices (as D has only zero weights), we can suppose that DB is a nontrivial, connected bipartite digraph (hence the B in DB: the two parts being the positive and negative weight vertices), whose vertex weights are either positive integers or negative integers (no zero weights), whose directed edges go only from positive weight vertices to negative weight vertices and whose sum of vertex weights is zero. Also, every vertex of positive weight P in DB has the sum Q of adjacent vertex (negative) weights such that P+Q<=0, and analogously for negative weight vertices. | ||
+ | |||
+ | Now form a new digraph DB2 by replacing each vertex, of weight W, say, by abs(W) new vertices of weight sign(W) (we can use some of the box or goal squares in the flow region) with edges in DB2 inherited from DB in the obvious way. In DB2 there is a flow deadlock if and only if a maximal matching has cardinality less than the sum of positive weights, and we're back to bipartite matching (though possibly with reduced cardinality, i.e. whenever a region contains both boxes and goals). | ||
+ | |||
+ | Note that a conventional approach to flow deadlock detection (as I understand it! I may be in error) is to use maximal matching in bipartite graphs, for example with the Hopcroft–Karp algorithm or more likely an optimised variant. Here the graph has vertices the union of the box and goal squares, and an edge between a box and goal if and only if there is a push-route in the one-box maze from the box to the goal. This graph is bipartite and a maximal cardinality matching assigns as many boxes uniquely to goals they can reach as possible - there is a flow deadlock if and only if the cardinality is less than the # of goals n. It takes n^2 push-routes to set up the graph (which reduces to n or less if we store results of previously-calculated OBM push-routes assuming only one box is moved from one node to any successor) and Hopcroft–Karp takes O(n^2.5) worst case to find a maximal matching. Thus the matching algorithm is likely dominated by the push-route computations (even with n push-routes, it would be O(n^3) or greater assuming push access calculation dominates man access calculation which is O(n^2)). | ||
+ | |||
+ | Compare this with the flow region digraph method indicated above. For example, most puzzles in the XSokoban 90 have a single static flow region containing boxes and goals, which can be computed offline. For most positions even when there is frozen stacking the digraphs DB and DB2 are null and no further work has to be done to see these positions have no flow deadlock. One need only calculate DB2 and check for dynamic flow deadlock when the frozen-stacking arrangement changes, or the # of boxes within dynamic flow regions changes - otherwise one can look up a previously computed answer. Even puzzles with complex stacking, likely to produce flow deadlocks, will change flow region box numbers rarely. So the flow region digraph method has potential to be faster, though one needs to consider the complexity involved in computing dynamic flow regions and DB2 (likely to dominate the bipartite matching) and the memory used in storing answers. Roughly speaking, the approach using e.g. Hopcroft–Karp is treating the boxes and goals as points and calculating OBM paths between them, and the flow region digraph method works with whole regions and OBM paths between them. | ||
+ | |||
+ | At first glance, computing flow regions looks like O(n^4), with n^2 empty squares and n^2 man access/push access/reversible push access (though we can optimise by only computing regions containing boxes or goals and using empty space in the OBM to shortcut man access by sending the man around the perimeter of the box). To prevent this high cost torpedoing this approach, it may be necessary to precalculate flow regions where possible. For instance, a packing order/reverse search could let us precalculate the dynamic flow regions associated with the stack-freezing arrangements occurring in the (completely ordered) packing order relative to a forward search using this order. We'd have to then not expand out-of-order stack-freezing to avoid dynamic flow region calculation in the forward search. | ||
==Note for conventional searches and Sokoban programs== | ==Note for conventional searches and Sokoban programs== |
Revision as of 12:32, 5 May 2010
Section 3 Algorithms
In this section I attempt to make precise (programmable) some of the ideas in the previous sections.
Suppose that there are N goals. Let P be a Sokoban position.
Isolated square (definition)
A square in P is "isolated" if the eight adjacent squares (horizontally, vertically and diagonally) are all empty.
A-empty (definition)
A square is P is "A-empty" (access empty) if it is either empty (possibly containing a goal but no box) or contains the man. Squares that are not A-empty contain either boxes or walls (possibly boxes on goals).
Man access corral (definition)
A dictionary with keys squares S belonging to P and values -1 (if the man has no access to S) or 1 (if the man does have access to S). Can be implemented iteratively or recursively, starting with the man's square and moving to adjacent squares horizontally and vertically. More generally, external corrals could enter into the dictionary with values of integers above 1 (one for each corral). The A-empty squares in a corral are called the "corral access".
BD: Implementation-wise, it wastes time doing it this way, where the calculation is required to initialize all squares on the board with "-1" (or better with "0") before the flood-filling calculation of the reachable squares. Instead, it's better to use timestamps. This requires that the area reserved for the calculation is global. Each invocation of the calculation increases the timestamp value, and then all the reachable squares are flagged with that value, thereby making the reachable squares distinguishable from the rest. That way, the squares in the global area only need to be initialized to "0" before the first calculation and when the timestamp overflows
One-box maze (definition)
The maze M was defined in Section 2 (delete all the boxes from P to get M). For a square S of P which does not contain a wall, the one-box maze for S is the Sokoban position OBM(S) obtained by adding to M a box on square S.
Dead squares (definition and algorithm)
These are the A-empty squares from which a box can never be pushed to a goal. They are found by iterating over A-empty squares S, forming the one-box maze OBM(S), and computing the push access PA(S,PS) for each push square PS of S in OBM(S). If none of the PA(S,PS) contain goals, S is a dead square. This algorithm and concept is well-known but is reproduced for familiarity of the concepts in other algorithms below.
Frozen boxes
A box B such that: (a) no pushes of B are possible due to blocked push squares (at least one of the left and right squares is blocked by a box or wall, and at least one of the up and down push squares is blocked) and (b) Each box adjacent to B on a push square such that the opposite push square is not a wall satisfies (a) and (b) in place of B is "frozen". Evidently, frozen boxes can never be pushed again and a frozen box not on a goal is a deadlock. This is the familiar, recursive definition (apologies if I oversimplified it), and generalizing is problematic for boxes with pushes since pushes which currently lead to deadlock might not do so after other boxes have moved. Even so, here is a generalization. A box B is "corral deadlock frozen" if either it is frozen or every push P of B satisfies either: (c) After deleting all known frozen or corral deadlock frozen boxes, except B, P is a detectable deadlock, or (d) B is a boundary box of an external corral C. With all other boxes outside C removed, the push square to perform P is man-accessible and after P, the position is a detectable deadlock. Moreover each of the other boundary boxes of C satisfies (d) in place of B. Note that the definition is iterative as we start with frozen boxes and may uncover some corral deadlock frozen boxes with each iteration, so we may need to check each box several times, each time the number of deadlock frozen boxes increases. As well as extending corral deadlock detection one ply, it also applies to frozen corrals with all boxes on goals, which is not a deadlock, but may enable detection of deadlocks as other boxes are pushed adjacent to the corral.
The definition of frozen extends to reverse mode in an obvious way. Pull frozen arrangements often have boxes two pull squares apart and are often a feature (obstacle) in stacking problems. Corral deadlock detection also extends to reverse mode in an obvious way, with the man inside the corral unable to pull his way out. With goals moved to the start position one must exclude from deadlock arrangements with all boxes on goals exactly as for the forward mode. Further note regarding frozen and corral deadlocks that straight tunnel sections can be treated as one square long to generalize.
Reverse exclusion zone and push-reachable zone (definition and algorithm)
In trying to solve a puzzle in reverse by pulling boxes, we must avoid dead squares, since a position reached in reverse with a box on a dead square can never be reached from the start position by pushing boxes (since we exclude the detectable deadlock of pushing a box onto a dead square). In other words, allowing this would prevent the forward and reverse searches from being able to meet. We can generalize this idea to the "reverse exclusion zone". In addition to the dead squares, the reverse exclusion zone is defined to include all A-empty squares S (in the start position SP) such that S does not belong to the SP "push-reachable zone". The algorithm for determining the push-reachable zone PRZ for SP is as follows. Let PRZ be a null list of squares initially. Iterate over the squares B in SP containing boxes. Let OBM(B) be the one-box maze for B. Compute the push access PA(B,PS) for each push square PS of B in OBM(B). For each square S2 in PA(B,PS), if S2 is not in PRZ add S2 to PRZ. Terminate returning PRZ when we've finished all the iterations. A typical example of a square in the reverse exclusion zone which is not a dead square: consider a T-intersection whose single exit (not the paired exits in a line) turns into a bent tunnel which has another exit. Assume the first square S in the bent tunnel adjacent to the T-intersection is A-empty in the start position SP, and that the T-intersection and adjacent section of the bent tunnel (including the corner) contain no goals. Because of the T-intersection, it is impossible to push a box onto S (a wall is on the push square). Yet, in a suitable OBM it is possible to pull a box from the T-intersection onto S, and because of the bent tunnel's other exit also possible to pull a box from elsewhere onto the T-intersection and change the man position ready to pull it into the bent tunnel. So S belongs to the reverse exclusion zone without (necessarily) being a dead square.
Goal trap (definition and algorithm)
A trivial example of a goal trap is a goal in the corner of a room: once a box is pushed onto the goal it is frozen ("trapped"). The notion of goal trap generalizes this idea to the case where the box is not frozen, but cannot reach all of the push-reachable zone PRZ for the start position SP. To determine the goal traps, iterate over goals G in SP. Form the push access PA(G,PS) for each push square PS of G with a push from G in the one-box maze OBM(G). Merge these push-accesses for G into a list GT(G) (the goal trap). Whenever GT(G) is just PRZ it conveys no further information. However, if GT(G) contains fewer than N goals (say N2 goals), this means that any position with more than N2 boxes on squares in GT(G) is a deadlock, since only N2 of these boxes can ever be stacked (on the N2 goals in GT(G)); the remaining extra boxes cannot reach any of the remaining goals since they can only reach squares in GT(G). A typical and simple example would be a goal on the wall of a room where that wall contains no exits and meets directly with two corners of the room (so the whole wall is entirely horizontal or entirely vertical). It is okay to push one box onto this wall, as it could later be stacked on the goal. But it is no good to push two boxes onto this wall, since only one can be stacked on the wall's goal, and the other is then stranded in the goal trap and cannot reach any of the other goals in the puzzle. Thus, the numerical information conveyed by the goal traps is the number of goals they contain, and the implied restriction on how many boxes can be in the goal trap at any time without causing detectable deadlock. Note that corner goals have empty goal trap, conveying no information, but goal traps in general can exist with higher capacity but without containing all N goals.
The goal traps may also convey restrictions for the reverse search. If there are NB (more than zero) boxes inside a goal trap in the start position SP, the reverse search cannot be allowed to remove from the goal trap so many boxes that there are fewer than NB boxes in the trap, otherwise the forward and reverse searches cannot meet. As an example, with the goal on a wall as described above, if the start position has a box on the wall the reverse search cannot be allowed to remove all boxes from the wall.
Reversible regions, sources and sinks
The notion of a directed graph (digraph) of push-reversible regions, with numerical restrictions on the number of boxes within each region, generalizes goal traps. The push-reversible regions are got by restricting push access in one-box mazes to squares to those with a push route back to the starting square. This provides non-intersecting reversible regions which contain all A-empty squares other than dead squares. The digraph is obtained by having a vertex for each reversible region and an arrow from region A to region B if and only if there is a push-route in a one-box maze from (any square of, taken to be the first by reversibility) A to B. A "sink" in a digraph is a vertex with no outgoing arrows (a "source" has no incoming arrows). Note that all sink regions must contain goals to avoid being dead squares. Thus if a sink region contains M goals, any position with more than M boxes in that region is a deadlock, generalizing the forward restriction in goal traps. More generally, for each sink S, count up the goals G in regions which have a route following the arrows in the digraph (hence a push route) to S (including S). Count up the boxes B in these regions. If B is less than G (as could happen if boxes are pushed to a region with no push route to S) the position is a deadlock. It may not be a deadlock if B exceeds G, provided G is less than the total # of goals N, because the extra B-G boxes might be pushed to other goal regions with no push route to S. Similarly, for a source S, if it has more goals than boxes the position is deadlocked. Perhaps the reverse search can have generalized restrictions - not sure what they should be.
These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later.
Flow deadlock detection
Filling a goal trap with more boxes than goals is an overflow deadlock. If we consider the area of the puzzle outside the goal trap, this area then has fewer boxes than goals so is an underflow deadlock. The (static and dynamic) reversible region digraphs let us generalise these flow deadlocks (as far is possible using static and dynamic OBM access restrictions) and detect them in the search as follows.
For each reversible (for short, flow) region give the digraph vertex a weight equal to the number of boxes (in a search position) minus the number of goals. We can ignore the vertices with zero weight: there will be a flow deadlock if and only if there is no way to share out positive weights amongst negative weight vertices with paths in the digraph from the positive to negative weight vertices following the arrows, so that every vertex gets new weight zero.
Trivially, there will be no flow deadlock if every vertex has zero weight. We can also detect trivial flow deadlocks using disconnectedness, but we need to set up a new digraph which condenses paths to single edges as follows.
The new digraph DB has a vertex for each vertex of the old digraph D with either positive or negative weight (inheriting the weight). The only edges in DB start at positive weight vertices P and end at negative weight vertices N. There is an edge in DB from p to n if and only if there is a path in D from p to n following the arrows. Equivalently, for computational purposes, if there is a push-route in the one-box maze from a square of P to a square of N (in the dynamic case we add the goal frozen boxes to the maze and then replace them with walls to get a "dynamic maze"). There is a flow deadlock in D if and only if there is no way to share out positive weights amongst adjacent negative weight vertices in DB so that all vertices have new weight zero.
If DB has a positive weight vertex with no outgoing edges, that is a trivial overflow deadlock (a goal trap). Similarly, if DB has a negative weight vertex with no incoming edges, that is a trivial underflow deadlock. These are simple cases when (the underlying graph of) DB is disconnected. A search should at least detect these trivial deadlocks.
Treating DB as a graph (mark an edge from W to V whenever there is an arrow from V to W) we can determine the connected components (start with a vertex and add adjacent vertices, then those adjacent to those, and so on). This lets us determine more trivial flow deadlocks if, for any connected component, the sum of all weights is non-zero (as could happen for a soluble puzzle through goal freezing). More generally, DB has a flow deadlock if and only if at least one of its connected components does.
Other trivial flow deadlocks can be detected if a vertex of positive weight P has sum of negative adjacent weights Q where P+Q>0 (overflow) or the analogue for a vertex of negative weight (underflow). However in general it is possible for a flow deadlock in DB to exist without being trivial.
Look at this example puzzle.
##### # @ ###### # $ # ## ##### # ##### ##### # ## ### # # #$ # # # ## # $ $ .. $ # # ## ##### ## ### .# # #### # ## ## # # . # #### ############## . # #####
It is easily solved. However, by freeze-stacking we can reach this position:
##### # @ ###### # $ # ## ##### # ##### ##### # ## ### # # # # # # ## # $ ** $ # # ## ##### ## ### .# # #### # ## ## # # . # #### ############## . # #####
This is a nontrivial flow deadlock. The box to the right can reach any empty goal, in particular the rightmost two, but whichever of the rightmost two it stacks on the other one can't then be reached by a box (underflow). Contrariwise with the leftmost empty goal and the leftmost two boxes (overflow). DB in this case has six vertices: three of weight one (regions containing the boxes not on goals) and three of weight minus one (regions containing the empty goals). DB is connected, bipartite, all weights sum to zero, every box can reach some goal and every goal can be reached by some box.
Let us return to the general case. Excluding the trivial case when DB has no vertices (as D has only zero weights), we can suppose that DB is a nontrivial, connected bipartite digraph (hence the B in DB: the two parts being the positive and negative weight vertices), whose vertex weights are either positive integers or negative integers (no zero weights), whose directed edges go only from positive weight vertices to negative weight vertices and whose sum of vertex weights is zero. Also, every vertex of positive weight P in DB has the sum Q of adjacent vertex (negative) weights such that P+Q<=0, and analogously for negative weight vertices.
Now form a new digraph DB2 by replacing each vertex, of weight W, say, by abs(W) new vertices of weight sign(W) (we can use some of the box or goal squares in the flow region) with edges in DB2 inherited from DB in the obvious way. In DB2 there is a flow deadlock if and only if a maximal matching has cardinality less than the sum of positive weights, and we're back to bipartite matching (though possibly with reduced cardinality, i.e. whenever a region contains both boxes and goals).
Note that a conventional approach to flow deadlock detection (as I understand it! I may be in error) is to use maximal matching in bipartite graphs, for example with the Hopcroft–Karp algorithm or more likely an optimised variant. Here the graph has vertices the union of the box and goal squares, and an edge between a box and goal if and only if there is a push-route in the one-box maze from the box to the goal. This graph is bipartite and a maximal cardinality matching assigns as many boxes uniquely to goals they can reach as possible - there is a flow deadlock if and only if the cardinality is less than the # of goals n. It takes n^2 push-routes to set up the graph (which reduces to n or less if we store results of previously-calculated OBM push-routes assuming only one box is moved from one node to any successor) and Hopcroft–Karp takes O(n^2.5) worst case to find a maximal matching. Thus the matching algorithm is likely dominated by the push-route computations (even with n push-routes, it would be O(n^3) or greater assuming push access calculation dominates man access calculation which is O(n^2)).
Compare this with the flow region digraph method indicated above. For example, most puzzles in the XSokoban 90 have a single static flow region containing boxes and goals, which can be computed offline. For most positions even when there is frozen stacking the digraphs DB and DB2 are null and no further work has to be done to see these positions have no flow deadlock. One need only calculate DB2 and check for dynamic flow deadlock when the frozen-stacking arrangement changes, or the # of boxes within dynamic flow regions changes - otherwise one can look up a previously computed answer. Even puzzles with complex stacking, likely to produce flow deadlocks, will change flow region box numbers rarely. So the flow region digraph method has potential to be faster, though one needs to consider the complexity involved in computing dynamic flow regions and DB2 (likely to dominate the bipartite matching) and the memory used in storing answers. Roughly speaking, the approach using e.g. Hopcroft–Karp is treating the boxes and goals as points and calculating OBM paths between them, and the flow region digraph method works with whole regions and OBM paths between them.
At first glance, computing flow regions looks like O(n^4), with n^2 empty squares and n^2 man access/push access/reversible push access (though we can optimise by only computing regions containing boxes or goals and using empty space in the OBM to shortcut man access by sending the man around the perimeter of the box). To prevent this high cost torpedoing this approach, it may be necessary to precalculate flow regions where possible. For instance, a packing order/reverse search could let us precalculate the dynamic flow regions associated with the stack-freezing arrangements occurring in the (completely ordered) packing order relative to a forward search using this order. We'd have to then not expand out-of-order stack-freezing to avoid dynamic flow region calculation in the forward search.
Note for conventional searches and Sokoban programs
Reverse exclusion and goal traps (perhaps even static reversible regions) should be implemented for offline deadlock detection as an aid to the human solver; whether goal traps should be included in a conventional computer solver (reverse exclusion certainly should be included in the reverse search) depends on the cost-benefit analysis for the solver - a judgement of how much the overhead of checking every position against the goal trap restrictions will slow the search. In my scheme of a non-exhaustive search, all three concepts would definitely be included because even lengthy deadlock detection can be valuable in a limited search; with some memory overhead (storing box locations within traps and reversible regions for each position and updating this data when boxes move) goal trap restriction can be implemented at relatively low time cost during search, though reversible region restrictions would require an offline computation of push routes between all regions that support them so the restrictions could be implemented at relatively low time cost during search.
Depth Corral (definition)
A dictionary with keys squares S belonging to P and values non-negative integers indicating the access depth. Unassigned squares have value -1. Implemented in following algorithm. More generally, called an "access corral" if the values have some other meaning (cf. RFS below).
IADOL (Iterative Access Depth Ordered List) algorithm
Parameters: P, dead squares DS in P, Man access corral M in P, depth D, depth corral DC, DL (depth list of squares belonging to M), TL (total depth ordered list of squares). Implementation: Null list of squares SL. Iterate over each square S in DL. For each square that is adjacent horizontally or vertically to S, and belongs to M, does not belong to DS, and whose depth in DC is unassigned, add that square to SL and to (the end of) DL. If SL is still null, terminate returning TL, DC and D. Otherwise recurse with depth D+1 and DL replaced with SL. Usage and interpretation: Initially called with DC values all unassigned, D zero and DL a list of boxes (intended to be boxes on goals in a solved position which can be pulled from the man's corral). Each square in DL is given depth 0 (value 0 in DC). On termination D is the largest depth found and TL a list of squares in M in order of increasing depth. The depth is interpreted as distance in man moves from all the boxes in the initial DL. Idea: aim to pull the boxes out as far away from the solved position as we can, avoiding dead squares; start by depth ordering empty squares.
CA (corral access for circuits) (definition of circuit and algorithm to compute the # of circuits and circuit access) algorithm
This algorithm also computes connected occupied regions with access lists.
Compute access corrals in a dictionary D for a Sokoban position P. Let Q be a copy of P. Let NCR be the number of connected regions of occupied squares, initially zero. Let NC be the number of circuits, initially zero. Let NOL be the list of occuped squares, initially null. Iterate over squares S of P, skipping those that are A-empty or already in NOL. When an occupied square S is found, add one to NCR, add S to NOL and find the connected occupied squares (recursively or iteratively, spreading out horizontally, vertically and diagonally from the initial occupied square to adjacent occupied squares) not already in NOL, adding each square to NOL and marking the square in Q with the number NCR and adding these squares to a list CRL(NCR) (connected region list for region # NCR). Set NAC (number of access corrals) to zero and a list CRAC(NCR) to null (the connected region access corrals for region # NCR). Now iterate over CRL(NCR) looking for adjacent A-empty squares (horizontally and vertically only). Use D to identify their access corral number. If not in CRAC(NCR) add it. For completeness we can complete CRAC(NCR) but for the purpose of finding circuits (defined as those with len(CRAC(NCR))=1) we can terminate when len(CRAC(NCR))>1. For each circuit, we add one to NC and set CA(NC) (corral access for circuit # NC) to be the single access number in CRAC(NCR). By this means we identify the access circuits. We are usually interested in those for a fixed access number (usually the man-access corral) so we can iterate over NCR to find these.
RFS (Reverse free space) algorithm
Produces a (heuristically) decreasing accessibility-ordered, then lexicographically decreasing depth-ordered, arrangement of extra boxes BL in free space (no increase to the number of corrals).
Step 1
Let DL be a list of squares in P containing boxes on goals which can be pulled at least one square (without forward-relative detectable deadlock). Call IADOL with D=0, unassigned DC, TL a copy of DL. Returns a maximal depth D, increasing depth-ordered list TL and depth corral DC (which for each key S in TL has value the depth of S). Let Q be a copy of P which may vary. Compute the pull access squares for all the boxes in DL. replace TL with an equivalently-ordered sublist containing just those squares belonging to at least one of the pull accesses for the boxes in DL. Idea: aim to pull out the DL boxes as far as we can, but the squares we fix to pull them out to must be pull-accessible, to at least one box.
Step 2
Start with a list of A-empty squares L, and null modified total list MTL of squares (from TL). Iterate through boxes B in DL in a fixed order (assumed not to matter). Compute the pull access for B in Q (assuming Q is not P, otherwise we already did it). Iterate through squares S in TL in reverse order (starting at maximal depth). If S is not in B, skip to next S. Otherwise, if S is not isolated in Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, each of which is surrounded by empty space, whose squares belong to L. Idea: we've found a depth-ordered arrangement of isolated boxes in free space; as boxes are isolated we can't have increased the number of corrals, though we'll have increased the number of circuits by len(L). This arrangement is particularly accessible as the boxes are isolated.
Step 3
Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock (disregarding the fact that there may be more boxes than goals!) or add to the number of corrals rel. to P, skip to next S. Otherwise, if adding a box to S in Q would decrease the number of circuits rel. to the previous value of Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 2. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals or decrease the number of circuits. The arrangement is depth-ordered and only marginally less accessible than the one at Step 2; we may have adjacent boxes or boxes adjacent to walls.
Step 4
Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L plus the previous BL. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock or add to the number of corrals rel. to P, skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 3. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals. The arrangement is depth-ordered and possibly less accessible than the one at Step 2; we may have reduced the number of circuits (perhaps to zero). The algorithm terminates after this step, with the last L added to BL.
Step 5
Repeat Steps 1-4 with external corrals, if they exist, moving the man into each external corral in turn, and calculating free space relative to pullable boxes in each corral.
The next stage for reverse spreadout is to find an order of accessible pull routes pulling out the DL boxes to (some of) the assigned free space squares in RFS. In other words, a reverse variation moving the DL boxes into free space. There may be more or less free space squares from RFS than there are boxes in DL. If more, we bias towards the first len(DL) of the free squares (those with the highest depth and, heuristically, accessibility), holding the remaining free squares in reserve for shuffling manoeuvres. If less, we obviously can only pull out len(BL) boxes at best. However, in the process of doing this, we may open up access to boxes which may have more access to free space in the new arrangement. For example, there might be two corrals in a solved position, and let's suppose all the goals in one room. From one corral (assumed to be one for which the puzzle is reverse soluble) we may not have free space room in the corral to pull out all the N boxes. However, we may have free space enough to pull out enough boxes to make the puzzle have a single corral, which then increases the free space access to the squares which were initially in the second corral, allowing the puzzle to be solved.
RPR (Reverse pull routes, also known as reverse spreadout) algorithm
This is the main algorithm intended for use in the reverse search, for pulling boxes out to a "spread out" position with a single corral and for establishing a stacking order.
Step 1
Start with null list of pull routes V. null list SBL (static box list) and a solved position P (copied to Q which may vary) with pullable boxes PB and apply RFS to get an ordered free square arrangement BL within the given corral. In any (arbitrary) order iterate over the pullable boxes B in PB (note that PB will dynamically vary rather than being fixed; it may increase though never decrease in size). Look for the first square S in BL which belongs to the pull access for B. If there is no such square S, despite there being empty squares in BL, call algorithm RSB (below) to see if we can rearrange previous boxes in free space to find an S that works. If still no S works, skip to the next B, adding B to list SBL. Otherwise compute a pull route to S and add it to V.
Step 2
When we find a pull route to add to V, and change Q by moving the box B along this pull route to free space, we must check whether any new boxes are now pullable (in addition to those in PB) as a result of this move. If so, we add this list of new pullable boxes onto the front of PB (heuristically considering it more important to pull out boxes that were previously inaccessible). We check whether the number of corrals in Q has decreased compared to P. If so, we must first check if there is only one corral and all boxes not moved yet are not (push-)frozen on goals. If so, we terminate the algorithm (returning True for success, V and SBL.). Otherwise, we skip to the next B in Step 1.
Step 3
Eventually we reach the end of the (dynamically varying) PB. So every box has either been moved to a free square or, with no such move being found, had its square added to SBL. Since we didn't already terminate, we must have failed (either there is more than one corral or some boxes are (push-)frozen on goals). So we terminate the algorithm and return False for failure, along with V and SBL to see how far we got. In future, this output could be used to feed other algorithms which try to learn from the partial success achieved (measured in V) and get round the problems represented by SBL. For example, in the two-corral puzzle mentioned above, if free space in the initial corral is limited, it might be necessary to leave unmoved several boxes which could be pulled, and prioritise pulling out previously inaccessible boxes in some special order, so as to fill up the available free space in the first corral at just the moment when access to the second corral is granted with the last box pulled out. A tree search algorithm, scoring previously inaccessible boxes higher, might work - details later.
Note: more generally, we could signal different degrees of success and failure. For example, if we found a position with a single corral, but some of the boxes BL are still push-frozen on goals, (yet some progress has been made in pulling out boxes and/or reducing corrals) we could set this as the aim of the forward search, with the proviso that we may finally stack any box on the goals in BL at any point, but no other boxes are allowed to get frozen on goals not in BL. This generalises the notion of deadlock to take into account stacking order restrictions. The forward search then aims for a position with one corral the only frozen goals in which belong to BL. There would then be an intermediate stage matching up the forward and reverse search positions, either by stacking more boxes onto goals in BL or moving a box from a square in the forward search to one of the squares having boxes in the reverse search. See later for algorithmic details.
RSB (Reverse shuffle boxes) algorithm
We are given a position P, with given man corral, a free space arrangement BL, some of whose squares FL are already occupied by boxes, and a box B which currently can't be pulled to a square in BL despite there being empty squares in BL. Our aim is to rearrange the boxes in FL to squares in BL so that B can be pulled onto an empty square in BL. If so we return True for success and the variation rearranging the boxes and moving B.
Step 1
Iterate over squares S in FL (currently covered by boxes in P). Copy P to Q, and delete the box on S. Compute the pull access for B within the given man corral in Q. If it now contains an empty square in BL (including the newly-empty one S), make a pull route to the first accessible square in BL, call it V, and go to step 3.
Step 2
Skip to the next S. If we run through all S in FL, terminate returning False for failure and a null variation.
Step 3
Make a null list of pull routes W. Iterate over empty squares E in BL other than the destination S2 of V. Let R be a copy of P. Delete from R all the boxes in FL except for S2, where we add a box. If there is a pull route PR from S2 to E in R, make a list L of squares in FL which intersect with PR (here meaning just the squares the box is pulled over, ignoring the man moves), in order of increasing size of the variation to that square in PR, and go to step 4. Otherwise go to step 2.
Step 4
Let P2 be a copy of P. Iterate through squares SB in L in reverse order; for the first one, try to find a pull route to E in P2. If not, go to step 2. If successful, add this pull route to the front of W, and move the box in P2 from the start to the end of this pull route. Now replace E by SB and skip to the next SB in L in reverse order. If we were successful at every SB in L (otherwise must have returned to step 2) our W contains a sequence of pull routes shuffling the boxes in L to free space. Add the pull route V to the front of W. Notice that the position of P2 and P match in terms of V applied to P giving the (last) position P2. Terminate returning True for success and W.
Note: RPR is intended as a quick-and-dirty algorithm which will terminate after a few seconds with modest memory overhead for a medium (and even quite large) puzzle. It finds the first free space arrangement it can (using a distance heuristic, doesn't look for alternatives), moves boxes only by single-box pull routes direct from goal to free space (no intermediate stages and, except for shuffling, the boxes on free space don't move again) and pulls out boxes in an arbitrary and fixed order (doesn't look for alternative orders). It does some shuffling when no direct pull route is found, moving around the free space boxes it has already pulled out to make space for another box, but this shuffling is also limited to single-box pull routes between free space squares and an arbitrary fixed order of boxes and squares found from which ones intersect a (phantom) pull route with free space boxes deleted. It does try potentially several phantom pull routes, but sequentially rather than in a tree, and keeps only the first one for which shuffling was successful. RPR calls RFS and RSB. RFS also calls IADOL.
Such an algorithm will have extremely variable results. It is intended as the first stage of a reverse search, to identify "easy progress" towards the ideal reverse spreadout position with a single corral and no push-frozen boxes on goals. I envisage it being followed up by more lengthy searches to address its shortcomings. For example, a tree search which allows the SBL boxes to move by individual pulls one square at a time, (or not at all except via pull routes direct to free squares if this can reduce len(SBL); a subsequent search could allow boxes in SBL to move by individual pulls once we've minimized len(SBL)) other boxes to move only by single-box pull routes direct to the free space identified by RPR, (only to the first accessible square in the ordered list) and then not to move again with the exception of RPR shuffling. Such a tree search could terminate after a fixed overhead of time and memory. The idea is to make "easy progress" in between more exhaustive searches at the sticking points.
More precisely (sketch of algorithm RPRT, RPR with tree search): (a) RPR produces variations moving goal boxes to free space, possibly via shuffling, returning a list SBL of unmoved boxes. (b) tree search using RPR pull routes direct to free space or via shuffling scores a positive constant times the sum over boxes on free space multiplied by pull layer depth of the box moved there (discounting intermediate shuffling variations - we'd need to keep track of where each box came from by numbering boxes within layers) (pull layer depth is the # of layers of boxes initially pullable from the given man access which have to be iteratively deleted to give access to successive layers of pullable boxes.) SBL boxes which are push-frozen score minus pull layer depth times a positive constant. External corrals score a large negative constant. This may decrease the # of push-frozen boxes in SBL and the number of external corrals. (c) tree search as in (b) but remaining push-frozen SBL boxes (only) get single pulls. This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. (d) tree search as in (c) but all remaining SBL boxes (only) get single pulls (in case the non-frozen SBL boxes need to be manoeuvred out of the way with single pulls to permit more frozen SBL boxes to reach free space). This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. Note that the rationale behind this sequence is to limit exhaustive search (single pulls) by using the smallest # of push-frozen boxes in SBL obtained in the previous steps.
Note that if RPR combined with the tree search(es) is successful, it provides a stacking ordering (actually, with variations and shuffling, even more information than this). This ordering information can then be used to generalise deadlock detection in the forward search (see the previous section) and prune stacking moves that don't correspond to the given order. Even when RPR-plus-trees fails, it may provide a partial stacking order: the goals in SBL can be stacked in any order first, and then the remaining goals in the determined fixed order.
Maze free space
The RFS algorithm generates free space within existing corrals. When applied to a standard Sokoban puzzle such as those in the XSokoban 90 collection, in a solved position the boxes are usually packed tightly together in goal rooms with lots of empty space outside, so even when there are several corrals the utility of the "corral" free space arrangements seems obvious. But if we were to apply the RFS algorithm to the forward search, the start position of the puzzle may be such that there are only a handful of free space positions within the man corral and all adjacent corrals. Thus it might seem to be of limited use. One generalisation which applies to both directions of search is to supplement the corral free space arrangements by deleting all the boxes and then extending the free space arrangements within the new space that opens up. By choosing a square within each sink in the reversible region digraph, we can preserve the property that the free space arrangement is computed from merged pull accesses of each of these sink squares, and thus preserve the depth heuristic in the maze free space arrangement. For the forward search we choose source squares and get the same result.
However, having a maze free space arrangement complicates the situation as now, when we reinstate the boxes, some of the boundary boxes, for example, may already be on free space squares but without having the single corral property of occupied free space arrangements. Also, some boxes may now be adjacent to free space squares, so that if a box is pulled to one of these free spaces it may create a new external corral.
Forward search
Analogous algorithms to RPR and tree searches can be applied to the forward search. Take the pushable boxes with an accessible push square in SP to be the push analogue of PB. The initial aim of forward search is to open up the position to a single corral. The tree search score heuristic could value initially inaccessible boxes higher, as with the reverse search. A large fixed score penalty for each external corral would favour the aim. The way corrals are divided up into free space may be a little different since we would have the free space in a corral from the corral's pushable boxes accessible within the corral, but we'd also have free space from pushing in boundary boxes from a neighbouring corral (the latter combined for all neighbours is the usual reverse free space for the corral).
Relocation subtasks - reversible region digraphs
Assuming the reverse search succeeds in producing a single corral and a complete stacking order, and the forward search succeeds in producing a single corral, we next apply the dynamic reversible digraphs for each stacking order position which results in frozen (or possibly corral deadlock frozen) boxes on goals. We use the reverse variations to identify how many boxes are in each dynamic reversible region at each stage. We then want to match up the forward and reverse variations as far as these region numbers go. We can use free space, RPR and tree search, with fixed score penalties within each region with box number differing from the aim. We could try both forward and reverse relocation in stages matching up the most progress in either direction.
Relocation subtasks - matching forward and reverse positions
Assuming the dynamic reversible relocation is successful at the stage comparing reverse and forward single corral positions before (final) stacking takes place, the next stage is to exactly match the forward and reverse positions. This could be done with the same relocation algorithms, with box positions for the other search ("box aims") added to free space, and score bonuses for positions with boxes on box aims. Note: the rationale behind separating relocation into two stages is that it is easier to match any arrangement of a # of boxes within a region than to match exact positions, but once the regions have the right #s of boxes it may be possible to rearrange them mainly within regions for an exact match - again we're trying to limit the amount of exhaustive search (single pulls). It is also useful to see how far we got at each stage in case of only partial success - further heuristics and algorithms could be tried at each stage.
More on Relocation subtasks
I envisage relocation being carried out with no external corrals for ease of access. However, the possible existence of circuits could make relocation even easier. I don't have a precise alternative algorithm currently making use of circuits. Relocation can still be carried out with external corrals, but more access restrictions make it less likely to succeed.
One-way doors (definition)
A "static one-way door" (s1-door) in the maze M of a puzzle P consists of empty squares C, O, E such that:
- (a) O and E are man-adjacent to C at right angles
- (b) with a box added to M at O there is a push to C
- (c) with a box added to M at C there is a push to O (reversibility)
Note that (b) and (c) require further empty squares either side of O (OP) and C (CP) with all five of these squares distinct.
- (d) with a box added to M at C and at E, and the man at CP, the man access includes O
This means, after a push from O to C as in (b) the man is able to push the box back to O as in (c) ("going through the door") without using the exit E or pushing other boxes.
- (e) neither C nor O are dead squares in P
This means we can have a box in the door without deadlock.
- (f) with a box added to M at E and at O, and the man at C, the man access does not include OP.
This means that after going through the door, without using the exit E or pushing other boxes the man can't reach OP ("one-way condition"). In practice the E square will be empty but there could be further s1-doors blocking man access via the exit E. If their direction of travel is consistent the man can go through the doors in succession and possibly return to OP.
- (g) with a box added to M at C and at O, and the man at OP, the man access includes E
This keeps open the possibility of the man returning to OP after going through the door ("cyclicity").
The case of a two by three rectangular room with perpendicular exits, suitably arranged, gives rise to a s1-door with E the wide side exit and OP the other exit. C stands for closed, O for open and E for exit. Perhaps there should be a further condition to indicate that trying to traverse the s1-door the wrong way gives deadlock. The problem in general is such a deadlock might not be detected - though it might in the case of 2x3 rectangular rooms.
Here is an example of s1-doors. The four corner rooms house occupied s1-doors with boxes on the O-squares (the doors are "open"). The C (closed) and E (exit) squares are marked.
example of one-way doors ##### # # # C$ ########## ##E# # @ # # ## # EC$# # $ # # # $$ $C # # $###.#####E## # ## ....# # ## ##..... # ### # $######e##$ # # CE $ $C EC# # # # # # # ################
One-way system (definition)
A "simple static one-way system" (s1-way) W is an ordered list of at least one s1-door(s) such that:
- (a) if s1-door D'=(C',O',E',CP',OP') follows immediately after s1-door D=(C,O,E,CP,OP) in W, with boxes added to M at O and O' and the man at C, the man access includes OP' but not OP or E'
- (b) if D is the first s1-door in W and D' the last then with boxes added to M at each of the s1-door O (open) squares and the man at OP, the man access includes C'. ("cyclicity")
Note that with boxes as in (b) there are len(W) - 1 external corrals inside W and the man can traverse W by pushing each box from O to C (open to closed) and back again in turn, so going through all the doors, leaving W in the same state. Condition (b) means the man can return to his starting square. It is possible an s1-way forms a loop, in which case an extra condition (a) applies where we consider the first s1 door to follow the last, but in most cases there will be a well-defined start and end.
In the earlier example, the two s1-doors at the top right corner and bottom right corner form an s1-way.
Provided the traversing variation is left available, one can for some purposes treat the boxes in W as if they weren't there, leaving a single corral, as regards man access in a puzzle including W with boxes as in (b). See "one-way access", below for a precise description.
Note: a s1-way might have some of the O-boxes away from the O-squares (on the C squares for instance). To make things simple, so that W doesn't change state when traversed, we'd first traverse W pushing the O-boxes onto O-squares if this is easy - if not W is not correctly occupied for traversal, and if not deadlocked we'd first have to find variations to correctly occupy W. If W is completely empty, variations to fill it would normally be easy, pushing the last O-box in first, and then using the W traversal squares to push in the other O-boxes in reverse order, only provided the first OP square is accessible. But this requires that W is push-traversable in entirety, which may not be true in general. The definition only requires man-traversability and reversible pushes between the adjacent O and C squares. And W might be partially, excessively or incorrectly occupied before we get a chance to use it, and even in the static case with 2x3 rooms the variations to correctly occupy W may not be obvious, due to the possibility of creating a deadlock due to W-external boxes if trying to push a box out of W. So in practice use of one-way systems in a search will be restricted to using the O-squares as targets for boxes and navigating (with macros to do the two pushes to go through the door) when all the boxes are on O-squares.
Note: Despite the s1-door definition being designed to make cyclicity possible, it is possible that an arrangement of boxes on O-squares of a collection of s1-doors does not lead to a "zone cyclic" position where the man can return to his starting square after going through s1-doors. It should at least be possible for the man to reach any zone by going though s1-doors, but if some s1-doors follow others in an inconsistent direction of travel the man could end up stranded.
Note: There may be more general one-way systems or traversable corrals utilising more than a single reversible push of one box to move between corrals. There are, for instance, "valves" which function something like s1-doors but require two boxes to move. There can also be traversable corrals which have a different state (due to boxes not moving back to their original positions) after traversal. There can be intertwined (overlapping) one-way systems. None of this complexity is being addressed here. A computer solution of medium to large puzzles featuring even standard 2x3 room 1-ways significantly, many of which will have too large a tree for conventional solvers, I consider would be a significant advance.
Double-box s1-door
If a position P has two boxes occupying an s1-door (such as one on O or C and one on E) it can't be used for one way traversal until one box (like the one on E) is exited. If the s1-door is push-traversable even with two boxes, by push-reversing one box and pushing out the other to free space, simple variations should normally achieve this converting it to an occupied s1-door. If the s1-door is immediately followed by a vacant (in P) s1-door so the two form an s1-way, using the second O-square as free space makes the s1-way occupied and traversable, thus increasing access (see one-way access, below) without increasing free space occupancy, so it would perhaps be worth doing this in RPR(T).
One-way access (definition and algorithm)
If WL is a list of s1-doors (belonging to one or more s1-ways) with boxes on the O squares, make a dictionary D with keys OP squares and values EP squares. When calculating man access to push a box outside the s1-doors around a push route, when a square in the D keys (which can be more efficiently checked with another dictionary) is accessed, the value is used to add the EP square to man access - this would be accessible by traversing W. The extended access is "one-way access" (1-access) relative to WL. The access can be calculated twice, first without the WL "teleports" to avoid unnecessary use of WL. The relevant variations to traverse the s1-doors are also stored ready to insert into push routes when needed. Note that it's not important which s1-way we use to achieve access (use an optimiser to find the minimal choice).
Note: s1-doors and s1-ways obviously apply to the reverse mode also, with the order in W reversed and pulls replaced by pushes. Note: normally we'd only use the beginning OP to "teleport" the man to the end OP, but in some more general s1-ways than those with 3x2 rooms connected directly by straight tunnels there might be other boxes than the O ones within W so we might want to move them. It is also possible the definition could be extended to dynamic 1-ways (where some of the access obstacles are boxes rather than walls, though it could be problemmatic to work out the level of occupancy by boxes of the 1-way) where other boxes than the O ones within W being moved might change access restrictions in W and it might be possible to move such boxes between different s1-door corrals. Note: it might seem redundant to use one-way access from a position where W is not "end-blocked" (the first and last OP squares are both accessible). However, consider pushing a box outside W around a push route; the box might end-block W en route, and access to change push direction might then require traversing W. This could even occur with W only containing a single s1-door, and thus not contributing external corrals. In this special case using W actually increases access for push (or pull) routes in the single corral situation envisaged for RPR(T).
Assuming that trying to push out O-boxes from W in a puzzle P directly produces deadlock, a human solver can use the 1-way access to rearrange other boxes (temporarily end-blocking W and traversing W to regain it) until clearing W becomes possible. Thus W functions as an adjunct to free space in terms of computer solving, just as arrangements with doors or multiple non-deadlocked corrals may do. I think W traversals, along with rearranging other boxes and temporarily blocking access to the end of W generalises shuffling, though I haven't got a precise algorithm yet. More work is needed on the details of s1-ways before they can be added to SOLVE. At present they could only be used in the case of single s1-doors or push-traversable s1-ways which are easy to occupy or vacate as free space O-squares in RPR(T). Even then it isn't clear how to modify free space from RFS to take account of detected s1-ways; adding the detected O-squares to free space before applying RFS is one option, if we keep the true free space (within a single corral) separate. We'd have to look at the free space arrangements with and without occupied O-squares to be able to still apply the single-corral condition. Variations to occupy and vacate s1-ways where possible with methods similar to RPR(T) are needed. A restriction to reversibly push-traversable s1-ways would be a start so that only W-exiting deadlocks are a problem: which can be solved with a free space external to W and reversibly push-routable to an end of W. This can generalize to push-traversable s1-ways, with a free space outside W pushable to the first O-square and one pushable from the last O-square.
Dynamic 1-doors, 1-ways and 1-access
We can generalize s1-doors to the dynamic case (d1-doors: arbitrary positions and some access restrictions due to boxes other than O-boxes) assuming boxes on O-squares ("open-occupied") by checking that the squares C, CP, and OP are empty before applying a version of the s1-door conditions (a)-(f) to the resulting position rather than the maze:
- (a) O and E are man-adjacent to C at right angles
- (b) the box at O can be pushed to C
- (c) after the push in (b), the box now at C can be pushed to O (reversibility)
Note that (b) and (c) require further empty squares either side of O (OP) and C (CP) with all five of these squares distinct.
- (d) after the push in (b), with the man at O, the man access includes CP
- (e) neither C nor O are dead squares in P
- (f) with the man at OP, the man access does not include C.
These conditions are different than those for s1-doors because we don't know which boxes are acting as obstructions creating s1-door access restrictions. We can't be sure the d1-door is part of a dynamic one-way system. All we can do is see where macros going through the doors takes us after identifying d1-doors and adding them to the s1-doors. Also note there is no dynamic analogue of the s1-door condition (g) - cyclicity is not addressed at all. The d1-door condition (f) is more restrictive than the s1-door condition (f) since we have to enforce the one-way condition and there is no point in moving the O-box if there is already access to C and so the exit E.
Note further that d1-ways have the same definition as s1-ways with s1-doors replaced by d1-doors. Detection of open-occupied d1-doors (potentially with four complex checks for each box in the puzzle) could be a significant overhead we don't want at nodes in the RPR tree searches, so we might want to restrict it to the start or solved positions, and in computing free space if using it at all. Also, boxes determining dynamic access restrictions required for a d1-way could get moved during SOLVE without the program "knowing" - though that just makes things easier so perhaps it can be safely ignored.
In the earlier example there are two d1-doors, one in the room above the goal room top exit, and one in the room below the goal room bottom exit. They're marked as before, the box is on the O-square again. Note the lower case "e" is used to indicate an E-square containing a goal. Also note that there is a d1-way using the two s1-doors at top left and bottom left and the lower d1-door. This position is "zone cyclic" - by going through one-way doors the man can reach every zone and also return from any zone to his starting position.
More on s1-ways and free space
How can we detect and use s1-ways as an adjunct to free space?
First consider how to identify all s1-ways in the maze M. Loop through A-empty squares S in M not belonging to the Reverse exclusion zone (assuming we're aiming to use s1-ways in reverse mode from the solved position for clarity). Make the four possible checks for orientations of s1-door with C square at S and O square an adjacent pull square to S. That finds all s1-doors, from the definition. Add boxes at the O and E squares of all s1-doors and compute all man access corrals, and determine if there is a free space square in each corral. Applying condition (c) of s1-way definition we can determine if each man access corral connects adjacent s1-doors in an s1-way or not, and the order they come in. Putting together this info we find the s1-ways in M. We also can store the access squares trapped in corrals between succeeding s1-doors in the s1-ways.
Now suppose we have a solved position FP with given man corral and compute all man access corrals in P. In calculating free space within each corral, we normally look at A-empty squares S within merged pull access where addition of a box at S (making position P) isn't a detectable deadlock (ignoring the fact that there is one more box than goals) and, when computing all man access corrals within P, the total # of corrals in FP and P is the same.
To add s1-ways, apply the same conditions. So O-squares only get added if they are individually free squares in the usual sense. The change comes when we look for arrangements of many free squares in RFS. So we start with a position P1 with several free space squares occupied by boxes and are looking to add another. Firstly, the s1-door O squares are put at the front of the queue for new squares, so several of the free space boxes in P1 may be O-squares. We compute all man access corrals in P1 (MA(P1)). But we then compute 1-way access WA in the given man corral, using the s1-door "teleports" belonging to free space where they are applicable (meaning, where we can reversibly pull the box on the O square to the C square within man access starting at C). First we require no more 1-way man access corrals (relative to the given corral) than there were in FP. Second, from MA(P1) we move the man to each external corral EC such that the A-empty squares in EC belong to WA in turn. We then calculate 1-way access from there; the next condition is that this includes an A-empty square from the man corral (so we found our way back to the start).
Note that boxes on s1-door O-squares on the boundary between the given corral and an external corral may have to be treated differently. It's not enough to have one such to merge at the level of one-way access; a trip around s1-ways is needed to lead back to the given corral. Such a trip could involve s1-doors within and on the boundary of multiple corrals. So perhaps we should first calculate 1-way access, using all the detected s1-doors whereever they are, from each external corral adjacent to the given corral, seeing which lead back to the given corral to determine which corrals we can "1-way merge". Hopefully RPR(T) will result in boundary boxes being pulled to O-squares in free space (perhaps from adjacent corrals accessed by an sequence of s1-doors), setting up such mergers. So RPR(T) would need to check 1-way access from the end of its pull routes, in case these end in an external corral due to s1 "teleports", and in case the pull route destination is an s1-door O-square.
Future development and problems
The biggest weakness with RPR is probably the arrangements of free squares found by RFS. Better arrangements, packing more boxes into free space, may well exist than are found, and for more difficult puzzles they may be essential to the solution. I have no workable generally-applicable alternatives at present. The next problem is that RPR relies on free squares in the first place; more difficult puzzles may require navigable positions with multiple corrals. A step towards dealing with this extra complexity is to identify and use static one-way systems - those inherent in the arrangement of rooms in the maze, as discussed earlier.
This brings me to another problem with the RFS free arrangements: in the last stage, where circuits are removed, there is an awful lot of choice of ways to do this (much of which might be relatively unimportant in terms of the number of boxes packed into free space). However, some of the choices (due to convenient ordering) are likely to facilitate shuffling better than others, since the free space squares which cut down the circuits are introducing steadily more access restrictions. Again I have no workable alternatives at present, other than the obvious: restrict the number of candidates for free squares to one within straight tunnel sections.
Another way to introduce extra corrals is to generalize the notion of free squares to include "doors" for each extra corral. When checking a square for the free square list, usually we skip it if it increases the number of corrals. But we could check if it creates just one more corral and the square the box was pulled from initially still belongs to the man access, and we have access to the opposite pull square from the last one used in the pull route. We could allow only one such "door" per extra corral, or allow an unlimited number (perhaps setting up significant problems in later matching the reverse and forward search). The idea is that we block off an area by closing the "door", but it is possible to open the door by pulling in the opposite direction later to make the arrangement accessible again. Perhaps this idea should be combined with reversibility : we require the door box to be pullable back to the goal square it started from (at the time of setting up the door, and regardless of which pull square we use to return it beyond it being an accessible one).
A third way is to introduce some exhaustive searching for small rooms where we calculate all the different arrangements. We'd have to check the arrangements were achievable somehow, perhaps by successively peeling off "doors" until we had a single corral, and then put the "best" achievable arrangement (most boxes) into RPR.
A simple implementation of SOLVE
- generate a maze free space arrangement, FA, for the forward search, incorporating static and (some) dynamic one-way door open-squares.
- perform a tree search forward from the start position as deep as resources permit as follows:
- expand single pushes of each box which isn't on a free space square from FA
- expand macros where each box which isn't on a free space square from FA is sent on a single-box push-route to a single one of the empty free space squares of FA it can reach, chosen using a distance heuristic as furthest away (with a random selection as tie-breaker).
- if the box has no push-route macro as above, look for a shuffling macro, as in RPR, with the initial target obtained after deleting the occupied FA boxes chosen using a distance heuristic as closest (to minimise the number of boxes to shuffle).
- expand macros where each box which is on an open-square O of a one-way door in FA with the man having access to OP but not C has the push from O to C followed by the push from C back to A, with the man now in a different zone, forming the macro provided OP, CP and C are empty. Note that the box on O is returned to its original square by this macro, which is intended to promote use of one-way systems in FA. We don't insist that E is empty as one could go through the door to push a box on E out.
- score nodes with a bonus for each occupied free space square.
- give a score penalty for the number of zones above one in the position obtained by deleting boxes on FP open-squares of one-way doors. This heuristic approximates how far from a zone-cyclic position the node is. The current highest scoring node can be assessed more accurately for zone-cyclicity using one-way access.
- retain a highest-scoring node, FN, chosen at random.
- generate a maze free space arrangement, RA, for the reverse search (possibly one for each zone of the solved position).
- perform a tree search in reverse from each solved position with a choice of man corral, which is the reverse analogue of the forward tree search above using RA for free space.
- retain a highest-scoring node, RN, chosen at random from all these reverse searches.
- perform a forward tree search (forward relocation) from FN as deep as resources permit as follows:
- expand single pushes of each box whose square in RN is empty.
- expand macros where each box whose square in RN is empty is sent on a single-box push-route to a single one of the empty squares whose square in RN is occupied, or is in RA, chosen using a distance heuristic as furthest away (with a random selection as tie-breaker).
- expand one-way door macros from RA open-squares as indicated above.
- if the search reaches RN, put together the variations (converting the reverse variations to forward ones) into a solution.
- perform a reverse relocation search from RN to FN, using FA, the analogue to the forward relocation above.
- if the search reaches FN, put together the variations (converting the reverse variations to forward ones) into a solution.
complexity, tree size and branching factor
The above implementation is aimed at a limited search which explores deeper (using macros) than an exhaustive single-push tree search (e.g. depth-first search or DFS) whilst having no worse a branching factor than DFS and exploiting free space arrangements, not only as macro targets, but for their approach to zone-cylicity - which enables a relocation stage with simultaneous access to all empty squares, and hence all boxes. Each of the several searches is permitted to expend more time per node than a DFS, in line with the greater expense of macro single-box push-routes than single pushes, but is intended not to expend significantly more time on nodes than this implies, or to allow a situation where the search slows down, taking increasingly more time for deeper nodes. In short, it aims to be competitive computationally with DFS.
Here are some basic complexity arguments to help justify this claim. First, DFS in Sokoban has a worst-case branching factor of 4N where N is the # of boxes, since each box can push in at most four directions. Thus macros need to be competitive here, which requires only one macro target per box - note that the final push of the macro can be in at most four different directions, with the same 4N maximal branching factor (in the event that the target push-squares belong to different zones). Not moving boxes from targets (except during shuffling and one-way door macros) further reduces the branching of the macros and single pushes as the targets get occupied, which hopefully makes it competitive with DFS (rather than an 8N branching factor which would make the tree too large) - also, the single-zone property of free space arrangements (discounting the one-way systems) tends to make the branching come down to N for macros as all the push-squares of targets tend to belong to a single zone, though admittedly the average branching factor of macros also approaches the maximal one for the remaining boxes not on targets as the position approaches zone-cyclicity. The fact that shuffling macros are only invoked when direct push-route macros fail, thus are in general limited to more densely-packed arrangements, helps to offset the greater expense of shuffling over push-route macros, though of course this is not guaranteed. The furthest distance heuristic for push-route macros aims to help with packing of course, and the shortest distance one for shuffling aims to minimise the # of boxes in shuffling and so minimise the extra expense of shuffling over routes. In more densely-packed situations shuffling will tend to fail, thus mitigating the extra time expense by not leading to an expanded node.
Machine-learning
Obviously SOLVE is limited in effectiveness by the intelligence (or otherwise!) used in selecting macro targets and applying macros. Also, a side-effect of the furthest-distance heuristic is that boxes may be pushed all the way from a source to a sink in the flow digraph, which can make relocation problems hard to solve. In the distant future maybe one could envisage an iterative procedure whereby meta-analysis of the previous SOLVE searches (which you might consider to be learning or high-level planning) leads to successive search iterations dealing more effectively with flow and relocation bottlenecks. Information concerning maximal and minimal numbers of boxes within areas like flow regions during searches could be utilised in this regard. Meta-analysis could also help with those puzzles whose solutions don't conveniently split into the three stages in the order of the SOLVE search, so that the solving method would have some kind of meta-analysis analogue of the flow region digraph, whereby individual subtasks like opening up, relocating, or even closing up, are applied to different areas of the puzzle and there are order restrictions between subtasks and areas.
A possible way to improve free space arrangements is via "deformation" of corrals. Consider boxes on the boundary of the given man corral with a push square in a neighbouring external corral. If pushing such a box away from the man corral is possible without increasing the # of corrals (or a detectable deadlock) we can do this to open up more space in the man corral. We can do this for all suitable boxes on the boundary. Of course, this extra space comes at the cost of restricting free space in neighbouring corrals. Suppose we've applied RPRT in the forward direction and only have partial success in occupying free space and reducing external corrals. Wherever we had the most success in occupying free space, we try deforming outward (pushing) those corral(s) at the cost of neighbours with less occupied free space. I don't have a detailed algorithm but envisage adding deformation pushes as an intermediate between push routes to free squares and exhaustive search. Hopefully we can add to free space in the successful corral(s) and again hopefully that will increase the occupied free space.
Another problem with the the RPR(T) free space is it is calculated at one time for a single position (any solved position in reverse and SP in forward). As variations are applied occupying some of these a priori free space squares, the position will change dramatically with respect to the SBL boxes, some of which may have moved by a sequence of single pulls. This may open up new dynamic free space squares not calculated a priori. For example, in a classic puzzle with many goals packed tightly into a single goal area, pulling out a number of boxes might open previously occupied space in the goal area, which is unnoticed by RPR(T), but might be needed for a solution. We could simply tack on a new free space calculation at the end of RPR(T) and run it again at the end of the previous variation to utilise the goal area space. That may work in some cases but doesn't address the issue of dynamic free space squares opening up fleetingly at any stage of the reverse variation, which may need to be used in more complex puzzles. Trying to deal with this in general by adding all dynamic free space squares in with separate subtasks risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.
Note that the search is in the following fixed sequence (sketch of algorithm SOLVE): (a) open up to single corral with both forward and reverse variations. (b) apply reverse stacking order to find dynamic reversible region box # restrictions and relocate to match #s. (c) relocate to match forward and reverse variations into a solution. Obviously not all puzzles will divide up into this sequence neatly. Apart from one-way systems requiring external corrals, in practice the puzzle may require partial opening up, relocation, closing up, stacking and other variations in arbitrary sequence. This risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.
Another problem is unmatchable forward and reverse variations (which may be related to more subtle dynamic restrictions than I have described). For example, as a human solver and setter I've often found forward and reverse variations which match exactly, except either the man is in the wrong corral or just one box is displaced one square, but requires a pull rather than a push in forward mode to match. Sometimes a little more work to the end of either forward or reverse variation produces a solution but often a completely new approach is necessary, suggesting the forward and reverse variations may be "unmatchable". Perhaps some form of machine learning is required to deal with more difficult puzzles, or perhaps progress can be made with a relatively simple parallel search of multiple subtasks with some connecting heuristics.