DHolland S3

From Sokoban Wiki

Jump to: navigation, search

Contents

Section 3 Algorithms

In this section I attempt to make precise (programmable) some of the ideas in the previous sections.

Suppose that there are N goals. Let P be a Sokoban position.

Isolated square (definition)

A square in P is "isolated" if the eight adjacent squares (horizontally, vertically and diagonally) are all empty.

A-empty (definition)

A square in P is "A-empty" (access empty) if it is either empty (possibly containing a goal but no box) or contains the man. Squares that are not A-empty contain either boxes or walls (possibly boxes on goals).

Man access corral (definition)

A dictionary with keys squares S belonging to P and values -1 (if the man has no access to S) or 1 (if the man does have access to S). Can be implemented iteratively or recursively, starting with the man's square and moving to adjacent squares horizontally and vertically. More generally, external corrals could enter into the dictionary with values of integers above 1 (one for each corral). The A-empty squares in a corral are called the "corral access".

BD: Implementation-wise, it wastes time doing it this way, where the calculation is required to initialize all squares on the board with "-1" (or better with "0") before the flood-filling calculation of the reachable squares. Instead, it's better to use timestamps. This requires that the area reserved for the calculation is global. Each invocation of the calculation increases the timestamp value, and then all the reachable squares are flagged with that value, thereby making the reachable squares distinguishable from the rest. That way, the squares in the global area only need to be initialized to "0" before the first calculation and when the timestamp overflows


One-box maze (definition)

The maze M was defined in Section 2 (delete all the boxes from P to get M). For a square S of P which does not contain a wall, the one-box maze for S is the Sokoban position OBM(S) obtained by adding to M a box on square S.

Dead squares (definition and algorithm)

These are the A-empty squares from which a box can never be pushed to a goal. They are found by iterating over A-empty squares S, forming the one-box maze OBM(S), and computing the push access PA(S,PS) for each push square PS of S in OBM(S). If none of the PA(S,PS) contains goals, S is a dead square. This algorithm and concept is well-known but is reproduced for familiarity of the concepts in other algorithms below.

Frozen boxes

A box B such that: (a) no pushes of B are possible due to blocked push squares (at least one of the left and right squares is blocked by a box or wall, and at least one of the up and down push squares is blocked) and (b) Each box adjacent to B on a push square such that the opposite push square is not a wall satisfies (a) and (b) in place of B is "frozen". Evidently, frozen boxes can never be pushed again and a frozen box not on a goal is a deadlock. This is the familiar, recursive definition (apologies if I oversimplified it), and generalizing is problematic for boxes with pushes since pushes which currently lead to deadlock might not do so after other boxes have moved. Even so, here is a generalization. A box B is "corral deadlock frozen" if either it is frozen or every push P of B satisfies either: (c) After deleting all known frozen or corral deadlock frozen boxes, except B, P is a detectable deadlock, or (d) B is a boundary box of an external corral C. With all other boxes outside C removed, the push square to perform P is man-accessible and after P, the position is a detectable deadlock. Moreover each of the other boundary boxes of C satisfies (d) in place of B. Note that the definition is iterative as we start with frozen boxes and may uncover some corral deadlock frozen boxes with each iteration, so we may need to check each box several times, each time the number of deadlock frozen boxes increases. As well as extending corral deadlock detection one ply, it also applies to frozen corrals with all boxes on goals, which is not a deadlock, but may enable detection of deadlocks as other boxes are pushed adjacent to the corral.

The definition of frozen extends to reverse mode in an obvious way. Pull frozen arrangements often have boxes two pull squares apart and are often a feature (obstacle) in stacking problems. Corral deadlock detection also extends to reverse mode in an obvious way, with the man inside the corral unable to pull his way out. With goals moved to the start position one must exclude from deadlock arrangements with all boxes on goals exactly as for the forward mode. Further note regarding frozen and corral deadlocks that straight tunnel sections can be treated as one square long to generalize.

Reverse exclusion zone and push-reachable zone (definition and algorithm)

In trying to solve a puzzle in reverse by pulling boxes, we must avoid dead squares, since a position reached in reverse with a box on a dead square can never be reached from the start position by pushing boxes (since we exclude the detectable deadlock of pushing a box onto a dead square). In other words, allowing this would prevent the forward and reverse searches from being able to meet. We can generalize this idea to the "reverse exclusion zone". In addition to the dead squares, the reverse exclusion zone is defined to include all A-empty squares S (in the start position SP) such that S does not belong to the SP "push-reachable zone". The algorithm for determining the push-reachable zone PRZ for SP is as follows. Let PRZ be a null list of squares initially. Iterate over the squares B in SP containing boxes. Let OBM(B) be the one-box maze for B. Compute the push access PA(B,PS) for each push square PS of B in OBM(B). For each square S2 in PA(B,PS), if S2 is not in PRZ add S2 to PRZ. Terminate returning PRZ when we've finished all the iterations. A typical example of a square in the reverse exclusion zone which is not a dead square: consider a T-intersection whose single exit (not the paired exits in a line) turns into a bent tunnel which has another exit. Assume the first square S in the bent tunnel adjacent to the T-intersection is A-empty in the start position SP, and that the T-intersection and adjacent section of the bent tunnel (including the corner) contain no goals. Because of the T-intersection, it is impossible to push a box onto S (a wall is on the push square). Yet, in a suitable OBM it is possible to pull a box from the T-intersection onto S, and because of the bent tunnel's other exit also possible to pull a box from elsewhere onto the T-intersection and change the man position ready to pull it into the bent tunnel. So S belongs to the reverse exclusion zone without (necessarily) being a dead square.

Goal trap (definition and algorithm)

A trivial example of a goal trap is a goal in the corner of a room: once a box is pushed onto the goal it is frozen ("trapped"). The notion of goal trap generalizes this idea to the case where the box is not frozen, but cannot reach all of the push-reachable zone PRZ for the start position SP. To determine the goal traps, iterate over goals G in SP. Form the push access PA(G,PS) for each push square PS of G with a push from G in the one-box maze OBM(G). Merge these push-accesses for G into a list GT(G) (the goal trap). Whenever GT(G) is just PRZ it conveys no further information. However, if GT(G) contains fewer than N goals (say N2 goals), this means that any position with more than N2 boxes on squares in GT(G) is a deadlock, since only N2 of these boxes can ever be stacked (on the N2 goals in GT(G)); the remaining extra boxes cannot reach any of the remaining goals since they can only reach squares in GT(G). A typical and simple example would be a goal on the wall of a room where that wall contains no exits and meets directly with two corners of the room (so the whole wall is entirely horizontal or entirely vertical). It is okay to push one box onto this wall, as it could later be stacked on the goal. But it is no good to push two boxes onto this wall, since only one can be stacked on the wall's goal, and the other is then stranded in the goal trap and cannot reach any of the other goals in the puzzle. Thus, the numerical information conveyed by the goal traps is the number of goals they contain, and the implied restriction on how many boxes can be in the goal trap at any time without causing detectable deadlock. Note that corner goals have empty goal trap, conveying no information, but goal traps in general can exist with higher capacity but without containing all N goals.

The goal traps may also convey restrictions for the reverse search. If there are NB (more than zero) boxes inside a goal trap in the start position SP, the reverse search cannot be allowed to remove from the goal trap so many boxes that there are fewer than NB boxes in the trap, otherwise the forward and reverse searches cannot meet. As an example, with the goal on a wall as described above, if the start position has a box on the wall the reverse search cannot be allowed to remove all boxes from the wall.

Reversible regions, sources and sinks

The notion of a directed graph (digraph) of push-reversible regions, with numerical restrictions on the number of boxes within each region, generalizes goal traps. The push-reversible regions are got by restricting push access in one-box mazes to squares to those with a push route back to the starting square. This provides non-intersecting reversible regions which contain all A-empty squares other than dead squares. The digraph is obtained by having a vertex for each reversible region and an arrow from region A to region B if and only if there is a push-route in a one-box maze from (any square of, taken to be the first by reversibility) A to B. A "sink" in a digraph is a vertex with no outgoing arrows (a "source" has no incoming arrows). Note that all sink regions must contain goals to avoid being dead squares. Thus if a sink region contains M goals, any position with more than M boxes in that region is a deadlock, generalizing the forward restriction in goal traps. More generally, for each sink S, count up the goals G in regions which have a route following the arrows in the digraph (hence a push route) to S (including S). Count up the boxes B in these regions. If B is less than G (as could happen if boxes are pushed to a region with no push route to S) the position is a deadlock. It may not be a deadlock if B exceeds G, provided G is less than the total # of goals N, because the extra B-G boxes might be pushed to other goal regions with no push route to S. Similarly, for a source S, if it has more goals than boxes the position is deadlocked. Perhaps the reverse search can have generalized restrictions - not sure what they should be.

These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later.

Flow deadlock detection

Filling a goal trap with more boxes than goals is an overflow deadlock. If we consider the area of the puzzle outside the goal trap, this area then has fewer boxes than goals so is an underflow deadlock. The (static and dynamic) reversible region digraphs let us generalise these flow deadlocks (as far is possible using static and dynamic OBM access restrictions) and detect them in the search as follows.

For each reversible (for short, flow) region give the digraph vertex a weight equal to the number of boxes (in a search position) minus the number of goals. We can ignore the vertices with zero weight: there will be a flow deadlock if and only if there is no way to share out positive weights amongst negative weight vertices with paths in the digraph from the positive to negative weight vertices following the arrows, so that every vertex gets new weight zero.

Trivially, there will be no flow deadlock if every vertex has zero weight. We can also detect trivial flow deadlocks using disconnectedness, but we need to set up a new digraph which condenses paths to single edges as follows.

The new digraph DB has a vertex for each vertex of the old digraph D with either positive or negative weight (inheriting the weight). The only edges in DB start at positive weight vertices P and end at negative weight vertices N. There is an edge in DB from p to n if and only if there is a path in D from p to n following the arrows. Equivalently, for computational purposes, if there is a push-route in the one-box maze from a square of P to a square of N (in the dynamic case we add the goal frozen boxes to the maze and then replace them with walls to get a "dynamic maze"). There is a flow deadlock in D if and only if there is no way to share out positive weights amongst adjacent negative weight vertices in DB so that all vertices have new weight zero.

If DB has a positive weight vertex with no outgoing edges, that is a trivial overflow deadlock (a goal trap). Similarly, if DB has a negative weight vertex with no incoming edges, that is a trivial underflow deadlock. These are simple cases when (the underlying graph of) DB is disconnected. A search should at least detect these trivial deadlocks.

Treating DB as a graph (mark an edge from W to V whenever there is an arrow from V to W) we can determine the connected components (start with a vertex and add adjacent vertices, then those adjacent to those, and so on). This lets us determine more trivial flow deadlocks if, for any connected component, the sum of all weights is non-zero (as could happen for a soluble puzzle through goal freezing). More generally, DB has a flow deadlock if and only if at least one of its connected components does.

Other trivial flow deadlocks can be detected if a vertex of positive weight P has sum of negative adjacent weights Q where P+Q>0 (overflow) or the analogue for a vertex of negative weight (underflow). However in general it is possible for a flow deadlock in DB to exist without being trivial.

Look at this example puzzle.

    #####
    # @ ######
    # $      #
    ## ##### # #####
#####  #  ## ###   #
#   #$    #   #  # ##
# $    $ .. $    #  #
## ##### ## ###    .#
#  ####   #  ## ##  #
#       .    #   ####
############## . #
             #####

It is easily solved. However, by freeze-stacking we can reach this position:

    #####
    # @ ######
    # $      #
    ## ##### # #####
#####  #  ## ###   #
#   #     #   #  # ##
# $      ** $    #  #
## ##### ## ###    .#
#  ####   #  ## ##  #
#       .    #   ####
############## . #
             #####

This is a nontrivial flow deadlock. The box to the right can reach any empty goal, in particular the rightmost two, but whichever of the rightmost two it stacks on the other one can't then be reached by a box (underflow). Contrariwise with the leftmost empty goal and the leftmost two boxes (overflow). DB in this case has six vertices: three of weight one (regions containing the boxes not on goals) and three of weight minus one (regions containing the empty goals). DB is connected, bipartite, all weights sum to zero, every box can reach some goal and every goal can be reached by some box.

Let us return to the general case. Excluding the trivial case when DB has no vertices (as D has only zero weights), we can suppose that DB is a nontrivial, connected bipartite digraph (hence the B in DB: the two parts being the positive and negative weight vertices), whose vertex weights are either positive integers or negative integers (no zero weights), whose directed edges go only from positive weight vertices to negative weight vertices and whose sum of vertex weights is zero. Also, every vertex of positive weight P in DB has the sum Q of adjacent vertex (negative) weights such that P+Q<=0, and analogously for negative weight vertices.

Now form a new digraph DB2 by replacing each vertex, of weight W, say, by abs(W) new vertices of weight sign(W) (we can use some of the box or goal squares in the flow region) with edges in DB2 inherited from DB in the obvious way. In DB2 there is a flow deadlock if and only if a maximal matching has cardinality less than the sum of positive weights, and we're back to bipartite matching (though possibly with reduced cardinality, i.e. whenever a region contains both boxes and goals).

After this theoretical discussion, let us consider a variety of implementations of flow deadlock detection and the relevant data structures.

The simplest approach is:

  • precalculate the matrix indicating from which position a box can be pushed to which goal (for example:

doesPathExist[positionOfBox][positionOfGoal] = true) The memory used is: one boolean for every matching. Hence, it's a boolean area of size: number of valid box squares (no walls or simple deadlock squares) * number of goals So, for a level having 25 goals and 70 valid box squares the size is: 25

  • 70 / 8 = 219 bytes

In JSoko the calculation also takes the player position into account. Hence the array would be: boolean[4][number valid of box squares][number of goals] This way the array is 4 times larger but with the benefit of being more accurate due to taking into account the current player position.

  • Calculate a first perfect matching.

After this is done every box is assigned to a specific goal it can reach in the start position of the level (taking into account that every goal can only be assigned to one box of course). Example for a level having 3 boxes: box1 is assigned to goal 3 box2 is assigned to goal 2 box3 is assigned to goal1

  • As the box positions change during the search the program must search for a new perfect matching (to be more precise: a maximal matching is computed and if it isn't perfect [not all boxes could be assigned to a specific goal] the situation is a deadlock).

BUT: of course the previously calculated bipartite matching can be reused. It's simple to check whether the old matching is still valid. If just a single box has been relocated the matching is likely to be still valid! So, let's say box3 is pushed to another position. If the box3 can still reach goal1 from this new position the example matching shown above is still valid. In this case there is no need for a new bipartite matching (just because we still have a perfect matching). Therefore it's only if the previously calculated matching isn't valid anymore that a new matching has to be calculated (even if a box gets frozen this needn't result in the current matching being invalid!)

This simple approach will detect all static flow deadlocks and it needs very little memory to store the precalculated Boolean path matrix.

The algorithm used for maximal matching in bipartite graphs can be the Hopcroft–Karp algorithm or more likely an optimised variant. Here the graph has vertices the union of the box and goal squares, and an edge between a box and goal if and only if there is a push-route in the one-box maze from the box to the goal. This graph is bipartite and a maximal cardinality matching assigns as many boxes uniquely to goals they can reach as possible - there is a flow deadlock if and only if the cardinality is less than the # of goals N. Hopcroft–Karp takes O(N^2.5) worst case to find a maximal matching.

The Boolean path matrix can be computed in O(2M(M-1)f(M)) where M is the # of empty squares that aren't known dead squares, and O(f(M)) is the worst-case cost of calculating one path.

However, it is important to detect as many deadlocks as possible. Without using more memory, this approach could make the search unacceptably slow when detecting dynamic flow deadlocks: as it stands a dynamic Boolean path matrix needs to be recalculated every time a box freezes on a goal as the freeze-stacking arrangement acts like extra walls which could make the (static) paths invalid. Due to possible undetected deadlocks, just because the search does some freeze-stacking doesn't mean it is about to find a solution. It may have to back up and explore more freeze-stacking. Without using more memory, the array recalculation would occur even when a particular freeze-stacking arrangement is encountered a second time.

The obvious extension to this approach is to store different freeze-stacking arrangements, just as potential deadlock patterns are stored and re-used in the search, using the same data structure. The index of the freeze-stacking arrangement is added to the Boolean path matrix (for example doesPathExist[freezeStackPatternIndex][positionOfBox][positionOfGoal] = true). Whilst the static array (corresponding to the start position's freeze-stacking arrangement, usually empty) can still be precalculated, the dynamic arrays are dynamically computed during the search as new freeze-stacking arrangements are encountered. Otherwise the method applies as before but now more efficiently deals with all dynamic flow deadlocks at the cost of more memory.

The use of dynamic reversible (flow) regions has some advantages, in that the cardinality of a maximal matching will often be reduced - whenever a flow region contains both boxes and goals. The data structure now includes a flow region hash table: FlowRegion[freezeStackPatternIndex][positionOfSquare]=FlowRegionIndex as well as a flow region path array: doesPathExist[freezeStackPatternIndex][FlowRegionIndex1][FlowRegionIndex2] = true Thus, we now re-use a matching if only one box moves and stays within the same region (as we can determine from the flow region hash table), which is the region analogue of re-using a matching if only one box moves and can still reach the last matched goal.

The real saving comes in reducing the cardinality of a maximal matching: for each region let the weight W be the # of boxes in that region minus the # of goals in it. Note that at most two regions can be affected when only one box moves, even when it moves to another region, so it is simple to update the weighting. If W is positive, we treat the region as having W spare boxes. If W is negative, we treat the region as having -W spare goals. If W is zero it doesn't contribute to the matching. We now use the flow region path array to match boxes (positive W) with goals (negative W) they can reach with a path using the bipartite matching algorithm.

For example, many puzzles in the XSokoban 90 have a single static flow region containing boxes and goals, which can be computed offline. For most positions even when there is frozen stacking this is still true and no further matching work has to be done to see these positions have no flow deadlock. The approach using box paths is treating the boxes and goals as points and calculating OBM paths between them, and the flow region digraph method works with whole regions and OBM paths between them.

Computing flow regions and box path arrays is very expensive and bipartite matching is expensive. To prevent this high cost torpedoing this approach and to prevent the storage cost becoming prohibitive, it may be useful to precalculate selected dynamic flow regions/box path arrays where possible. For instance, a packing order/reverse search could let us precalculate the dynamic flow regions associated with the stack-freezing arrangements occurring in the (completely ordered) packing order relative to a forward search using this order. We'd have to then not expand out-of-order stack-freezing to avoid dynamic flow region calculation in the forward search.

Note for conventional searches and Sokoban programs

Reverse exclusion and goal traps (perhaps even static reversible regions) should be implemented for offline deadlock detection as an aid to the human solver; whether goal traps should be included in a conventional computer solver (reverse exclusion certainly should be included in the reverse search) depends on the cost-benefit analysis for the solver - a judgement of how much the overhead of checking every position against the goal trap restrictions will slow the search. In my scheme of a non-exhaustive search, all three concepts would definitely be included because even lengthy deadlock detection can be valuable in a limited search; with some memory overhead (storing box locations within traps and reversible regions for each position and updating this data when boxes move) goal trap restriction can be implemented at relatively low time cost during search, though reversible region restrictions would require an offline computation of push routes between all regions that support them so the restrictions could be implemented at relatively low time cost during search.

Dynamic reversible regions, if implemented as indicated, have the side-effect of not calculating a lower bound for use in finding a push-minimal solution, since we're trying to eliminate the per-node cost of doing so. As stated in the introduction, I'm only looking for any solution to difficult puzzles and if found an optimiser can shorten it, but this is a significant issue for existing searches.

Depth Corral (definition)

A dictionary with keys squares S belonging to P and values non-negative integers indicating the access depth. Unassigned squares have value -1. Implemented in following algorithm. More generally, called an "access corral" if the values have some other meaning (cf. RFS below).

IADOL (Iterative Access Depth Ordered List) algorithm

Parameters: P, dead squares DS in P, Man access corral M in P, depth D, depth corral DC, DL (depth list of squares belonging to M), TL (total depth ordered list of squares). Implementation: Null list of squares SL. Iterate over each square S in DL. For each square that is adjacent horizontally or vertically to S, and belongs to M, does not belong to DS, and whose depth in DC is unassigned, add that square to SL and to (the end of) DL. If SL is still null, terminate returning TL, DC and D. Otherwise recurse with depth D+1 and DL replaced with SL. Usage and interpretation: Initially called with DC values all unassigned, D zero and DL a list of boxes (intended to be boxes on goals in a solved position which can be pulled from the man's corral). Each square in DL is given depth 0 (value 0 in DC). On termination D is the largest depth found and TL a list of squares in M in order of increasing depth. The depth is interpreted as distance in man moves from all the boxes in the initial DL. Idea: aim to pull the boxes out as far away from the solved position as we can, avoiding dead squares; start by depth ordering empty squares.

CA (corral access for circuits) (definition of circuit and algorithm to compute the # of circuits and circuit access) algorithm

This algorithm also computes connected occupied regions with access lists.

Compute access corrals in a dictionary D for a Sokoban position P. Let Q be a copy of P. Let NCR be the number of connected regions of occupied squares, initially zero. Let NC be the number of circuits, initially zero. Let NOL be the list of occupied squares, initially null. Iterate over squares S of P, skipping those that are A-empty or already in NOL. When an occupied square S is found, add one to NCR, add S to NOL and find the connected occupied squares (recursively or iteratively, spreading out horizontally, vertically and diagonally from the initial occupied square to adjacent occupied squares) not already in NOL, adding each square to NOL and marking the square in Q with the number NCR and adding these squares to a list CRL(NCR) (connected region list for region # NCR). Set NAC (number of access corrals) to zero and a list CRAC(NCR) to null (the connected region access corrals for region # NCR). Now iterate over CRL(NCR) looking for adjacent A-empty squares (horizontally and vertically only). Use D to identify their access corral number. If not in CRAC(NCR) add it. For completeness we can complete CRAC(NCR) but for the purpose of finding circuits (defined as those with len(CRAC(NCR))=1) we can terminate when len(CRAC(NCR))>1. For each circuit, we add one to NC and set CA(NC) (corral access for circuit # NC) to be the single access number in CRAC(NCR). By this means we identify the access circuits. We are usually interested in those for a fixed access number (usually the man-access corral) so we can iterate over NCR to find these.

RFS (Reverse free space) algorithm

Produces a (heuristically) decreasing accessibility-ordered, then lexicographically decreasing depth-ordered, arrangement of extra boxes BL in free space (no increase to the number of corrals).

Step 1

Let DL be a list of squares in P containing boxes on goals which can be pulled at least one square (without forward-relative detectable deadlock). Call IADOL with D=0, unassigned DC, TL a copy of DL. Returns a maximal depth D, increasing depth-ordered list TL and depth corral DC (which for each key S in TL has value the depth of S). Let Q be a copy of P which may vary. Compute the pull access squares for all the boxes in DL. Replace TL with an equivalently-ordered sublist containing just those squares belonging to at least one of the pull accesses for the boxes in DL. Idea: aim to pull out the DL boxes as far as we can, but the squares we fix to pull them out to must be pull-accessible, to at least one box.

Step 2

Start with a list of A-empty squares L, and null modified total list MTL of squares (from TL). Iterate through boxes B in DL in a fixed order (assumed not to matter). Compute the pull access for B in Q (assuming Q is not P, otherwise we already did it). Iterate through squares S in TL in reverse order (starting at maximal depth). If S is not in B, skip to next S. Otherwise, if S is not isolated in Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, each of which is surrounded by empty space, whose squares belong to L. Idea: we've found a depth-ordered arrangement of isolated boxes in free space; as boxes are isolated we can't have increased the number of corrals, though we'll have increased the number of circuits by len(L). This arrangement is particularly accessible as the boxes are isolated.

Step 3

Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock (disregarding the fact that there may be more boxes than goals!) or add to the number of corrals rel. to P, skip to next S. Otherwise, if adding a box to S in Q would decrease the number of circuits rel. to the previous value of Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 2. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals or decrease the number of circuits. The arrangement is depth-ordered and only marginally less accessible than the one at Step 2; we may have adjacent boxes or boxes adjacent to walls.

Step 4

Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L plus the previous BL. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock or add to the number of corrals rel. to P, skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 3. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals. The arrangement is depth-ordered and possibly less accessible than the one at Step 2; we may have reduced the number of circuits (perhaps to zero). The algorithm terminates after this step, with the last L added to BL.

Step 5

Repeat Steps 1-4 with external corrals, if they exist, moving the man into each external corral in turn, and calculating free space relative to pullable boxes in each corral.

The next stage for reverse spreadout is to find an order of accessible pull routes pulling out the DL boxes to (some of) the assigned free space squares in RFS. In other words, a reverse variation moving the DL boxes into free space. There may be more or less free space squares from RFS than there are boxes in DL. If more, we bias towards the first len(DL) of the free squares (those with the highest depth and, heuristically, accessibility), holding the remaining free squares in reserve for shuffling manoeuvres. If less, we obviously can only pull out len(BL) boxes at best. However, in the process of doing this, we may open up access to boxes which may have more access to free space in the new arrangement. For example, there might be two corrals in a solved position, and let's suppose all the goals in one room. From one corral (assumed to be one for which the puzzle is reverse soluble) we may not have free space room in the corral to pull out all the N boxes. However, we may have free space enough to pull out enough boxes to make the puzzle have a single corral, which then increases the free space access to the squares which were initially in the second corral, allowing the puzzle to be solved.

RPR (Reverse pull routes, also known as reverse spreadout) algorithm

This is the main algorithm intended for use in the reverse search, for pulling boxes out to a "spread out" position with a single corral and for establishing a stacking order.

Step 1

Start with null list of pull routes V. null list SBL (static box list) and a solved position P (copied to Q which may vary) with pullable boxes PB and apply RFS to get an ordered free square arrangement BL within the given corral. In any (arbitrary) order iterate over the pullable boxes B in PB (note that PB will dynamically vary rather than being fixed; it may increase though never decrease in size). Look for the first square S in BL which belongs to the pull access for B. If there is no such square S, despite there being empty squares in BL, call algorithm RSB (below) to see if we can rearrange previous boxes in free space to find an S that works. If still no S works, skip to the next B, adding B to list SBL. Otherwise compute a pull route to S and add it to V.

Step 2

When we find a pull route to add to V, and change Q by moving the box B along this pull route to free space, we must check whether any new boxes are now pullable (in addition to those in PB) as a result of this move. If so, we add this list of new pullable boxes onto the front of PB (heuristically considering it more important to pull out boxes that were previously inaccessible). We check whether the number of corrals in Q has decreased compared to P. If so, we must first check if there is only one corral and all boxes not moved yet are not (push-)frozen on goals. If so, we terminate the algorithm (returning True for success, V and SBL.). Otherwise, we skip to the next B in Step 1.

Step 3

Eventually we reach the end of the (dynamically varying) PB. So every box has either been moved to a free square or, with no such move being found, had its square added to SBL. Since we didn't already terminate, we must have failed (either there is more than one corral or some boxes are (push-)frozen on goals). So we terminate the algorithm and return False for failure, along with V and SBL to see how far we got. In future, this output could be used to feed other algorithms which try to learn from the partial success achieved (measured in V) and get round the problems represented by SBL. For example, in the two-corral puzzle mentioned above, if free space in the initial corral is limited, it might be necessary to leave unmoved several boxes which could be pulled, and prioritise pulling out previously inaccessible boxes in some special order, so as to fill up the available free space in the first corral at just the moment when access to the second corral is granted with the last box pulled out. A tree search algorithm, scoring previously inaccessible boxes higher, might work - details later.

Note: more generally, we could signal different degrees of success and failure. For example, if we found a position with a single corral, but some of the boxes BL are still push-frozen on goals, (yet some progress has been made in pulling out boxes and/or reducing corrals) we could set this as the aim of the forward search, with the proviso that we may finally stack any box on the goals in BL at any point, but no other boxes are allowed to get frozen on goals not in BL. This generalises the notion of deadlock to take into account stacking order restrictions. The forward search then aims for a position with one corral the only frozen goals in which belong to BL. There would then be an intermediate stage matching up the forward and reverse search positions, either by stacking more boxes onto goals in BL or moving a box from a square in the forward search to one of the squares having boxes in the reverse search. See later for algorithmic details.

RSB (Reverse shuffle boxes) algorithm

We are given a position P, with given man corral, a free space arrangement BL, some of whose squares FL are already occupied by boxes, and a box B which currently can't be pulled to a square in BL despite there being empty squares in BL. Our aim is to rearrange the boxes in FL to squares in BL so that B can be pulled onto an empty square in BL. If so we return True for success and the variation rearranging the boxes and moving B.

Step 1

Iterate over squares S in FL (currently covered by boxes in P). Copy P to Q, and delete the box on S. Compute the pull access for B within the given man corral in Q. If it now contains an empty square in BL (including the newly-empty one S), make a pull route to the first accessible square in BL, call it V, and go to step 3.

Step 2

Skip to the next S. If we run through all S in FL, terminate returning False for failure and a null variation.

Step 3

Make a null list of pull routes W. Iterate over empty squares E in BL other than the destination S2 of V. Let R be a copy of P. Delete from R all the boxes in FL except for S2, where we add a box. If there is a pull route PR from S2 to E in R, make a list L of squares in FL which intersect with PR (here meaning just the squares the box is pulled over, ignoring the man moves), in order of increasing size of the variation to that square in PR, and go to step 4. Otherwise go to step 2.

Step 4

Let P2 be a copy of P. Iterate through squares SB in L in reverse order; for the first one, try to find a pull route to E in P2. If not, go to step 2. If successful, add this pull route to the front of W, and move the box in P2 from the start to the end of this pull route. Now replace E by SB and skip to the next SB in L in reverse order. If we were successful at every SB in L (otherwise must have returned to step 2) our W contains a sequence of pull routes shuffling the boxes in L to free space. Add the pull route V to the front of W. Notice that the position of P2 and P match in terms of V applied to P giving the (last) position P2. Terminate returning True for success and W.

Note: RPR is intended as a quick-and-dirty algorithm which will terminate after a few seconds with modest memory overhead for a medium (and even quite large) puzzle. It finds the first free space arrangement it can (using a distance heuristic, doesn't look for alternatives), moves boxes only by single-box pull routes direct from goal to free space (no intermediate stages and, except for shuffling, the boxes on free space don't move again) and pulls out boxes in an arbitrary and fixed order (doesn't look for alternative orders). It does some shuffling when no direct pull route is found, moving around the free space boxes it has already pulled out to make space for another box, but this shuffling is also limited to single-box pull routes between free space squares and an arbitrary fixed order of boxes and squares found from which ones intersect a (phantom) pull route with free space boxes deleted. It does try potentially several phantom pull routes, but sequentially rather than in a tree, and keeps only the first one for which shuffling was successful. RPR calls RFS and RSB. RFS also calls IADOL.

Such an algorithm will have extremely variable results. It is intended as the first stage of a reverse search, to identify "easy progress" towards the ideal reverse spreadout position with a single corral and no push-frozen boxes on goals. I envisage it being followed up by more lengthy searches to address its shortcomings. For example, a tree search which allows the SBL boxes to move by individual pulls one square at a time, (or not at all except via pull routes direct to free squares if this can reduce len(SBL); a subsequent search could allow boxes in SBL to move by individual pulls once we've minimized len(SBL)) other boxes to move only by single-box pull routes direct to the free space identified by RPR, (only to the first accessible square in the ordered list) and then not to move again with the exception of RPR shuffling. Such a tree search could terminate after a fixed overhead of time and memory. The idea is to make "easy progress" in between more exhaustive searches at the sticking points.

More precisely (sketch of algorithm RPRT, RPR with tree search): (a) RPR produces variations moving goal boxes to free space, possibly via shuffling, returning a list SBL of unmoved boxes. (b) tree search using RPR pull routes direct to free space or via shuffling scores a positive constant times the sum over boxes on free space multiplied by pull layer depth of the box moved there (discounting intermediate shuffling variations - we'd need to keep track of where each box came from by numbering boxes within layers) (pull layer depth is the # of layers of boxes initially pullable from the given man access which have to be iteratively deleted to give access to successive layers of pullable boxes.) SBL boxes which are push-frozen score minus pull layer depth times a positive constant. External corrals score a large negative constant. This may decrease the # of push-frozen boxes in SBL and the number of external corrals. (c) tree search as in (b) but remaining push-frozen SBL boxes (only) get single pulls. This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. (d) tree search as in (c) but all remaining SBL boxes (only) get single pulls (in case the non-frozen SBL boxes need to be manoeuvred out of the way with single pulls to permit more frozen SBL boxes to reach free space). This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. Note that the rationale behind this sequence is to limit exhaustive search (single pulls) by using the smallest # of push-frozen boxes in SBL obtained in the previous steps.

Note that if RPR combined with the tree search(es) is successful, it provides a stacking ordering (actually, with variations and shuffling, even more information than this). This ordering information can then be used to generalise deadlock detection in the forward search (see the previous section) and prune stacking moves that don't correspond to the given order. Even when RPR-plus-trees fails, it may provide a partial stacking order: the goals in SBL can be stacked in any order first, and then the remaining goals in the determined fixed order.

Maze free space

The RFS algorithm generates free space within existing corrals. When applied to a standard Sokoban puzzle such as those in the XSokoban 90 collection, in a solved position the boxes are usually packed tightly together in goal rooms with lots of empty space outside, so even when there are several corrals the utility of the "corral" free space arrangements seems obvious. But if we were to apply the RFS algorithm to the forward search, the start position of the puzzle may be such that there are only a handful of free space positions within the man corral and all adjacent corrals. Thus it might seem to be of limited use. One generalisation which applies to both directions of search is to supplement the corral free space arrangements by deleting all the boxes and then extending the free space arrangements within the new space that opens up. By choosing a square within each sink in the reversible region digraph, we can preserve the property that the free space arrangement is computed from merged pull accesses of each of these sink squares, and thus preserve the depth heuristic in the maze free space arrangement. For the forward search we choose source squares and get the same result.

However, having a maze free space arrangement complicates the situation as now, when we reinstate the boxes, some of the boundary boxes, for example, may already be on free space squares but without having the single corral property of occupied free space arrangements. Also, some boxes may now be adjacent to free space squares, so that if a box is pulled to one of these free spaces it may create a new external corral.

Forward search

Analogous algorithms to RPR and tree searches can be applied to the forward search. Take the pushable boxes with an accessible push square in SP to be the push analogue of PB. The initial aim of forward search is to open up the position to a single corral. The tree search score heuristic could value initially inaccessible boxes higher, as with the reverse search. A large fixed score penalty for each external corral would favour the aim. The way corrals are divided up into free space may be a little different since we would have the free space in a corral from the corral's pushable boxes accessible within the corral, but we'd also have free space from pushing in boundary boxes from a neighbouring corral (the latter combined for all neighbours is the usual reverse free space for the corral).

Relocation subtasks - reversible region digraphs

Assuming the reverse search succeeds in producing a single corral and a complete stacking order, and the forward search succeeds in producing a single corral, we next apply the dynamic reversible digraphs for each stacking order position which results in frozen (or possibly corral deadlock frozen) boxes on goals. We use the reverse variations to identify how many boxes are in each dynamic reversible region at each stage. We then want to match up the forward and reverse variations as far as these region numbers go. We can use free space, RPR and tree search, with fixed score penalties within each region with box number differing from the aim. We could try both forward and reverse relocation in stages matching up the most progress in either direction.

Relocation subtasks - matching forward and reverse positions

Assuming the dynamic reversible relocation is successful at the stage comparing reverse and forward single corral positions before (final) stacking takes place, the next stage is to exactly match the forward and reverse positions. This could be done with the same relocation algorithms, with box positions for the other search ("box aims") added to free space, and score bonuses for positions with boxes on box aims. Note: the rationale behind separating relocation into two stages is that it is easier to match any arrangement of a # of boxes within a region than to match exact positions, but once the regions have the right #s of boxes it may be possible to rearrange them mainly within regions for an exact match - again we're trying to limit the amount of exhaustive search (single pulls). It is also useful to see how far we got at each stage in case of only partial success - further heuristics and algorithms could be tried at each stage.

More on Relocation subtasks

I envisage relocation being carried out with no external corrals for ease of access. However, the possible existence of circuits could make relocation even easier. I don't have a precise alternative algorithm currently making use of circuits. Relocation can still be carried out with external corrals, but more access restrictions make it less likely to succeed.

One-way doors (definition)

A "static one-way door" (s1-door) in the maze M of a puzzle P consists of empty squares C, O, E such that:

  • (a) O and E are man-adjacent to C at right angles
  • (b) with a box added to M at O there is a push to C
  • (c) with a box added to M at C there is a push to O (reversibility)

Note that (b) and (c) require further empty squares either side of O (OP) and C (CP) with all five of these squares distinct.

  • (d) with a box added to M at C and at E, and the man at CP, the man access includes O

This means, after a push from O to C as in (b) the man is able to push the box back to O as in (c) ("going through the door") without using the exit E or pushing other boxes.

  • (e) neither C nor O are dead squares in P

This means we can have a box in the door without deadlock.

  • (f) with a box added to M at E and at O, and the man at C, the man access does not include OP.

This means that after going through the door, without using the exit E or pushing other boxes the man can't reach OP ("one-way condition"). In practice the E square will be empty but there could be further s1-doors blocking man access via the exit E. If their direction of travel is consistent the man can go through the doors in succession and possibly return to OP.

  • (g) with a box added to M at C and at O, and the man at OP, the man access includes E

This keeps open the possibility of the man returning to OP after going through the door ("cyclicity").

The case of a two by three rectangular room with perpendicular exits, suitably arranged, gives rise to a s1-door with E the wide side exit and OP the other exit. C stands for closed, O for open and E for exit. Perhaps there should be a further condition to indicate that trying to traverse the s1-door the wrong way gives deadlock. The problem in general is such a deadlock might not be detected - though it might in the case of 2x3 rectangular rooms.

Here is an example of s1-doors. The four corner rooms house occupied s1-doors with boxes on the O-squares (the doors are "open"). The C (closed) and E (exit) squares are marked.

example of one-way doors
#####
#   #
# C$ ##########
##E# #  @ #   #
## # EC$# # $ #
#  # $$    $C #
# $###.#####E##
#  ## ....# #
## ##..... # ###
# $######e##$  #
# CE $  $C EC# #
#  #   #   #   #
################

One-way system (definition)

A "simple static one-way system" (s1-way) W is an ordered list of at least one s1-door(s) such that:

  • (a) if s1-door D'=(C',O',E',CP',OP') follows immediately after s1-door D=(C,O,E,CP,OP) in W, with boxes added to M at O and O' and the man at C, the man access includes OP' but not OP or E'
  • (b) if D is the first s1-door in W and D' the last then with boxes added to M at each of the s1-door O (open) squares and the man at OP, the man access includes C'. ("cyclicity")

Note that with boxes as in (b) there are len(W) - 1 external corrals inside W and the man can traverse W by pushing each box from O to C (open to closed) and back again in turn, so going through all the doors, leaving W in the same state. Condition (b) means the man can return to his starting square. It is possible an s1-way forms a loop, in which case an extra condition (a) applies where we consider the first s1 door to follow the last, but in most cases there will be a well-defined start and end.

In the earlier example, the two s1-doors at the top right corner and bottom right corner form an s1-way.

Provided the traversing variation is left available, one can for some purposes treat the boxes in W as if they weren't there, leaving a single corral, as regards man access in a puzzle including W with boxes as in (b). See "one-way access", below for a precise description.

Note: a s1-way might have some of the O-boxes away from the O-squares (on the C squares for instance). To make things simple, so that W doesn't change state when traversed, we'd first traverse W pushing the O-boxes onto O-squares if this is easy - if not W is not correctly occupied for traversal, and if not deadlocked we'd first have to find variations to correctly occupy W. If W is completely empty, variations to fill it would normally be easy, pushing the last O-box in first, and then using the W traversal squares to push in the other O-boxes in reverse order, only provided the first OP square is accessible. But this requires that W is push-traversable in entirety, which may not be true in general. The definition only requires man-traversability and reversible pushes between the adjacent O and C squares. And W might be partially, excessively or incorrectly occupied before we get a chance to use it, and even in the static case with 2x3 rooms the variations to correctly occupy W may not be obvious, due to the possibility of creating a deadlock due to W-external boxes if trying to push a box out of W. So in practice use of one-way systems in a search will be restricted to using the O-squares as targets for boxes and navigating (with macros to do the two pushes to go through the door) when all the boxes are on O-squares.

Note: Despite the s1-door definition being designed to make cyclicity possible, it is possible that an arrangement of boxes on O-squares of a collection of s1-doors does not lead to a "zone cyclic" position where the man can return to his starting square after going through s1-doors. It should at least be possible for the man to reach any zone by going though s1-doors, but if some s1-doors follow others in an inconsistent direction of travel the man could end up stranded.

Note: There may be more general one-way systems or traversable corrals utilising more than a single reversible push of one box to move between corrals. There are, for instance, "valves" which function something like s1-doors but require two boxes to move. There can also be traversable corrals which have a different state (due to boxes not moving back to their original positions) after traversal. There can be intertwined (overlapping) one-way systems. None of this complexity is being addressed here. A computer solution of medium to large puzzles featuring even standard 2x3 room 1-ways significantly, many of which will have too large a tree for conventional solvers, I consider would be a significant advance.

Double-box s1-door

If a position P has two boxes occupying an s1-door (such as one on O or C and one on E) it can't be used for one way traversal until one box (like the one on E) is exited. If the s1-door is push-traversable even with two boxes, by push-reversing one box and pushing out the other to free space, simple variations should normally achieve this converting it to an occupied s1-door. If the s1-door is immediately followed by a vacant (in P) s1-door so the two form an s1-way, using the second O-square as free space makes the s1-way occupied and traversable, thus increasing access (see one-way access, below) without increasing free space occupancy, so it would perhaps be worth doing this in RPR(T).

One-way access (definition and algorithm)

If WL is a list of s1-doors (belonging to one or more s1-ways) with boxes on the O squares, make a dictionary D with keys OP squares and values EP squares. When calculating man access to push a box outside the s1-doors around a push route, when a square in the D keys (which can be more efficiently checked with another dictionary) is accessed, the value is used to add the EP square to man access - this would be accessible by traversing W. The extended access is "one-way access" (1-access) relative to WL. The access can be calculated twice, first without the WL "teleports" to avoid unnecessary use of WL. The relevant variations to traverse the s1-doors are also stored ready to insert into push routes when needed. Note that it's not important which s1-way we use to achieve access (use an optimiser to find the minimal choice).

Note: s1-doors and s1-ways obviously apply to the reverse mode also, with the order in W reversed and pulls replaced by pushes. Note: normally we'd only use the beginning OP to "teleport" the man to the end OP, but in some more general s1-ways than those with 3x2 rooms connected directly by straight tunnels there might be other boxes than the O ones within W so we might want to move them. It is also possible the definition could be extended to dynamic 1-ways (where some of the access obstacles are boxes rather than walls, though it could be problematic to work out the level of occupancy by boxes of the 1-way) where other boxes than the O ones within W being moved might change access restrictions in W and it might be possible to move such boxes between different s1-door corrals. Note: it might seem redundant to use one-way access from a position where W is not "end-blocked" (the first and last OP squares are both accessible). However, consider pushing a box outside W around a push route; the box might end-block W en route, and access to change push direction might then require traversing W. This could even occur with W only containing a single s1-door, and thus not contributing external corrals. In this special case using W actually increases access for push (or pull) routes in the single corral situation envisaged for RPR(T).

Assuming that trying to push out O-boxes from W in a puzzle P directly produces deadlock, a human solver can use the 1-way access to rearrange other boxes (temporarily end-blocking W and traversing W to regain it) until clearing W becomes possible. Thus W functions as an adjunct to free space in terms of computer solving, just as arrangements with doors or multiple non-deadlocked corrals may do. I think W traversals, along with rearranging other boxes and temporarily blocking access to the end of W generalises shuffling, though I haven't got a precise algorithm yet. More work is needed on the details of s1-ways before they can be added to SOLVE. At present they could only be used in the case of single s1-doors or push-traversable s1-ways which are easy to occupy or vacate as free space O-squares in RPR(T). Even then it isn't clear how to modify free space from RFS to take account of detected s1-ways; adding the detected O-squares to free space before applying RFS is one option, if we keep the true free space (within a single corral) separate. We'd have to look at the free space arrangements with and without occupied O-squares to be able to still apply the single-corral condition. Variations to occupy and vacate s1-ways where possible with methods similar to RPR(T) are needed. A restriction to reversibly push-traversable s1-ways would be a start so that only W-exiting deadlocks are a problem: which can be solved with a free space external to W and reversibly push-routable to an end of W. This can generalize to push-traversable s1-ways, with a free space outside W pushable to the first O-square and one pushable from the last O-square.

Dynamic 1-doors, 1-ways and 1-access

We can generalize s1-doors to the dynamic case (d1-doors: arbitrary positions and some access restrictions due to boxes other than O-boxes) assuming boxes on O-squares ("open-occupied") by checking that the squares C, CP, and OP are empty before applying a version of the s1-door conditions (a)-(f) to the resulting position rather than the maze:

  • (a) O and E are man-adjacent to C at right angles
  • (b) the box at O can be pushed to C
  • (c) after the push in (b), the box now at C can be pushed to O (reversibility)

Note that (b) and (c) require further empty squares either side of O (OP) and C (CP) with all five of these squares distinct.

  • (d) after the push in (b), with the man at O, the man access includes CP
  • (e) neither C nor O are dead squares in P
  • (f) with the man at OP, the man access does not include C.

These conditions are different than those for s1-doors because we don't know which boxes are acting as obstructions creating s1-door access restrictions. We can't be sure the d1-door is part of a dynamic one-way system. All we can do is see where macros going through the doors takes us after identifying d1-doors and adding them to the s1-doors. Also note there is no dynamic analogue of the s1-door condition (g) - cyclicity is not addressed at all. The d1-door condition (f) is more restrictive than the s1-door condition (f) since we have to enforce the one-way condition and there is no point in moving the O-box if there is already access to C and so the exit E.

Note further that d1-ways have the same definition as s1-ways with s1-doors replaced by d1-doors. Detection of open-occupied d1-doors (potentially with four complex checks for each box in the puzzle) could be a significant overhead we don't want at nodes in the RPR tree searches, so we might want to restrict it to the start or solved positions, and in computing free space if using it at all. Also, boxes determining dynamic access restrictions required for a d1-way could get moved during SOLVE without the program "knowing" - though that just makes things easier so perhaps it can be safely ignored.

In the earlier example there are two d1-doors, one in the room above the goal room top exit, and one in the room below the goal room bottom exit. They're marked as before, the box is on the O-square again. Note the lower case "e" is used to indicate an E-square containing a goal. Also note that there is a d1-way using the two s1-doors at top left and bottom left and the lower d1-door. This position is "zone cyclic" - by going through one-way doors the man can reach every zone and also return from any zone to his starting position.

More on s1-ways and free space

How can we detect and use s1-ways as an adjunct to free space?

First consider how to identify all s1-ways in the maze M. Loop through A-empty squares S in M not belonging to the Reverse exclusion zone (assuming we're aiming to use s1-ways in reverse mode from the solved position for clarity). Make the four possible checks for orientations of s1-door with C square at S and O square an adjacent pull square to S. That finds all s1-doors, from the definition. Add boxes at the O and E squares of all s1-doors and compute all man access corrals, and determine if there is a free space square in each corral. Applying condition (c) of s1-way definition we can determine if each man access corral connects adjacent s1-doors in an s1-way or not, and the order they come in. Putting together this info we find the s1-ways in M. We also can store the access squares trapped in corrals between succeeding s1-doors in the s1-ways.

Now suppose we have a solved position FP with given man corral and compute all man access corrals in P. In calculating free space within each corral, we normally look at A-empty squares S within merged pull access where addition of a box at S (making position P) isn't a detectable deadlock (ignoring the fact that there is one more box than goals) and, when computing all man access corrals within P, the total # of corrals in FP and P is the same.

To add s1-ways, apply the same conditions. So O-squares only get added if they are individually free squares in the usual sense. The change comes when we look for arrangements of many free squares in RFS. So we start with a position P1 with several free space squares occupied by boxes and are looking to add another. Firstly, the s1-door O squares are put at the front of the queue for new squares, so several of the free space boxes in P1 may be O-squares. We compute all man access corrals in P1 (MA(P1)). But we then compute 1-way access WA in the given man corral, using the s1-door "teleports" belonging to free space where they are applicable (meaning, where we can reversibly pull the box on the O square to the C square within man access starting at C). First we require no more 1-way man access corrals (relative to the given corral) than there were in FP. Second, from MA(P1) we move the man to each external corral EC such that the A-empty squares in EC belong to WA in turn. We then calculate 1-way access from there; the next condition is that this includes an A-empty square from the man corral (so we found our way back to the start).

Note that boxes on s1-door O-squares on the boundary between the given corral and an external corral may have to be treated differently. It's not enough to have one such to merge at the level of one-way access; a trip around s1-ways is needed to lead back to the given corral. Such a trip could involve s1-doors within and on the boundary of multiple corrals. So perhaps we should first calculate 1-way access, using all the detected s1-doors wherever they are, from each external corral adjacent to the given corral, seeing which lead back to the given corral to determine which corrals we can "1-way merge". Hopefully RPR(T) will result in boundary boxes being pulled to O-squares in free space (perhaps from adjacent corrals accessed by an sequence of s1-doors), setting up such mergers. So RPR(T) would need to check 1-way access from the end of its pull routes, in case these end in an external corral due to s1 "teleports", and in case the pull route destination is an s1-door O-square.

Future development and problems

The biggest weakness with RPR is probably the arrangements of free squares found by RFS. Better arrangements, packing more boxes into free space, may well exist than are found, and for more difficult puzzles they may be essential to the solution. I have no workable generally-applicable alternatives at present. The next problem is that RPR relies on free squares in the first place; more difficult puzzles may require navigable positions with multiple corrals. A step towards dealing with this extra complexity is to identify and use static one-way systems - those inherent in the arrangement of rooms in the maze, as discussed earlier.

This brings me to another problem with the RFS free arrangements: in the last stage, where circuits are removed, there is an awful lot of choice of ways to do this (much of which might be relatively unimportant in terms of the number of boxes packed into free space). However, some of the choices (due to convenient ordering) are likely to facilitate shuffling better than others, since the free space squares which cut down the circuits are introducing steadily more access restrictions. Again I have no workable alternatives at present, other than the obvious: restrict the number of candidates for free squares to one within straight tunnel sections.

Another way to introduce extra corrals is to generalize the notion of free squares to include "doors" for each extra corral. When checking a square for the free square list, usually we skip it if it increases the number of corrals. But we could check if it creates just one more corral and the square the box was pulled from initially still belongs to the man access, and we have access to the opposite pull square from the last one used in the pull route. We could allow only one such "door" per extra corral, or allow an unlimited number (perhaps setting up significant problems in later matching the reverse and forward search). The idea is that we block off an area by closing the "door", but it is possible to open the door by pulling in the opposite direction later to make the arrangement accessible again. Perhaps this idea should be combined with reversibility : we require the door box to be pullable back to the goal square it started from (at the time of setting up the door, and regardless of which pull square we use to return it beyond it being an accessible one).

A third way is to introduce some exhaustive searching for small rooms where we calculate all the different arrangements. We'd have to check the arrangements were achievable somehow, perhaps by successively peeling off "doors" until we had a single corral, and then put the "best" achievable arrangement (most boxes) into RPR.

A simple implementation of SOLVE

  • generate a maze free space arrangement, FA, for the forward search, incorporating static and (some) dynamic one-way door open-squares.
  • perform a tree search forward from the start position as deep as resources permit as follows:
    • expand single pushes of each box which isn't on a free space square from FA
    • expand macros where each box which isn't on a free space square from FA is sent on a single-box push-route to a single one of the empty free space squares of FA it can reach, chosen using a distance heuristic as furthest away (with a random selection as tie-breaker).
    • if the box has no push-route macro as above, look for a shuffling macro, as in RPR, with the initial target obtained after deleting the occupied FA boxes chosen using a distance heuristic as closest (to minimise the number of boxes to shuffle).
    • expand macros where each box which is on an open-square O of a one-way door in FA with the man having access to OP but not C has the push from O to C followed by the push from C back to A, with the man now in a different zone, forming the macro provided OP, CP and C are empty. Note that the box on O is returned to its original square by this macro, which is intended to promote use of one-way systems in FA. We don't insist that E is empty as one could go through the door to push a box on E out.
    • score nodes with a bonus for each occupied free space square.
    • give a score penalty for the number of zones above one in the position obtained by deleting boxes on FP open-squares of one-way doors. This heuristic approximates how far from a zone-cyclic position the node is. The current highest scoring node can be assessed more accurately for zone-cyclicity using one-way access.
    • retain a highest-scoring node, FN, chosen at random.
  • generate a maze free space arrangement, RA, for the reverse search (possibly one for each zone of the solved position).
  • perform a tree search in reverse from each solved position with a choice of man corral, which is the reverse analogue of the forward tree search above using RA for free space.
  • retain a highest-scoring node, RN, chosen at random from all these reverse searches.
  • perform a forward tree search (forward relocation) from FN as deep as resources permit as follows:
    • expand single pushes of each box whose square in RN is empty.
    • expand macros where each box whose square in RN is empty is sent on a single-box push-route to a single one of the empty squares whose square in RN is occupied, or is in RA, chosen using a distance heuristic as furthest away (with a random selection as tie-breaker).
    • expand one-way door macros from RA open-squares as indicated above.
  • if the search reaches RN, put together the variations (converting the reverse variations to forward ones) into a solution.
  • perform a reverse relocation search from RN to FN, using FA, the analogue to the forward relocation above.
  • if the search reaches FN, put together the variations (converting the reverse variations to forward ones) into a solution.

Complexity, tree size and branching factor

The above implementation is aimed at a limited search which explores deeper (using macros) than an exhaustive single-push tree search (e.g. depth-first search or DFS) whilst having no worse a branching factor than DFS and exploiting free space arrangements, not only as macro targets, but for their approach to zone-cylicity - which enables a relocation stage with simultaneous access to all empty squares, and hence all boxes. Each of the several searches is permitted to expend more time per node than a DFS, in line with the greater expense of macro single-box push-routes than single pushes, but is intended not to expend significantly more time on nodes than this implies, or to allow a situation where the search slows down, taking increasingly more time for deeper nodes. In short, it aims to be competitive computationally with DFS.

Here are some basic branching factor arguments to help justify this claim. First, DFS in Sokoban has a worst-case branching factor of 4N where N is the # of boxes, since each box can push in at most four directions. Thus macros need to be competitive here, which requires only one macro target per box - note that the final push of the macro can be in at most sixteen different directions, with a 16N worst-case branching factor (in the event that the target push-squares belong to different zones. It's hard to believe much of this worst-case 16 beyond 4 ever, or usually, gets realised or needs computing). Not moving boxes from targets (except during shuffling and one-way door macros) further reduces the branching of the macros and single pushes as the targets get occupied, which hopefully makes it competitive with DFS (rather than a 20N branching factor which would make the tree too large, if we dropped shuffling and let boxes on free space targets move) - also, the single-zone property of free space arrangements (discounting the one-way systems) tends to make the branching come down to N for macros (even faster than the average case brings it down to N) as all the push-squares of targets tend to belong to a single zone, though admittedly the average branching factor of macros also approaches the maximal one for the remaining boxes not on targets as the position approaches zone-cyclicity. The fact that shuffling macros are only invoked when direct push-route macros fail, thus are in general limited to more densely-packed arrangements, helps to offset the greater expense of shuffling over push-route macros, though of course this is not guaranteed. The furthest distance heuristic for push-route macros aims to help with packing of course, and the shortest distance one for shuffling aims to minimise the # of boxes in shuffling and so minimise the extra expense of shuffling over routes. In more densely-packed situations shuffling will tend to fail, thus mitigating the extra time expense by not leading to an expanded node.

A partly iterative implementation of SOLVE

To deal with puzzles containing one-time passages (for example), generalise zone-cyclic to "zone-cyclic up to stacked corrals". This means, the external corrals that are not part of the zone-cyclic region of the puzzle containing the man must be stacked: the boundary boxes of these corrals, plus any boxes inside the corral boundary, must all be on goals, and there can be no empty goals trapped inside the corral. In addition, we require the position obtained by converting these stacked boxes outside the zone-cyclic region to walls (treating them as push-frozen even if they aren't) not to be a flow deadlock (which has implications for boxes near but not on the boundary). This second condition allows for one-time passages: they get stacked one time and then are effectively frozen.

Obviously, a solved position is (trivially) zone-cyclic up to stacked corrals, just because every box is on a goal, and it can't be a deadlock as it is solved, however many or few stacked boxes are converted to walls. Consider a score function for the reverse free space search: score the # of goals, minus the # of push-frozen boxes on goals, plus the # of boxes on reverse free space targets which aren't push-frozen. For a classic puzzle like Original 29, all the boxes are push-frozen in a solved position, so the position would score zero. Now perform the reverse free space search, using free space macros, one-way macros, shuffling macros as above, as a DFS using this score function (limited by the maximal score when all boxes are on free space targets, assuming the obvious convention that free space arrangements are forbidden to contain push or pull-frozen boxes).

Whilst searching, check for any node being zone-cyclic up to stacked corrals (using the short-cut of deleting boxes on 1-door open squares to estimate the zone-cyclic region).

Skip nodes with a lower or equal score than the current highest (initially the solved position score), otherwise use 1-access to verify the actual size and cyclicity of the zone-cyclic region. If verified, store that node (the position and the variation to reach it) and set its score as the current highest. Upon finishing the search, if the current highest score exceeds the previous highest (in this case, the solved position score) throw away the search tree and iterate with a new reverse free space search, starting with the highest scoring node. Continue iterating until we reach the maximal score or the current highest score does not exceed the previous highest.

As an aside, one might wish to modify the score for the zone-cyclic up to stacked corrals nodes to be more helpful for complex puzzles by including in the freeze-stacking count the stacked external corrals outside the man's zone-cyclic region, even if not technically frozen, so the search will try to minimise these stacked external corrals: there is no guarantee the forward search can actually stack them without a subtle deadlock so this way the search would bias to positions closer to zone-cyclicity, as being more "safe".

For example with Original 29 (where we can work with zone-cyclicity rather than the stacked corrals generalisation) the reverse free space search, in theory, can reach a maximal score (up to a certain limit, see later), but from human solutions the depth in macro terms is estimated at 30 plus, so it might not be found in the first search. In my opinion, the exact reverse free space arrangement is not critical for this puzzle since there is just enough space for easy relocation and no significant flow restrictions beyond those of freeze-stacking dividing the puzzle into two main flow regions (not counting a smaller region along the lower wall of the goal room) after three boxes are freeze-stacked by the forward search in order to reach a zone-cyclic position (at 41 pushes, the depth in forward macro terms estimated at 15, provided we include macros from a box to a 1-door closed square when the open square can't be reached), so I would expect the iterative reverse free space search to reach the maximum score with three freeze-stacked boxes in two or three iterations. This demonstrates how, despite the huge conventional DFS search tree, an iterative approach could potentially solve this puzzle. Note that the relocation search can be made similarly iterative (providing it links two zone-cyclic up to stacked corrals positions) with an analogous (but more complex, two-tier) scoring system. The key is that from one zone-cyclic position one can hope to do some relocation, filling targets, and reach another zone-cyclic position with a higher score.

My contention is that the reason that this puzzle currently evades computer solvers is that it is in some ways "too easy": there are many solutions. For example, using macros and shuffling there are many ways for a reverse search to occupy the outer parts of the puzzle, setting up a specific zone-cyclic position (not to mention possible solutions without many zone-cyclic positions such as an optimiser might find) and also, due to the two large rooms, many different zone-cyclic positions. Throw in copious subtle deadlocks and the non-zone-cyclic positions and the conventional BFS tree is huge (made worse by pointless traversals of fragments of the one-way systems). The iterative search cuts this down by imposing a specific zone-cyclic position (technically the maze reverse free space arrangement may have more targets than goals, but we can still specialise to a goal-matching reverse free space arrangement using a two-tier system of free space macros, like we do for forward one-way macros, without increasing branch factor at the cost of spending more time on computing macros). This limits the amount of search space given up to alternatives as the macros zero in on the specific zone-cyclic position: also each successive iteration throws away irrelevant parts of the search to focus on the current highest scoring node. A similar argument applies to the relocation iterative search. Having one-way macros (with the condition about not moving boxes from free space targets) helps navigate the one-way systems more efficiently.

A reasonable objection is that many of the XSokoban 90 puzzles are larger, with more goals and larger rooms to rearrange boxes, and many of these are easily solved by current computer solvers. Shouldn't these have a larger tree than Original 29? The answer is: yes, they have larger BFS trees, but solvers use DFS with a lower bound. Such of these large puzzles that are easily computer solved are so because the DFS variations to open up the puzzle enough to do significant stacking without deadlock aren't very deep, and the packing order problem is easily computer solved to guide stacking to avoid deadlock. Stacking usually significantly reduces the lower bound, so DFS takes an early chance to go deep, and by stacking many boxes close together opens up a lot of space which lets external corrals be easily opened and the remaining boxes stacked. This is where Original 29 is "hard": though a little freeze-stacking is possible immediately and the packing problem isn't hard, further stacking is prohibited by deadlocks, some of which may be undetected as they involve the two large rooms, which can't be easily circumvented. The puzzle (apparently) has to be opened up to be zone-cyclic to circumvent deadlocks due to too much stacking making some external corrals deadlocked. As above, a variation I used to do this is 41 pushes deep. This variation substantially increases the lower bound in places (a complex shuffling manoeuvre, making essential use of the one-way system on the right, seems unavoidable), so the combination of depth and high bound makes it difficult for conventional DFS: high bound makes the DFS stray too far "sideways" into the large BFS and high depth makes it stray too deep into the BFS. Even then, the difficulties aren't over as some relocation (again increasing the bound) is also required before stacking can resume without deadlock. These considerations only apply to one human solution I found, but if there was a substantial shortcut (not revealed by optimising) then computer solvers would probably have found it.

The catch for Original 29, and many other puzzles, is that these potential advantages of the iterative parts of the SOLVE method depend entirely on the forward free space search, which is NOT iterative, successfully finding a zone-cyclic (up to stacked corrals, in general) position in one search. Even at macro depth 15 this is quite a task - one I'd like to see an attempt at. I have some speculative ideas about iterative forward search (using estimates of the likelihood of external corrals being deadlocked) but no definite method as yet. It seems difficult to further weaken and generalise the zone-cyclic concept, to arbitrary external corrals whose shape and identity change dramatically as the man choreographs his way around and pushes through them in his complex dance, in meandering cycles.

More generally, in slightly more difficult puzzles it may not be good enough to choose an arbitrary free space arrangement as the access restrictions may be such that only a very special free space arrangement (or several, at different stages) will either make a zone-cyclic up to stacked corrals position achievable or the relocation phase achievable. This points at very subtle deadlocks (or restrictions), whereby with a particular arrangement of boxes (even if zone-cyclic) not all the rearrangements within the limits of flow deadlocks are possible (or easy) that one might imagine.

Note that shuffling macros can be extended in the presence of a one-way system, to allow for shuffling "through" (known) 1-doors. The point is that, after computing a push route with free space boxes deleted and detecting boxes on free space targets which intersect the route, the man might not have access to all these boxes because they are separated by 1-door open boxes. Providing none of the shuffling boxes are 1-door open boxes, we can use the heuristic of deleting the O-boxes to regain this access and compute the shuffling routes. Then we can compute 1-access and add in one-way macros to make the boxes accessible. For Original 29, at one point in my solution a box on the exit of an occupied s1-door has to be pushed out, which requires such a shuffle "through" the s1-door. Assuming this particular shuffle is detected actually would reduce the macro depth of the 41-push zone-cyclic position to just 9, but I felt there was no guarantee of it being found and considered it as separate one-box macros in estimating the macro depth. Also, of course shuffling through doors requires computing 1-access which can be expensive.

Note that some puzzles involve forward (or reverse) shuffling of the occupying boxes of a part of or an entire one-way system: the boxes move to (and from) the closed squares, travelling in the opposite direction to that of the one-way system. Such puzzles can have enormously lengthy solutions, as the # of shuffles can even exceed the # of 1-doors (I made such a puzzle recently but the theme is well-known). It would be amazing to iteratively solve one of these puzzles with a vast one-way system using generalised shuffling!

Goal-matching free space

It is obviously possible for a (maze) free space arrangement (FSA) to have fewer, more or the same # of targets as there are goals in the puzzle. Consider the case when there are more (target excess). An example puzzle is Original 50. There is so much space outside the goal area that even the simplest FSA will likely have many more targets than goals.

With targets chosen for decreasing accessibility, from isolated squares down to those blocking circuits, with target excess we can choose the most accessible targets to get a restricted FSA whose # of targets equals the # of goals (goal-matching). In a search using the free space target excess, we can prioritise macros using goal-matching targets, with macros to the remaining excess targets only applied if there is no successful goal-matching macro or shuffle.

Consider the following position from Original 50 occurring in a human solution:

      ############ 
     ##..$ $ #   # 
    ##...      $ # 
   ##....# # # $## 
   #....# # #  $ # 
####...#  # $ $# # 
#  ## #       $  # 
#      ###  # # ## 
#       $ # #   #  
###     # # # # #  
  #  $    # # #####
  #  # #####   $  #
  #  $ #   # @ #$ #
  #$ ###   ##$ $  #
  #  #      #    ##
  ####      ###### 

The position has a single zone, with many boxes relocated to the right. Stacking is now possible starting with the four boxes on the left, working our way from left to right of the goals. From all the empty space on the left it's easy to imagine goal-excess FSAs.

Consider the area in the lower right containing the man and four boxes, with only one external exit. It is possible we could have pushed in one more box to where the man is. We'd still have a single zone, but this position is deadlocked since we can now not push the five boxes out of this area. As you can see, the boxes are tightly packed on the right side of the puzzle, as is necessary to empty the left enough to do the one-sided stacking. So there is a problem with trying to get enough free space targets to solve the puzzle without creating a deadlock by packing too tightly.

Here choosing a more-accessible goal-matching FSA and using priority will help for the forward search as the goal-matching targets will then be spread out (clustered around initially isolated boxes all around the puzzle).

The more difficult problem here is the reverse FSA, packing enough targets on the right to pull out all the boxes starting with the man in the right-hand zone.

Note that, given a reverse FSA, we can check its effectiveness more quickly than by running a full reverse search by running a packing order search. Packing order searches run more quickly as they delete boxes after they are pulled to targets. Using the same system of prioritising targets, we can refine our goal-matching reverse FSA to that occurring in the packing search. This may help narrow the full reverse search to focus on a goal-matching reverse FSA which is more likely to occur.

I haven't addressed what to do if there are fewer FSA targets than goals, or considered goal push-frozen box arrangements within FSAs (such as are needed in Original 29), or how we might try to find closely packed FSAs such as are needed for Original 50. These are some of the new bottlenecks for this search method.

Packing maze free space

Let's consider a simple algorithm aimed at producing a closely-packed maze FSA (to avoid there being fewer targets than goals if possible).

Our first rule is that an FSA may contain neither pull-frozen nor push-frozen box patterns (FPs) (to avoid easily-avoided rigidity of the FSA which needs to be mobile for the relocation phase to work).

We build up the arrangement from the maze one box at a time. If 1-doors have been detected, we add their open squares one by one as candidates for the FSA. If adding a candidate creates an FP, we don't admit the candidate to the FSA and add it to a hash table of excluded squares - which also includes squares blocking one-way macros (the C, CP and OP squares for each added 1-door). Otherwise, if a candidate isn't excluded we add it to the FSA. Once the open squares are added, we count the # of zones Z in the FSA. We can then add the requirement that adding a candidate to the FSA doesn't give a position with more than Z zones (to retain zone-cyclicity where possible), or we consider it excluded.

The second rule, given a square already in the FSA, is to recursively add candidates which are two squares up, down, left or right of that square.

The third rule, when the second has been exhausted, is to add candidates which are one square up, down, left or right of FSA squares. We then apply the second rule recursively to each such square that gets successfully added to the FSA.

Our fourth rule, to apply when the previous two rules are exhausted, is to add candidate squares one square away diagonally from FSA squares. Once more we apply the second and third rules to successfully added squares.

Our fifth rule is simply to add remaining empty squares as candidates.

Note that we can re-order such an FSA according to accessibility by building it up from isolated squares, and then ones not decreasing the # of circuits, etc, as in the RFS algorithm, to find a goal-matching accessible FSA within a goal-excess FSA.

It's still possible that some goal-matching FSA within such a packed maze goal-excess FSA will be found by search and not be mobile enough for the relocation phase to work, so ideally the FSA should be subject to mobility tests (such as emptying small rooms or areas containing FSA squares).

Machine-learning

Obviously SOLVE is limited in effectiveness by the intelligence (or otherwise!) used in selecting macro targets and applying macros. Also, a side-effect of the furthest-distance heuristic is that boxes may be pushed all the way from a source to a sink in the flow digraph, which can make relocation problems hard to solve. In the distant future maybe one could envisage an iterative procedure whereby meta-analysis of the previous SOLVE searches (which you might consider to be learning or high-level planning) leads to successive search iterations dealing more effectively with flow and relocation bottlenecks. Information concerning maximal and minimal numbers of boxes within areas like flow regions during searches could be utilised in this regard. Meta-analysis could also help with those puzzles whose solutions don't conveniently split into the three stages in the order of the SOLVE search, so that the solving method would have some kind of meta-analysis analogue of the flow region digraph, whereby individual subtasks like opening up, relocating, or even closing up, are applied to different areas of the puzzle and there are order restrictions between subtasks and areas.

A possible way to improve free space arrangements is via "deformation" of corrals. Consider boxes on the boundary of the given man corral with a push square in a neighbouring external corral. If pushing such a box away from the man corral is possible without increasing the # of corrals (or a detectable deadlock) we can do this to open up more space in the man corral. We can do this for all suitable boxes on the boundary. Of course, this extra space comes at the cost of restricting free space in neighbouring corrals. Suppose we've applied RPRT in the forward direction and only have partial success in occupying free space and reducing external corrals. Wherever we had the most success in occupying free space, we try deforming outward (pushing) those corral(s) at the cost of neighbours with less occupied free space. I don't have a detailed algorithm but envisage adding deformation pushes as an intermediate between push routes to free squares and exhaustive search. Hopefully we can add to free space in the successful corral(s) and again hopefully that will increase the occupied free space.

Another problem with the the RPR(T) free space is it is calculated at one time for a single position (any solved position in reverse and SP in forward). As variations are applied occupying some of these a priori free space squares, the position will change dramatically with respect to the SBL boxes, some of which may have moved by a sequence of single pulls. This may open up new dynamic free space squares not calculated a priori. For example, in a classic puzzle with many goals packed tightly into a single goal area, pulling out a number of boxes might open previously occupied space in the goal area, which is unnoticed by RPR(T), but might be needed for a solution. We could simply tack on a new free space calculation at the end of RPR(T) and run it again at the end of the previous variation to utilise the goal area space. That may work in some cases but doesn't address the issue of dynamic free space squares opening up fleetingly at any stage of the reverse variation, which may need to be used in more complex puzzles. Trying to deal with this in general by adding all dynamic free space squares in with separate subtasks risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Note that the search is in the following fixed sequence (sketch of algorithm SOLVE): (a) open up to single corral with both forward and reverse variations. (b) apply reverse stacking order to find dynamic reversible region box # restrictions and relocate to match #s. (c) relocate to match forward and reverse variations into a solution. Obviously not all puzzles will divide up into this sequence neatly. Apart from one-way systems requiring external corrals, in practice the puzzle may require partial opening up, relocation, closing up, stacking and other variations in arbitrary sequence. This risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Another problem is unmatchable forward and reverse variations (which may be related to more subtle dynamic restrictions than I have described). For example, as a human solver and setter I've often found forward and reverse variations which match exactly, except either the man is in the wrong corral or just one box is displaced one square, but requires a pull rather than a push in forward mode to match. Sometimes a little more work to the end of either forward or reverse variation produces a solution but often a completely new approach is necessary, suggesting the forward and reverse variations may be "unmatchable". Perhaps some form of machine learning is required to deal with more difficult puzzles, or perhaps progress can be made with a relatively simple parallel search of multiple subtasks with some connecting heuristics.

Personal tools