User:Puzl bustr

From Sokoban Wiki

Revision as of 18:11, 25 March 2010 by Puzl bustr (Talk | contribs)
Jump to: navigation, search

I'm a Sokoban enthusiast - I like solving and creating puzzles. In the past I've helped program Sokoban games and a generator (backward solver), and maintained a website with my puzzles for download, but due to getting Repetitive Strain Injury (RSI) from typing am unable to do these things involving frequent typing. I've an adapted mouse and am able to write this with an onscreen keyboard. Am able to solve and recently started designing more puzzles.

I'm interested in computer Sokoban solving in terms of artificial intelligence. I believe that, using domain-specific knowledge and a combination of limited searches using heuristics (for example, identifying free space, empty squares which you could imagine adding a box to without creating an obvious deadlock or a new blocked-in area and variations to reach it via single-box push routes) and conventional tree search, it is possible to substantially improve on conventional computer solvers.

Since I cannot program my ideas, I'd like to make them available here for others to implement; I'd like to be credited with my name David Holland and for these ideas to be freely available (with any implementation open-source). I'll try to update these ideas as I can and include descriptions of algorithms (as I've already done for the main limited search identifying free space and computing single-box push routes and "shuffling") but as I have to rest my hands frequently would prefer "able-handed" programmers to do the coding. Several of the algorithms can also be used in conventional Sokoban solvers.

Section 3 contains the algorithms and the most precise descriptions. Sorry about the length of this page.


Section 1. Definitions and ideas

====================

The kind of solution


Here I am only concerned with solving a Sokoban puzzle via computer in any number of moves and pushes, not with optimising the solution (which can in any case be applied using an optimiser once a solution has been obtained). For simplicity, I neglect the number of "man" (or "pusher" for forward and "puller" for reverse) moves used, only considering the pushes of boxes, so I only care about the empty squares accessible to the man, rather than the particular squares he moves from in between pushing boxes. I also don't too much care about how many pushes are used in a solution: the greater concern is restricting the search tree by domain-specific knowledge so that a solution can be found at all in a reasonable amount of time and memory. This presents a departure from standard search techniques using an estimate of the number of pushes left to solve the puzzle. I prefer to break down the problem of solving a Sokoban puzzle into manageable "chunks" (subtasks): I would then only care about the total number of positions (pushes plus man access information) in the search tree for the individual chunk, plus of course the total number of chunks. Where possible it would be nice to be able to pre-assign the number of chunks to break the solution into before starting the search for individual chunks (which can be done to an extent with stacking ordering as we will see in the next three sections) so I can be aware of how many chunks I'm using; but in practice the chunks will have to be dynamically assigned as the search progresses. Thus, the search for a solution would fail if the tree for an individual chunk became too large or there were too many chunks (so the aggregate total of trees for all the chunks exceeded some limit) or too much time was used in the search. In a nutshell, I'm interested in restricted search, planning and heuristics.

Corral


A boxed-in area. There is always one corral or area containing the man, (the "man's corral", which may only be boxed in by walls) but there may be other corrals the man is outside (known as "external corrals").

Man access, push squares and push access


The man can move (without pushing boxes) to any empty square connected through adjacent squares horizontally or vertically (but not diagonally), which defines the "man access" within the man's corral. Conversely, if the man is just pushing a single box, there may be many similarly connected empty squares (with the direction of the push also included, so that the same square can occur more than once with the man pushing in different directions non-trivially if at some point in the box's journey it is on the boundary of an external corral). This collection of squares combined with push directions constitutes the "push access" for that box (the algorithm to compute it for a given push square (see below) of the box is well-known and I won't describe it here). But, it may well be that not all the man access squares even within the man's corral constitute positions the box can reach with the man pushing only that one box; walls or other boxes may get in the way, and prevent a push (either by physically blocking it or if the push leads to deadlock, see below) or require a change of direction, and due to external corrals the man may not have access when a change of direction is required. As a familiar example, if two rooms are joined by a bent tunnel one square wide, the man may travel along it but the box cannot get past one of the corners. As another example, in a square room with only one exit three squares wide at the exit, a box in the room on the wall near the exit can't be pushed inside the room by the man to the centre of the room; it first has to be pushed out of the room, and only if the man has access can it then be returned to the room where it can be pushed to the centre. Pushing the box inside the room to the exit creates a new external corral.

The four squares adjacent to a box horizontally or vertically are called the "push squares" of that box. If empty and accessible by the man, and if the opposite corresponding square is also empty, the box can of course be pushed away from the "push square" in a well-defined direction, hence the terminology. Of course, there is no guarantee that doing so would be a useful move. It might lead to an insoluble position, as in the next definition.

Deadlock


A position whereby some box can never reach a goal or some goal can not be reached by any box (or more generally from which it is impossible to solve the puzzle) is known as a deadlock. The simplest deadlocks are created by pushing a box onto a "dead" square, one from which it cannot be pushed to any goal (a precise definition and algorithm given in Section 3). For example, the corner square of a room if it does not contain a goal is a dead square. Another simple example is where a box is pushed along a wall until it rests again another box on the same wall (with at least one of the boxes not on a goal). The two boxes are then "frozen" in that they can never be pushed anywhere else. The position is a deadlock since at least one of the two boxes is not on a goal and is frozen, so the puzzle cannot be solved. There are many generalisations of these examples; some corral deadlocks can be detected by a computer if an external corral is created in which an exhaustive search of the bounding boxes is unable to open up the corral (giving access to push those boxes in the corral which are not on goals). Whatever set of deadlock positions the computer program can detect are termed the "detectable deadlocks".

Open and boundary boxes


I propose an artifical subdivision of all the boxes within a given corral into those on the boundary, which have an empty push square not belonging to the given corral, and those in the open, which are the remainder. I envisage subtasks in which exhaustive search is performed on the boundary boxes but limited ("good enough") searches are performed on the open boxes. This may implicitly restrict the program to solving puzzles in which the subtasks can be subdivided in this way. There may also be ways of getting around this rigid subdivision using various heuristics, not to be described in this section. The definition may also need to be refined to take account of boxes adjacent to those on the boundary; are these to be considered boundary or open boxes, or a further category (for the purposes of the computer search)? This is similar to the situation in which restricted areas of the position, namely external corrals, are searched to check for detectable deadlocks in existing programs.


Stacking


Placing a box onto a goal is known as "stacking" the box, or just "stacking". This terminology is appropriate when the man is thought of as a warehouse-keeper moving crates around a warehouse to their correct destinations.

A "stacking move" for a box is a sequence of pushes of the same box which leads to that box being stacked (on a goal).


Free space


Intuitively, "free space" is empty squares that could be used to contain boxes but aren't being used at the moment. An experienced human solver immediately notes where corrals contain free space, and keeps it in mind when trying to solve some task which runs into trouble with deadlocks: maybe rearranging a few boxes first using the free space to pack them more tightly (but safely, without a deadlock) in one area will free up space elsewhere so that the original task (such as stacking a box, combining adjacent corrals or moving through a one-way system) can be achieved without deadlock.

Algorithmically, within each corral the empty squares that are not "dead" can be looped through, with an extra box being added (and then removed). If adding the box does not increase the total number of corrals, nor create a suitably restricted notion of detectable deadlock, (one not concerned with the fact that we now have more boxes than goals!) that is designated as a free square. It is then straightforward to decide whether or not a corral contains at least one free space. One can generalise by iterating the process once one free square is identified: however of course this would give only a lower bound on the free space assuming that an exhaustive search is not attempted. It is envisioned that exhaustive searches of the number of free squares to any depth in any corral are not attempted, due to time and memory overheads. It is heuristically more important to give a lower bound, obtained with a particular set of squares, perhaps chosen by some heuristic function not described in this section, than to attempt exhaustive search to get a precise limit and ideal arrangement. Obviously, this will depend on the overall size of the empty space within the corral; for particularly small corrals an exhaustive search may be used. The reason that mere lower bounds can be useful is that one can formulate subtasks (such as moving a boundary box onto a free space square) where it may not signify which particular square is used (or more precisely, there are perhaps several squares which could be used to similar effect, and it is "good enough" to find one of them). The general idea is that for some tasks it is "good enough" to find a particular solution, rather than all solutions. In practice, one may be limiting solving ability of the computer program to those problems where "good enough" solutions can be found for the subtasks.

In particular, in a corral with at least one free space square, one subtask could be to make your way to the neighbouring corral and push a boundary box onto a free space square, thereby combining two corrals into one corral and in principle simplifying the position. Any of the squares identified as free space for adding a single box will be good enough for this task (potentially; in practice some may lead to subtle deadlocks). The idea of making your way to a different corral than the man's brings up the next definition.

Navigable position


The simplest example of a navigable position is one where the man has access to all empty squares, so there are no corrals the man is outside. We say the position has only one corral (the man's). In principle the man can then move all the boxes which aren't frozen on goals. In practice, the man may need to move some other boxes out of the way first if they are blocking pushes of boxes, and some box pushes may of course be forbidden as they lead to (detectable) deadlock, or worse, lead to currently undetected deadlock.

The more significant generalisation of this is to positions with multiple corrals. To be navigable, a position must come with variations ("navigation variations") allowing the man to traverse around all of the corrals. In a simple puzzle, it might be possible to combine the man's corral with a neighbouring corral by moving some boxes, and iterate the process until only the man's corral is left and we are reduced to the simplest case of a navigable position. Naturally it is only useful to do this without creating a deadlock. One always has to bear in mind that deadlocks could arise that are undetectable by the program at the time (though which might be detected at a later stage). Some consideration should be given to backtracking when such belated deadlocks are detected, but I'm not addressing that problem in this section.

The next simplest case of a navigable position would be where the variations to traverse the corrals utilise "one-way systems", whereby the man opens up a corral by moving a box, or more generally several boxes, moves through to the new area opened up, possibly closing up access to some part of the original corral, and repeats the process travelling successively through several adjacent corrals, until finally the man returns to the original corral. Ideally, a one-way system would return all the boxes to their original positions and move only one boundary box to traverse from one corral to the next (opening then closing the "doors") at each stage, so that the effect is that of the man "jumping" over boxes into neighbouring corrals in a loop. The more difficult and particularly the remaining unsolved XSokoban puzzles tend to use these kinds of one-way systems, which in combination with "free space" (see below) are a significant factor in increasing the size of the global search tree and introducing many subtle deadlocks with a conventional search.

Having a lot of free space without multiple corrals may lead to a very easy puzzle (though not necessarily if the rooms are carefully designed) but having a lot of free space with multiple one-way systems can actually lead to a difficult puzzle (in terms of computer solving at least) because the access problems created by the one-way systems along with the capacity to rearrange boxes in many arrangements in the free space, some of which may generate subtle undetectable deadlocks, means that there can be multiple solutions, which creates a problem for an exhaustive search, but also enormous areas of the tree with positions with similar features which may contain a subtle deadlock. In practice, as an experienced (human!) solver I would say that having several one-way systems, combined with numerous box-moving restrictions such as room entrances along a wall which don't allow push access of boxes to the whole room, and bent tunnels which only allow man access without allowing boxes push access, but allowing a certain amount of free space, can force a solution where several boxes may have to be painstakingly moved through many rooms a little at a time, all the time requiring one-way-systems to be navigated and access problems to be negotiated. In combination with free space and clever design, it may be required to shift several boxes from one room to another, in one arrangement, and then back again or to another room in another arrangement, before the stacking order requirements of the solution can be satisfied. Though the solution may not be unique, even an optimal solution may require a lot of this lengthy rearranging in stages, and the solutions may be well hidden in a forest of near-misses with subtle deadlocks. To my mind such problems almost demand a more sophisticated computer search where the problem is broken into manageable "chunks".

The basic plan of the unconventional search method proposed here is to trawl the global search for variations moving the man around one-way systems; this then allows a kind of pruning, by restricting the search to navigable positions. One can of course further break down the search for navigation variations to those allowing the man to traverse through neighbouring corrals. The intention is to split up the global search, whose tree may be very large, into a number of smaller more manageable searches for navigation variations, each of which may have a smaller tree. In more detail, exhaustive searches moving boxes on the boundaries of corrals may be combined with more limited (not exhaustive) searches moving boxes into free space. There may be many ways to pack boxes into free space to free up boxes elsewhere; it is not necessary to find all of them, only one that is "good enough" in that it creates no currently-detectable deadlock and solves the subtask required. The exhaustive global search is to be replaced with a number of more limited searches, some exhaustive but limited to boundary boxes and others limited (not exhaustive) and applying to (mainly) the man's corral open boxes.

Access circuits


Strictly speaking, there may be a simpler navigable position than the general one with only one corral. An arrangement of boxes and walls may be disconnected from the outside walls, allowing the man to roam all around its perimeter. Such an arrangement is an "(access) circuit"; the number of separate circuits is a topological invariant of a position (in particular, it is uniquely defined however and in whatever order you choose to count them). A circuit can of course also appear as part of a position with more than one corral. Circuits are particularly helpful for solving, in that a box can be temporarily stored at any reachable position on the boundary whilst not increasing the total number of corrals (though perhaps decreasing the number of circuits by one or more). Another advantage of circuits is that (speaking roughly) the man can push a box around near the circuit, and every time he has to change direction the circuit provides automatic access to the new push square, so it becomes relatively easy to store a box at almost any position on the boundary of the circuit. In other areas of the puzzle, without circuits or with multiple corrals, loss of access may occur when the direction to push the box in changes. Thus, it can be helpful to use a circuit to store a box that is blocking some other part of the puzzle, to help increase the amount of access at box changes of direction in a part of the puzzle with more limited access. This is another case where "good enough" variations may be found without searching through all possible variations. Naturally, having enough free space to generate a circuit can lead to a large conventional search tree, but in combination with multiple corrals, one-way-systems and clever room design can still lead to very difficult puzzles for a conventional computer search. A precise definition of circuits and an algorithm to compute them is given in Section 3.

Shuffling


This is a generalisation of a "stacking move" to navigable positions. It also generalises stacking by not necessarily putting a box on a goal, merely onto a square previously identified as helpful (such as matching a reverse search position, or a free space square).

If you can't actually put a box on a goal just moving that one box, (i.e. the goal is not in the push access of the box) then maybe you can utilise the navigability variations to move the box part way to the goal. For example, select a goal which is not covered by a box. Since the position is navigable, there is a variation moving the man to at least one corral neighbouring the corral containing this goal, and a variation where the man opens up the goal corral. One might try simply pushing a boundary box onto the goal once we reach the previous corral, which is the simplest generalisation of a stacking move.

However that might create a (potential or detectable) deadlock by messing up navigability, or simply be impossible as there is no stacking move available. Instead we use the directionality of the navigation variations: move the man into the goal corral, then push a (preferrably open) box in the goal corral (if necessary) to clear up a space for the boundary box we want to stack. We then traverse the one-way system as before, pushing open boxes back into the space freed up by the last box moved, until we arrive in the corral containing the box we want to stack. Now there is a space for it to be moved into the goal corral freed up by the first box moved, without messing up the navigability (since we restricted ourselves to open boxes). It's a kind of game of musical chairs. Essentially you are vacating space by moving boxes "backwards" around the one-way system as you move, which still allows you to move the man forwards around the system. You can compare this with the procedure in a block-shifting puzzle like the 15 puzzle, whereby many blocks are moved to transfer the single empty square around the puzzle to where it is needed. It may take several stages for a box to reach a goal this way. I'm trying to give a common theme but in general the details of shuffling may be complex and require breaking into several subtasks.

This procedure may also be used when there is no one-way system, merely a corral access or circuit, where the variations necessary will generally be much simpler than in the situation with a one-way-system (where shuffling subtasks are by no means guaranteed to be possible). This is the original situation where the "shuffling" concept came into my mind. Even then there may be some restrictions, and if there is enough space to create more than one circuit that could make the shuffling process even simpler; as more access is granted, the position moves in the direction of one ideal for shuffling, whereby each shuffling move of a box can be achieved by pushing that single box alone. In the context of stacking boxes, this corresponds to the simplest possible type of puzzle whereby the puzzle can be solved with consecutive stacking moves. A very simple, fast shuffling algorithm (for single corral access and involving only single-box routes between fixed squares) is given in Section 3.

Shuffling (in the simpler case where stacking boxes is involved) is a technique for converting a navigable position into another navigable position with one more box on a goal (stacked). In a real sense you don't need a pre-computed goal area stacking order; you just look for shuffling variations which stack one box at a time (or in fact move a box a bit closer to being stacked in general). If you find that some previous stacking produces a deadlock, where some box can now not be stacked on any goal from its current position, this means backtracking and adding an order-related subtask for the box: it has to get moved from its current position to one where it will be able to stack somewhere after the stacking moves that produced the previous deadlock are performed. Example: a box on a wall near an entrance to a goal area which is man access only, the box can't be pushed in. To remove the box from the wall (due to the pattern of the rooms) the man has to use the goal area access, but the deadlock arises when boxes are stacked which block off this access. The solution is to move the box off the wall using the goal area access before blocking the access with stacking moves.

Obviously this approach can run into trouble with a proliferation of subtasks, especially if a lot of backtracking is necessary. Heuristics probably need to be used relating to when we give up trying to execute subtasks which are running into difficulty with a given position, and when we return to the global search to look for a "better" position. But in general, this idea that it doesn't much matter which box you stack so long as the position remains navigable is powerful, as it allows a "good enough" criterion. If, having tried to stack one box by shuffling all of the boundary boxes (or enough to be successful) you manage to stack one, that is "good enough" for it to be worthwhile sticking with this variation. Note that one would probably first try to stack the open boxes in each corral which contains goals, which would itself potentially free up space to stack boundary boxes or merge corrals. There is still the basic difficulty of an n factorial, where n is the number of goals, number of potential ways of stacking boxes, for each navigable position, so it would be helpful to try to apply goal area stacking algorithms, and generalisations to intermediate positions, to reduce this number, all the same!

Reverse search


The aim of a reverse search in this approach is to re-distribute the boxes, from a given solved position with the man in some corral, into a navigable position with the boxes "near" to a navigable position identified by the forward search (the current best candidate in terms of having stacked as many boxes safely as we can manage and avoided any stacking-order problems identified in the search). This involves identification of free space. Within each corral, we identify free space, trying to match up the number of forward box positions within each room which intersects with the corral. Since this process is not exhaustive, merely "good enough", the match may only be approximate rather than exact. We may have too few boxes in one room and too many in another, and the forward and reverse box positions may not match within a room. Since we can assign statistics of how many boxes are in the solved position in each room, we can simply note whether there is any improvement in matching up the numbers and positions (possibly with an ordering in which the number is most important, that would depend on the heuristic and algorithm details chosen). We can probably usefully subdivide the reverse search into a subtask searching for a navigable position, with the same or fewer corrals than the best position in the forward search, and then one trying to match up with the forward search.

Also note that we would have to keep track of information about the number of boxes frozen on goals in the forward search which limits which boxes may be moved in the reverse search.

The idea is, if there is improvement in the matching, the reverse variation leading to that situation is "good enough". We can then follow up with further subtasks, trying to redistribute the boxes so that more are stacked (by shuffling or open box stacking) and so that more are re-distributed (by shuffling) between rooms to match numbers and positions. Once could assume these subtasks are applied to the forward search, since it is equivalent to try to match the forward search with the reverse search or vice versa. This potentially leads to a great improvement in terms of the size of the search tree with a conventional search which just explores the forward and reverse trees as far as it can, hoping for an exact match between the two. The latter is an all-or-nothing approach, you either solve it or you don't, whereas this approach is more incremental: the search can post forward and reverse positions and navigation variations and indicate how much progress has been made towards a solution. This also suggests the possibility of applying different heuristics to try to make as much progress as possible for a given puzzle wherever it is easiest to do so. A simple, fast algorithm to spread out boxes in accessible positions from a solved position is given in Section 3.

Disclaimer


I am not suggesting that this approach can solve arbitrarily large and complex Sokoban puzzles (which is probably impossible in general as Sokoban is PSPACE-complete); merely that it has the potential to vastly reduce the size of the total search tree in some medium and large puzzles, demonstrate (potentially!) meaningful progress towards a solution in a puzzle, and lead hopefully to complete solutions for all the XSokoban puzzles and many more.

Section 2. Heuristics

=========

The maze


Given a Sokoban position, the "maze" is defined as the walls and rooms of the position (where all the boxes and the man are removed); thus it is uniquely determined by the puzzle and does not depend on the variation (the pushing of the boxes).

Box removal


As has been noted before, a Sokoban puzzle may become trivial if one box is removed, without giving any clue as to how to solve the original position. However, certain kinds of dependencies may be specified by looking at positions with boxes removed. This technique can be useful in many heuristic methods.

Here is one example.

Boundary box removal


In the previous section I discussed corrals, one-way-systems and circuits (in increasing order of accessibility). In a general position perhaps with multiple corrals, one could try removing boxes in an attempt to reduce the number of corrals, convert one-way systems to circuits, and increase the number of circuits. As with free space, it may be "good enough" to find one way to do this without finding all possible ways, and thus to find lower bounds. Of course, it is normally a trivial matter to reduce a position with n corrals to one with a single corral by removing n-1 boundary boxes, and it might not seem that this generates any useful information. But if one takes into account neighbours of boundary boxes, it may be necessary in some areas of a corral to remove more than one box to reduce the number of corrals, so there could still be useful information to be gained: here one would first look for boundary boxes with empty push-squares in two (or more) different corrals. When considering a subtask such as trying to find a variation which merges two adjacent corrals, it is meaningful to have a numerical estimate of the minimum number of ""essential" boxes which has to be relocated to do this, especially when making use of previous estimates of the number of free spaces in each corral. Of course, an actual variation to merge two adjacent corrals might move a lot more than the (more precisely, a) "essential" boundary box separating the two corrals. It also might change the shape and identity of the two corrals substantially, affecting other corrals. Even so, it might still be helpful in a search broken into subtasks to match up such numbers to help identify subtasks which are potentially easily performed. If a corral and its neighbour both have at least one free space, and there are boundary boxes with empty push squares in both corrals, the potential to merge the two corrals (in either direction) is there, and it is more likely to be possible, all things being equal, than if neither corral has free space. Thus, the box-removal data can potentially be useful in ordering subtasks based on estimated likelihood of success.

Reverse search box removal


Ideally, from a given solved position (with given man access) we'd like to pull out the boxes and distribute them evenly all over the position so that there is only one corral and every box is potentially pushable (so, not still frozen on a goal). It would be even better if there were multiple circuits, so the likelihood of the puzzle becoming trivial (assuming some random sprinkling of start positions for the boxes also with only one corral) is relatively large.

To expand on the previous section's discussion of free space, we can make estimates of the free space in the maze and within the given solved position. We can also consider matching the number of free spaces in each room with the number of start squares in that room, so that we are aiming to produce a position, in reverse mode, which is "near" to the start position of the puzzle in the approximate sense in that there are the same number of boxes in each room.

Naturally, in the solved position or the start position or intermediate positions, there may be multiple corrals, and no circuits, so this ideal situation may not arise. But it could still be useful to make an estimate of how far away from this ideal situation we are in the following sense.

First try pulling a single box at a time; it may be that we can reach one of the distributed free space squares. If there are multiple different rooms in the maze we might consider it a success to pull the box away from a packed goal room and into an empty room, thereby spreading out the boxes more than they are in the solved position. More generally, we'd like to match up more closely the numbers of start squares all over the maze with the numbers for the current corral, and the number of potentially movable (pullable or pushable) boxes.

To get to a single corral, we could try removing (as few boxes as possible) from the start position; we could also compare with the data based on the number of free spaces within the maze - even if the start position has multiple corrals, it might be that we can quickly identify a set of "spread out" box squares in the maze for which there is a single corral. This gives us (one or more) estimates of how many boxes we can fit into a spread out arrangement with only one corral.

Now consider again pulling out a single box at a time from the solved position. We may be able to reach a previously identified start square of one of the boxes or a square with the box now on one of the "spread out" box squares with a single corral (even though the current position with one box pulled out may not have a single corral). But the box may not reach one of our targets; if not, we can make a note of its "pull access" (the generalisation to pulls of the push access). Perhaps we can re-designate a target within the push access to satisfy the same conditions. If not, we can then consider it to be interfering with our ideal situation, so we remove this box. By applying this approach successively, removing the box if we can't reach a target, or redesignate a target appropriately, we get an estimate of how close to our "spread out" arrangement we can get in the simplest possible fashion, with successive moves of a single box.

We then attempt to reinstate the removed boxes into "parked" squares (one of the squares within the pull access) with the least amount of interference with pulling out the other boxes. Ideally, we'd like to pull the offending boxes to a parked square, carry on pulling the other boxes out to the previously identified squares, and intersperse moves pulling the offending boxes from the parked squares to their designated targets.

Any time we are successful with some part of this program, we get potentially useful information about parts of the original subtask (pull out all boxes to spread out squares) that we can easily manage. Even when we fail, we have potentially useful information about how difficult the original subtask was.

Reverse search corral merging


In many of the XSokoban puzzles, the goals are within a single room, and many of the boxes are frozen. There may be more than one corral in the solved position, or pull access restrictions may restrict the rooms, or squares within rooms, to which the boxes can be pulled. These conditions may of course change dynamically as boxes are pulled out.

In the case of more than one corral in the solved position, it may not be obvious which corral the man should start in. We can first start pulling out boxes according to the above suggestions; however with multiple corrals we may end up pulling out a group of boxes into one corral, and then failing to move out more boxes as that corral fills up and we haven't opened access to the boxes in another corral.

But we can still use the information we obtained about how many boxes we dragged out; we can compare it with the minimum number of boxes we need to remove from the solved position so as to open man access to another corral. This is another box-removal heuristic. It is analogous to the description of "essential" boxes to open up corrals in the previous section (though with many boxes frozen on goals the number of "essential" boxes to remove may be much larger).

In some cases, the corral to start in can be simply determined by the maximum number of boxes we were able to pull out; the more the better. But in general, having got the estimate, we have to recast our subtask of dragging boxes out to "spread out" squares by preferentially dragging out "essential" boxes. With many frozen boxes, there might be many different candidates for "essential"; so it might be helpful to make a dynamic count of the minimum number of boxes we can remove to merge the corrals, so we can at least tell if we are making obvious progress to merging by pulling out a particular box.

Reversible moves


In establishing "spread out" squares for boxes, we can first aim for "reversible" arrangements. Here, if a box can reach one of the "spread out" squares, then it can reach any of the others and vice versa, either pushing or pulling that single box with all other boxes (apart from those not moved yet from their goals) removed. This is obviously complicated by mazes with push and pull access restrictions, so we'd have to split our arrangements up into sections according to whether those areas are reachable or not. The point of reversible arrangements is that, if we can pull a box out to one such square, we can pull it out to any one, and there is a likelihood of the reversible property being maintained even when we pull out a second box. Thus, if the pulling out of the first box conflicted with the second, we could move the first box on to another "spread out" square to try to remove this conflict. And we could use the reversible property to change the order of moving up boxes; maybe to get the fourth box out we need to get the third box out, then move up the first and second, then move up the third to clear a route for the fourth. There will be an element of "follow the leader" whereby the second box follows the first around until we find an arrangement which doesn't conflict with pulling out the third box, and so on. We are aiming to locate an "easy win" whereby several boxes can be spread out without interfering with each other. Once again, we don't need to find all ways of doing this, one is "good enough".

Identifying any such "easy win" is useful even if it only applies to a limited number of boxes. To generalise, we can try pulling out our boxes, and if they are not amenable we can delete them to see if they facilitate an easy win with following boxes. This then provides motivation to try to find a way to "park" the obstructive box so as to facilitate the easy win. The idea is to divide up the boxes into groups with easy wins and obstructions. We try to solve obstruction problems with exhaustive searches (to some limit) but we severely limit search for easy wins. In some sense, we're trying to divide up the solution into "easy bits" and "hard bits" with exhaustive searches for the hard bits and heuristically limited searches for the easy bits. Progress is indicated by a successful subtask. Failure requires a reassessment of priorities in the whole search method. Having lower bounds available on the amount of free space, the number of boxes in a particular area with a reversible arrangement, etc, can help the program assign a likelihood of success to a subtask, which helps with decision-making; both with decisions whether or not to pursue a search to solve a subtask and what to do after an initial failure - if the task is low probability we might want to abandon it and try to restructure our solution without it, but if it is high probability we might try searching more deeply. The likelihood of success could then be adjusted by how much search effort has been put into it already.

Generalised deadlock detection


External corrals are usually tested for deadlock by seeing if they can be opened up (reduced to single corral) with the rest of the boxes removed. Since boxes on goals may not necessarily need to move again, the criterion is broadened to pushing out a box which is not on a goal (apologies if the science of deadlock detection is more advanced than my brief description of it here). This should work well in simple cases where boxes put on goals immediately become frozen and cannot move again. But when this is not the case, stacking order restrictions may dictate that a box put on a goal which is not frozen may only be parked there temporarily, to later move, perhaps to another goal. This is not taken into account in conventional deadlock detection since stacking order restrictions are not included. But suppose that a reverse search has uncovered stacking order restrictions and succeeded in pulling the boxes out to a single-corral, spread out position; in a search broken into chunks (so not exhaustive) it would be useful to generalise deadlocks to take into account discovered stacking order restrictions. This is done simply by testing the position for boxes on goals and comparing against the stacking order restrictions; these might take the form of putting goals into groups which are ordered, so that all the goals in one group have to be "finally stacked" (with boxes that then never move again) before the goals in the next group are then finally stacked, and so on. Thus, if the last group have not all been stacked, we regard all goals outside this group as temporary parking positions for boxes which may have to move again; thus we can generalise deadlock detection by clearing these goals from corrals being searched for deadlock (provided they are not occupied by frozen boxes, which would be regarded as a deadlock. The assumption here is that the frozen goal would have been included in the last group if meant to be finally stacked.). There may be a further generalisation of stacking order restrictions to take into account particular box groups as well as goals, when the push access restrictions are such that not all the boxes can reach all the goals at various stages of stacking; so it then might be vital not to stack a goal with a particular box as that one has to reserved since another box cannot later be used to reach another goal otherwise. I'm not sure how to include that in generalised deadlock detection, or if it needs to taken into account elsewhere in the search. Also note that, where a solved position has multiple corrals, we can run forward searches to "synchronise" with a different stacking order for each corral (when found). Linking the forward and reverse searches this way (which is effectively doing a lot of pruning in the global tree with regard to stacking moves) may substantially increase the chances of solving the puzzle, with the usual drawback of relying on heuristics that may not be effective for all puzzles. Remember the N factorial which describes the number of unordered stacking moves.

Chapter 3 Algorithms and definitions

========================

In this section I attempt to make precise (programmable) some of the ideas in the previous sections.

Suppose that there are N goals. Let P be a Sokoban position.

Isolated square (definition)


A square in P is "isolated" if the eight adjacent squares (horizontally, vertically and diagonally) are all empty.

A-empty (definition)


A square is P is "A-empty" (access empty) if it is either empty (possibly containing a goal but no box) or contains the man. Squares that are not A-empty contain either boxes or walls (possibly boxes on goals).

Man access corral (definition)


A dictionary with keys squares S belonging to P and values -1 (if the man has no access to S) or 1 (if the man does have access to S). Can be implemented iteratively or recursively, starting with the man's square and moving to adjacent squares horizontally and vertically. More generally, external corrals could enter into the dictionary with values of integers above 1 (one for each corral). The A-empty squares in a corral are called the "corral access".

One-box maze (definition)


The maze M was defined in Section 2 (delete all the boxes from P to get M). For a square S of P which does not contain a wall, the one-box maze for S is the Sokoban position OBM(S) obtained by adding to M a box on square S.

Dead squares (definition and algorithm)


These are the A-empty squares from which a box can never be pushed to a goal. They are found by iterating over A-empty squares S, forming the one-box maze OBM(S), and computing the push access PA(S,PS) for each push square PS of S in OBM(S). If none of the PA(S,PS) contain goals, S is a dead square. This algorithm and concept is well-known but is reproduced for familiarity of the concepts in other algorithms below.

Frozen boxes


A box B such that: (a) no pushes of B are possible due to blocked push squares (at least one of the left and right squares is blocked by a box or wall, and at least one of the up and down push squares is blocked) and (b) Each box adjacent to B on a push square such that the opposite push square is not a wall satisfies (a) and (b) in place of B is "frozen". Evidently, frozen boxes can never be pushed again and a frozen box not on a goal is a deadlock. This is the familiar, recursive definition (apologies if I oversimplified it), and generalizing is problemmatic for boxes with pushes since pushes which currently lead to deadlock might not do so after other boxes have moved. Even so, here is a generalization. A box B is "corral deadlock frozen" if either it is frozen or every push P of B satisfies either: (c) After deleting all known frozen or corral deadlock frozen boxes, except B, P is a detectable deadlock, or (d) B is a boundary box of an external corral C. With all other boxes outside C removed, the push square to perform P is man-accessible and after P, the position is a detectable deadlock. Moreover each of the other boundary boxes of C satisfies (d) in place of B. Note that the definition is iterative as we start with frozen boxes and may uncover some corral deadlock frozen boxes with each iteration, so we may need to check each box several times, each time the number of deadlock frozen boxes increases. As well as extending corral deadlock detection one ply, it also applies to frozen corrals with all boxes on goals, which is not a deadlock, but may enable detection of deadlocks as other boxes are pushed adjacent to the corral.

The definition of frozen extends to reverse mode in an obvious way. Pull frozen arrangements often have boxes two pull squares apart and are often a feature (obstacle) in stacking problems. Corral deadlock detection also extends to reverse mode in an obvious way, with the man inside the corral unable to pull his way out. With goals moved to the start position one must exclude from deadlock arrangements with all boxes on goals exactly as for the forward mode. Further note regarding frozen and corral deadlocks that straight tunnel sections can be treated as one square long to generalize.

Reverse exclusion zone and push-reachable zone (definition and algorithm)


In trying to solve a puzzle in reverse by pulling boxes, we must avoid dead squares, since a position reached in reverse with a box on a dead square can never be reached from the start position by pushing boxes (since we exclude the detectable deadlock of pushing a box onto a dead square). In other words, allowing this would prevent the forward and reverse searches from being able to meet. We can generalize this idea to the "reverse exclusion zone". In addition to the dead squares, the reverse exclusion zone is defined to include all A-empty squares S (in the start position SP) such that S does not belong to the SP "push-reachable zone". The algorithm for determining the push-reachable zone PRZ for SP is as follows. Let PRZ be a null list of squares initially. Iterate over the squares B in SP containing boxes. Let OBM(B) be the one-box maze for B. Compute the push access PA(B,PS) for each push square PS of B in OBM(B). For each square S2 in PA(B,PS), if S2 is not in PRZ add S2 to PRZ. Terminate returning PRZ when we've finished all the iterations. A typical example of a square in the reverse exclusion zone which is not a dead square: consider a T-intersection whose single exit (not the paired exits in a line) turns into a bent tunnel which has another exit. Assume the first square S in the bent tunnel adjacent to the T-intersection is A-empty in the start position SP, and that the T-intersection and adjacent section of the bent tunnel (including the corner) contain no goals. Because of the T-intersection, it is impossible to push a box onto S (a wall is on the push square). Yet, in a suitable OBM it is possible to pull a box from the T-intersection onto S, and because of the bent tunnel's other exit also possible to pull a box from elsewhere onto the T-intersection and change the man position ready to pull it into the bent tunnel. So S belongs to the reverse exclusion zone without (necessarily) being a dead square.

Goal trap (definition and algorithm)


A trivial example of a goal trap is a goal in the corner of a room: once a box is pushed onto the goal it is frozen ("trapped"). The notion of goal trap generalizes this idea to the case where the box is not frozen, but cannot reach all of the push-reachable zone PRZ for the start position SP. To determine the goal traps, iterate over goals G in SP. Form the push access PA(G,PS) for each push square PS of G with a push from G in the one-box maze OBM(G). Merge these push-accesses for G into a list GT(G) (the goal trap). Whenever GT(G) is just PRZ it conveys no further information. However, if GT(G) contains fewer than N goals (say N2 goals), this means that any position with more than N2 boxes on squares in GT(G) is a deadlock, since only N2 of these boxes can ever be stacked (on the N2 goals in GT(G)); the remaining extra boxes cannot reach any of the remaining goals since they can only reach squares in GT(G). A typical and simple example would be a goal on the wall of a room where that wall contains no exits and meets directly with two corners of the room (so the whole wall is entirely horizontal or entirely vertical). It is okay to push one box onto this wall, as it could later be stacked on the goal. But it is no good to push two boxes onto this wall, since only one can be stacked on the wall's goal, and the other is then stranded in the goal trap and cannot reach any of the other goals in the puzzle. Thus, the numerical information conveyed by the goal traps is the number of goals they contain, and the implied restriction on how many boxes can be in the goal trap at any time without causing detectable deadlock. Note that corner goals have empty goal trap, conveying no information, but goal traps in general can exist with higher capacity but without containing all N goals.

The goal traps may also convey restrictions for the reverse search. If there are NB (more than zero) boxes inside a goal trap in the start position SP, the reverse search cannot be allowed to remove from the goal trap so many boxes that there are fewer than NB boxes in the trap, otherwise the forward and reverse searches cannot meet. As an example, with the goal on a wall as described above, if the start position has a box on the wall the reverse search cannot be allowed to remove all boxes from the wall.

Reversible regions, sources and sinks


The notion of a directed graph (digraph) of push-reversible regions, with numerical restrictions on the number of boxes within each region, generalizes goal traps. The push-reversible regions are got by restricting push access in one-box mazes to squares to those with a push route back to the starting square. This provides non-intersecting reversible regions which contain all A-empty squares other than dead squares. The digraph is obtained by having a vertex for each reversible region and an arrow from region A to region B if and only if there is a push-route in a one-box maze from (any square of, taken to be the first by reversibility) A to B. A "sink" in a digraph is a vertex with no outgoing arrows (a "source" has no incoming arrows). Note that all sink regions must contain goals to avoid being dead squares. Thus if a sink region contains M goals, any position with more than M boxes in that region is a deadlock, generalizing the forward restriction in goal traps. More generally, for each sink S, count up the goals G in regions which have a route following the arrows in the digraph (hence a push route) to S (including S). Count up the boxes B in these regions. If B is less than G (as could happen if boxes are pushed to a region with no push route to S) the position is a deadlock. It may not be a deadlock if B exceeds G, provided G is less than the total # of goals N, because the extra B-G boxes might be pushed to other goal regions with no push route to S. Similarly, for a source S, if it has more goals than boxes the position is deadlocked. Perhaps the reverse search can have generalized restrictions - not sure what they should be.

These ideas of reversibility and digraphs to show directions for pushing are "static" in that they apply to one-box mazes though the numerical restrictions they imply can be applied to "dynamic" positions. Further dynamic restrictions could be applied once a stacking order is obtained; frozen boxes during stacking could subdivide the static reversible regions, where the frozen boxes are added to the one-box maze to find further restrictions. Such dynamic restrictions could signal "relocation" subtasks - moving a certain number of boxes from one dynamic region to another before a particular phase of the stacking is completed. This would be of great importance for a limited search as the success of the reverse search in finding a stacking order would then automatically generate these relocation subtasks, potentially enabling the solution of puzzles with too much complexity for conventional search. More on relocation later.

Note for conventional searches and Sokoban programs


Reverse exclusion and goal traps (perhaps even static reversible regions) should be implemented for offline deadlock detection as an aid to the human solver; whether goal traps should be included in a conventional computer solver (reverse exclusion certainly should be included in the reverse search) depends on the cost-benefit analysis for the solver - a judgement of how much the overhead of checking every position against the goal trap restrictions will slow the search. In my scheme of a non-exhaustive search, all three concepts would definitely be included because even lengthy deadlock detection can be valuable in a limited search; with some memory overhead (storing box locations within traps and reversible regions for each position and updating this data when boxes move) goal trap restriction can be implemented at relatively low time cost during search, though reversible region restrictions would require an offline computation of push routes between all regions that support them so the restrictions could be implemented at relatively low time cost during search.

Depth Corral (definition)


A dictionary with keys squares S belonging to P and values non-negative integers indicating the access depth. Unassigned squares have value -1. Implemented in following algorithm. More generally, called an "access corral" if the values have some other meaning (cf. RFS below).

IADOL (Iterative Access Depth Ordered List) algorithm


Parameters: P, dead squares DS in P, Man access corral M in P, depth D, depth corral DC, DL (depth list of squares belonging to M), TL (total depth ordered list of squares). Implementation: Null list of squares SL. Iterate over each square S in DL. For each square that is adjacent horizontally or vertically to S, and belongs to M, does not belong to DS, and whose depth in DC is unassigned, add that square to SL and to (the end of) DL. If SL is still null, terminate returning TL, DC and D. Otherwise recurse with depth D+1 and DL replaced with SL. Usage and interpretation: Initially called with DC values all unassigned, D zero and DL a list of boxes (intended to be boxes on goals in a solved position which can be pulled from the man's corral). Each square in DL is given depth 0 (value 0 in DC). On termination D is the largest depth found and TL a list of squares in M in order of increasing depth. The depth is interpreted as distance in man moves from all the boxes in the initial DL. Idea: aim to pull the boxes out as far away from the solved position as we can, avoiding dead squares; start by depth ordering empty squares.

CA (corral access for circuits) (definition of circuit and algorithm to compute the # of circuits and circuit access) algorithm


This algorithm also computes connected occupied regions with access lists.

Compute access corrals in a dictionary D for a Sokoban position P. Let Q be a copy of P. Let NCR be the number of connected regions of occupied squares, initially zero. Let NC be the number of circuits, initially zero. Let NOL be the list of occuped squares, initially null. Iterate over squares S of P, skipping those that are A-empty or already in NOL. When an occupied square S is found, add one to NCR, add S to NOL and find the connected occupied squares (recursively or iteratively, spreading out horizontally, vertically and diagonally from the initial occupied square to adjacent occupied squares) not already in NOL, adding each square to NOL and marking the square in Q with the number NCR and adding these squares to a list CRL(NCR) (connected region list for region # NCR). Set NAC (number of access corrals) to zero and a list CRAC(NCR) to null (the connected region access corrals for region # NCR). Now iterate over CRL(NCR) looking for adjacent A-empty squares (horizontally and vertically only). Use D to identify their access corral number. If not in CRAC(NCR) add it. For completeness we can complete CRAC(NCR) but for the purpose of finding circuits (defined as those with len(CRAC(NCR))=1) we can terminate when len(CRAC(NCR))>1. For each circuit, we add one to NC and set CA(NC) (corral access for circuit # NC) to be the single access number in CRAC(NCR). By this means we identify the access circuits. We are usually interested in those for a fixed access number (usually the man-access corral) so we can iterate over NCR to find these.


RFS (Reverse free space) algorithm


Produces a (heuristically) decreasing accessibility-ordered, then lexicographically decreasing depth-ordered, arrangement of extra boxes BL in free space (no increase to the number of corrals).

Step 1


Let DL be a list of squares in P containing boxes on goals which can be pulled at least one square (without forward-relative detectable deadlock). Call IADOL with D=0, unassigned DC, TL a copy of DL. Returns a maximal depth D, increasing depth-ordered list TL and depth corral DC (which for each key S in TL has value the depth of S). Let Q be a copy of P which may vary. Compute the pull access squares for all the boxes in DL. replace TL with an equivalently-ordered sublist containing just those squares belonging to at least one of the pull accesses for the boxes in DL. Idea: aim to pull out the DL boxes as far as we can, but the squares we fix to pull them out to must be pull-accessible, to at least one box.

Step 2


Start with a list of A-empty squares L, and null modified total list MTL of squares (from TL). Iterate through boxes B in DL in a fixed order (assumed not to matter). Compute the pull access for B in Q (assuming Q is not P, otherwise we already did it). Iterate through squares S in TL in reverse order (starting at maximal depth). If S is not in B, skip to next S. Otherwise, if S is not isolated in Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, each of which is surrounded by empty space, whose squares belong to L. Idea: we've found a depth-ordered arrangement of isolated boxes in free space; as boxes are isolated we can't have increased the number of corrals, though we'll have increased the number of circuits by len(L). This arrangement is particularly accessible as the boxes are isolated.

Step 3


Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock (disregarding the fact that there may be more boxes than goals!) or add to the number of corrals rel. to P, skip to next S. Otherwise, if adding a box to S in Q would decrease the number of circuits rel. to the previous value of Q, add S to MTL and skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 2. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals or decrease the number of circuits. The arrangement is depth-ordered and only marginally less accessible than the one at Step 2; we may have adjacent boxes or boxes adjacent to walls.

Step 4


Replace TL by MTL (to look for more box arrangements with more access restrictions). Let BL be a copy of L plus the previous BL. Let L be a null list of squares, similarly MTL. Compute the pull access for B in Q. Iterate through squares S in TL in reverse order (starting at maximal depth). If adding a box to S in Q would create a frozen-box deadlock or add to the number of corrals rel. to P, skip to next S. Otherwise add a box at S to Q, add S to L and skip to next S. Result: Q stores a position with a number len(L) of extra boxes, compared to the value of Q at Step 3. Adding these boxes created no frozen-box deadlocks and didn't increase the number of corrals. The arrangement is depth-ordered and possibly less accessible than the one at Step 2; we may have reduced the number of circuits (perhaps to zero). The algorithm terminates after this step, with the last L added to BL.

Step 5


Repeat Steps 1-4 with external corrals, if they exist, moving the man into each external corral in turn, and calculating free space relative to pullable boxes in each corral.

The next stage for reverse spreadout is to find an order of accessible pull routes pulling out the DL boxes to (some of) the assigned free space squares in RFS. In other words, a reverse variation moving the DL boxes into free space. There may be more or less free space squares from RFS than there are boxes in DL. If more, we bias towards the first len(DL) of the free squares (those with the highest depth and, heuristically, accessibility), holding the remaining free squares in reserve for shuffling manoeuvres. If less, we obviously can only pull out len(BL) boxes at best. However, in the process of doing this, we may open up access to boxes which may have more access to free space in the new arrangement. For example, there might be two corrals in a solved position, and let's suppose all the goals in one room. From one corral (assumed to be one for which the puzzle is reverse soluble) we may not have free space room in the corral to pull out all the N boxes. However, we may have free space enough to pull out enough boxes to make the puzzle have a single corral, which then increases the free space access to the squares which were initially in the second corral, allowing the puzzle to be solved.

RPR (Reverse pull routes, also known as reverse spreadout) algorithm


This is the main algorithm intended for use in the reverse search, for pulling boxes out to a "spread out" position with a single corral and for establishing a stacking order.

Step 1


Start with null list of pull routes V. null list SBL (static box list) and a solved position P (copied to Q which may vary) with pullable boxes PB and apply RFS to get an ordered free square arrangement BL within the given corral. In any (arbitrary) order iterate over the pullable boxes B in PB (note that PB will dynamically vary rather than being fixed; it may increase though never decrease in size). Look for the first square S in BL which belongs to the pull access for B. If there is no such square S, despite there being empty squares in BL, call algorithm RSB (below) to see if we can rearrange previous boxes in free space to find an S that works. If still no S works, skip to the next B, adding B to list SBL. Otherwise compute a pull route to S and add it to V.

Step 2


When we find a pull route to add to V, and change Q by moving the box B along this pull route to free space, we must check whether any new boxes are now pullable (in addition to those in PB) as a result of this move. If so, we add this list of new pullable boxes onto the front of PB (heuristically considering it more important to pull out boxes that were previously inaccessible). We check whether the number of corrals in Q has decreased compared to P. If so, we must first check if there is only one corral and all boxes not moved yet are not (push-)frozen on goals. If so, we terminate the algorithm (returning True for success, V and SBL.). Otherwise, we skip to the next B in Step 1.

Step 3


Eventually we reach the end of the (dynamically varying) PB. So every box has either been moved to a free square or, with no such move being found, had its square added to SBL. Since we didn't already terminate, we must have failed (either there is more than one corral or some boxes are (push-)frozen on goals). So we terminate the algorithm and return False for failure, along with V and SBL to see how far we got. In future, this output could be used to feed other algorithms which try to learn from the partial success achieved (measured in V) and get round the problems represented by SBL. For example, in the two-corral puzzle mentioned above, if free space in the initial corral is limited, it might be necessary to leave unmoved several boxes which could be pulled, and prioritise pulling out previously inaccessible boxes in some special order, so as to fill up the available free space in the first corral at just the moment when access to the second corral is granted with the last box pulled out. A tree search algorithm, scoring previously inaccessible boxes higher, might work - details later.

Note: more generally, we could signal different degrees of success and failure. For example, if we found a position with a single corral, but some of the boxes BL are still push-frozen on goals, (yet some progress has been made in pulling out boxes and/or reducing corrals) we could set this as the aim of the forward search, with the proviso that we may finally stack any box on the goals in BL at any point, but no other boxes are allowed to get frozen on goals not in BL. This generalises the notion of deadlock to take into account stacking order restrictions. The forward search then aims for a position with one corral the only frozen goals in which belong to BL. There would then be an intermediate stage matching up the forward and reverse search positions, either by stacking more boxes onto goals in BL or moving a box from a square in the forward search to one of the squares having boxes in the reverse search. See later for algorithmic details.

RSB (Reverse shuffle boxes) algorithm


We are given a position P, with given man corral, a free space arrangement BL, some of whose squares FL are already occupied by boxes, and a box B which currently can't be pulled to a square in BL despite there being empty squares in BL. Our aim is to rearrange the boxes in FL to squares in BL so that B can be pulled onto an empty square in BL. If so we return True for success and the variation rearranging the boxes and moving B.

Step 1


Iterate over squares S in FL (currently covered by boxes in P). Copy P to Q, and delete the box on S. Compute the pull access for B within the given man corral in Q. If it now contains an empty square in BL (including the newly-empty one S), make a pull route to the first accessible square in BL, call it V, and go to step 3.

Step 2


Skip to the next S. If we run through all S in FL, terminate returning False for failure and a null variation.

Step 3


Make a null list of pull routes W. Iterate over empty squares E in BL other than the destination S2 of V. Let R be a copy of P. Delete from R all the boxes in FL except for S2, where we add a box. If there is a pull route PR from S2 to E in R, make a list L of squares in FL which intersect with PR (here meaning just the squares the box is pulled over, ignoring the man moves), in order of increasing size of the variation to that square in PR, and go to step 4. Otherwise go to step 2.

Step 4


Let P2 be a copy of P. Iterate through squares SB in L in reverse order; for the first one, try to find a pull route to E in P2. If not, go to step 2. If successful, add this pull route to the front of W, and move the box in P2 from the start to the end of this pull route. Now replace E by SB and skip to the next SB in L in reverse order. If we were successful at every SB in L (otherwise must have returned to step 2) our W contains a sequence of pull routes shuffling the boxes in L to free space. Add the pull route V to the front of W. Notice that the position of P2 and P match in terms of V applied to P giving the (last) position P2. Terminate returning True for success and W.

Note: RPR is intended as a quick-and-dirty algorithm which will terminate after a few seconds with modest memory overhead for a medium (and even quite large) puzzle. It finds the first free space arrangement it can (using a distance heuristic, doesn't look for alternatives), moves boxes only by single-box pull routes direct from goal to free space (no intermediate stages and, except for shuffling, the boxes on free space don't move again) and pulls out boxes in an arbitrary and fixed order (doesn't look for alternative orders). It does some shuffling when no direct pull route is found, moving around the free space boxes it has already pulled out to make space for another box, but this shuffling is also limited to single-box pull routes between free space squares and an arbitrary fixed order of boxes and squares found from which ones intersect a (phantom) pull route with free space boxes deleted. It does try potentially several phantom pull routes, but sequentially rather than in a tree, and keeps only the first one for which shuffling was successful. RPR calls RFS and RSB. RFS also calls IADOL.

Such an algorithm will have extremely variable results. It is intended as the first stage of a reverse search, to identify "easy progress" towards the ideal reverse spreadout position with a single corral and no push-frozen boxes on goals. I envisage it being followed up by more lengthy searches to address its shortcomings. For example, a tree search which allows the SBL boxes to move by individual pulls one square at a time, (or not at all except via pull routes direct to free squares if this can reduce len(SBL); a subsequent search could allow boxes in SBL to move by individual pulls once we've minimized len(SBL)) other boxes to move only by single-box pull routes direct to the free space identified by RPR, (only to the first accessible square in the ordered list) and then not to move again with the exception of RPR shuffling. Such a tree search could terminate after a fixed overhead of time and memory. The idea is to make "easy progress" in between more exhaustive searches at the sticking points.

More precisely (sketch of algorithm RPRT, RPR with tree search): (a) RPR produces variations moving goal boxes to free space, possibly via shuffling, returning a list SBL of unmoved boxes. (b) tree search using RPR pull routes direct to free space or via shuffling scores a positive constant times the sum over boxes on free space multiplied by pull layer depth of the box moved there (discounting intermediate shuffling variations - we'd need to keep track of where each box came from by numbering boxes within layers) (pull layer depth is the # of layers of boxes initially pullable from the given man access which have to be iteratively deleted to give access to successive layers of pullable boxes.) SBL boxes which are push-frozen score minus pull layer depth times a positive constant. External corrals score a large negative constant. This may decrease the # of push-frozen boxes in SBL and the number of external corrals. (c) tree search as in (b) but remaining push-frozen SBL boxes (only) get single pulls. This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. (d) tree search as in (c) but all remaining SBL boxes (only) get single pulls (in case the non-frozen SBL boxes need to be manoeuvred out of the way with single pulls to permit more frozen SBL boxes to reach free space). This may further decrease the # of push-frozen boxes in SBL and the number of external corrals. Note that the rationale behind this sequence is to limit exhaustive search (single pulls) by using the smallest # of push-frozen boxes in SBL obtained in the previous steps.

Note that if RPR combined with the tree search(es) is successful, it provides a stacking ordering (actually, with variations and shuffling, even more information than this). This ordering information can then be used to generalise deadlock detection in the forward search (see the previous section) and prune stacking moves that don't correspond to the given order. Even when RPR-plus-trees fails, it may provide a partial stacking order: the goals in SBL can be stacked in any order first, and then the remaining goals in the determined fixed order.

Forward search


Analogous algorithms to RPR and tree searches can be applied to the forward search. Take the pushable boxes with an accessible push square in SP to be the push analogue of PB. The initial aim of forward search is to open up the position to a single corral. The tree search score heuristic could value initially inaccessible boxes higher, as with the reverse search. A large fixed score penalty for each external corral would favour the aim. The way corrals are divided up into free space may be a little different since we would have the free space in a corral from the corral's pushable boxes accessible within the corral, but we'd also have free space from pushing in boundary boxes from a neighbouring corral (the latter combined for all neighbours is the usual reverse free space for the corral).

Relocation subtasks - reversible region digraphs


Assuming the reverse search succeeds in producing a single corral and a complete stacking order, and the forward search succeeds in producing a single corral, we next apply the dynamic reversible digraphs for each stacking order position which results in frozen (or possibly corral deadlock frozen) boxes on goals. We use the reverse variations to identify how many boxes are in each dynamic reversible region at each stage. We then want to match up the forward and reverse variations as far as these region numbers go. We can use free space, RPR and tree search, with fixed score penalties within each region with box number differing from the aim. We could try both forward and reverse relocation in stages matching up the most progress in either direction.

Relocation subtasks - matching forward and reverse positions


Assuming the dynamic reversible relocation is successful at the stage comparing reverse and forward single corral positions before (final) stacking takes place, the next stage is to exactly match the forward and reverse positions. This could be done with the same relocation algorithms, with box positions for the other search ("box aims") added to free space, and score bonuses for positions with boxes on box aims. Note: the rationale behind separating relocation into two stages is that it is easier to match any arrangement of a # of boxes within a region than to match exact positions, but once the regions have the right #s of boxes it may be possible to rearrange them mainly within regions for an exact match - again we're trying to limit the amount of exhaustive search (single pulls). It is also useful to see how far we got at each stage in case of only partial success - further heuristics and algorithms could be tried at each stage.

More on Relocation subtasks


I envisage relocation being carried out with no external corrals for ease of access. However, the possible existence of circuits could make relocation even easier. I don't have a precise alternative algorithm currently making use of circuits. Relocation can still be carried out with external corrals, but more access restrictions make it less likely to succeed.

One-way doors (definition)


A "static one-way door" (s1-door) in the maze M of a puzzle P consists of empty squares C, O, E such that: (a) O and E are man-adjacent to C at right angles (b) with a box added to M at C there is a push to O (c) with a box added to M at O there is a push to C (reversibility) (d) with a box added to M at E there is a push from C Note that (b), (c) and (d) require further empty squares either side of O (OP), C (CP) and E (EP) with all six of these squares distinct. (e) none of C, O, E, EP are dead squares in P (f) with a box added to M at C and at O, and the man at CP, the man access does not include OP or E. With the man moved to OP, the man access includes E (g) with a box added to M at C and at OP, and the man at CP, the man access includes O but not E (h) with a box added to M at C and at E, and the man at CP, the man access includes O and EP

Note that (h) is possibly redundant, implied by (a)-(g), but included for clarity. The case of a two by three rectangular room with perpendicular exits, suitably arranged, gives rise to a s1-door with E the wide side exit and OP the other exit. C stands for closed, O for open and E for exit. Perhaps there should be a further condition to indicate that trying to traverse the s1-door the wrong way gives deadlock. The problem in general is such a deadlock might not be detected - though it might in the case of 2x3 rectangular rooms.

One-way system (definition)


A "simple static one-way system" (s1-way) W is an ordered list of at least one s1-door(s) such that: (a) if s1-door D'=(C',O',E',CP',OP',EP') follows immediately after s1-door D=(C,O,E,CP,OP,EP) in W, with boxes added to M at O and O' and the man at C, the man access includes OP' but not OP or E' (b) if D is the first s1-door in W and D' the last then with boxes added to M at each of the s1-door O (open) squares and the man at OP, the man access includes C'. Note that with boxes as in (b) there are len(W) - 1 external corrals inside W and the man can traverse W by pushing each box from O to C (open to closed) and back again in turn, leaving W in the same state.

Provided the traversing variation is left available, one can for some purposes treat the boxes in W as if they weren't there, leaving a single corral, as regards man access in a puzzle including W with boxes as in (b). See "one-way access", below for a precise description. Note: a s1-way might have some of the O-boxes away from the O-squares (on the C squares for instance). To make things simple, so that W doesn't change state when traversed, we'd first traverse W pushing the O-boxes onto O-squares if this is easy - if not W is not correctly occupied for traversal, and if not deadlocked we'd first have to find variations to correctly occupy W. If W is completely empty, variations to fill it would normally be easy, pushing the last O-box in first, and then using the W traversal squares to push in the other O-boxes in reverse order, only provided the first OP square is accessible. But this requires that W is push-traversable in entirety, which may not be true in general. The definition only requires man-traversability and reversible pushes between the adjacent O and C squares. And W might be partially, excessively or incorrectly occupied before we get a chance to use it, and even in the static case with 2x3 rooms the variations to correctly occupy W may not be obvious, due to the possibility of creating a deadlock due to W-external boxes if trying to push a box out of W. Note: There may be more general one-way systems or traversable corrals utilising more than a single reversible push of one box to move between corrals. There are, for instance, "valves" which function something like s1-doors but require two boxes to move. There can also be traversable corrals which have a different state (due to boxes not moving back to their original positions) after traversal. There can be intertwined (overlapping) one-way systems. None of this complexity is being addressed here. A computer solution of medium to large puzzles featuring even standard 2x3 room 1-ways significantly, many of which will have too large a tree for conventional solvers, I consider would be a significant advance.

Double-box s1-door


If a position P has two boxes occupying an s1-door (such as one on O or C and one on E) it can't be used for one way traversal until one box (like the one on E) is exited. If the s1-door is push-traversable even with two boxes, by push-reversing one box and pushing out the other to free space, simple variations should normally achieve this converting it to an occupied s1-door. If the s1-door is immediately followed by a vacant (in P) s1-door so the two form an s1-way, using the second O-square as free space makes the s1-way occupied and traversable, thus increasing access (see one-way access, below) without increasing free space occupancy, so it would perhaps be worth doing this in RPR(T). One-way access (definition and algorithm)


If WL is a list of s1-doors (belonging to one or more s1-ways) with boxes on the O squares, make a dictionary D with keys OP squares and values EP squares. When calculating man access to push a box outside the s1-doors around a push route, when a square in the D keys (which can be more efficiently checked with another dictionary) is accessed, the value is used to add the EP square to man access - this would be accessible by traversing W. The extended access is "one-way access" (1-access) relative to WL. The access can be calculated twice, first without the WL "teleports" to avoid unnecessary use of WL. The relevant variations to traverse the s1-doors are also stored ready to insert into push routes when needed. Note that it's not important which s1-way we use to achieve access (use an optimiser to find the minimal choice).

Note: s1-doors and s1-ways obviously apply to the reverse mode also, with the order in W reversed and pulls replaced by pushes. Note: normally we'd only use the beginning OP to "teleport" the man to the end OP, but in some more general s1-ways than those with 3x2 rooms connected directly by straight tunnels there might be other boxes than the O ones within W so we might want to move them. It is also possible the definition could be extended to dynamic 1-ways (where some of the access obstacles are boxes rather than walls, though it could be problemmatic to work out the level of occupancy by boxes of the 1-way) where other boxes than the O ones within W being moved might change access restrictions in W and it might be possible to move such boxes between different s1-door corrals. Note: it might seem redundant to use one-way access from a position where W is not "end-blocked" (the first and last OP squares are both accessible). However, consider pushing a box outside W around a push route; the box might end-block W en route, and access to change push direction might then require traversing W. This could even occur with W only containing a single s1-door, and thus not contributing external corrals. In this special case using W actually increases access for push (or pull) routes in the single corral situation envisaged for RPR(T).

Assuming that trying to push out O-boxes from W in a puzzle P directly produces deadlock, a human solver can use the 1-way access to rearrange other boxes (temporarily end-blocking W and traversing W to regain it) until clearing W becomes possible. Thus W functions as an adjunct to free space in terms of computer solving, just as arrangements with doors or multiple non-deadlocked corrals may do. I think W traversals, along with rearranging other boxes and temporarily blocking access to the end of W generalises shuffling, though I haven't got a precise algorithm yet. More work is needed on the details of s1-ways before they can be added to SOLVE. At present they could only be used in the case of single s1-doors or push-traversable s1-ways which are easy to occupy or vacate as free space O-squares in RPR(T). Even then it isn't clear how to modify free space from RFS to take account of detected s1-ways; adding the detected O-squares to free space before applying RFS is one option, if we keep the true free space (within a single corral) separate. We'd have to look at the free space arrangements with and without occupied O-squares to be able to still apply the single-corral condition. Variations to occupy and vacate s1-ways where possible with methods similar to RPR(T) are needed. A restriction to reversibly push-traversable s1-ways would be a start so that only W-exiting deadlocks are a problem: which can be solved with a free space external to W and reversibly push-routable to an end of W. This can generalize to push-traversable s1-ways, with a free space outside W pushable to the first O-square and one pushable from the last O-square.

Dynamic 1-doors, 1-ways and 1-access


We can generalize s1-doors to the dynamic case (d1-doors: arbitrary positions and some access restrictions due to boxes other than O-boxes) by deleting (but keeping track of) boxes on the squares C,O,CP,OP,E and EP before applying the s1-door conditions to the resulting position rather than the maze. Similarly with d1-ways (without additional deletions). We still won't detect all clogged-up d1-ways with extra boxes blocking the access between one d1-door and the next (deleting all boxes outside the two sets of six defining d1-squares won't work in general as we could be deleting access barriers required in the definition, but it will detect some more d1-ways) or even some d1-doors if extra boxes block the access requirements. Where an underlying d1-way is detected only with surplus boxes deleted, it creates a significant problem: how to remove the surplus boxes. Also there is the dilemma of whether it is worthwhile to do so since that would be occupying free space which is currently unused, for the marginal benefit of 1-access. Perhaps a straight generalization without box deletion is best so we can apply the methods previously described (specifically, 1-access for push routes) in the static case. Even so, detection of d1-ways (potentially with four complex checks for each non-wall square in the puzzle) could be a significant overhead we don't want at nodes in the RPR tree searches, so we might want to restrict it to the start or solved positions, if using it at all. Also, boxes determining dynamic access restrictions required for a d1-way could get moved during SOLVE without the program "knowing" - though that just makes things easier so perhaps it can be safely ignored.

Future development and problems


The biggest weakness with RPR is probably the arrangements of free squares found by RFS. Better arrangements, packing more boxes into free space, may well exist than are found, and for more difficult puzzles they may be essential to the solution. I have no workable generally-applicable alternatives at present. The next problem is that RPR relies on free squares in the first place; more difficult puzzles may require navigable positions with multiple corrals. A step towards dealing with this extra complexity is to identify and use static one-way systems - those inherent in the arrangement of rooms in the maze, as discussed earlier.

This brings me to another problem with the RFS free arrangements: in the last stage, where circuits are removed, there is an awful lot of choice of ways to do this (much of which might be relatively unimportant in terms of the number of boxes packed into free space). However, some of the choices (due to convenient ordering) are likely to facilitate shuffling better than others, since the free space squares which cut down the circuits are introducing steadily more access restrictions. Again I have no workable alternatives at present, other than the obvious: restrict the number of candidates for free squares to one within straight tunnel sections.

Another way to introduce extra corrals is to generalize the notion of free squares to include "doors" for each extra corral. When checking a square for the free square list, usually we skip it if it increases the number of corrals. But we could check if it creates just one more corral and the square the box was pulled from initially still belongs to the man access, and we have access to the opposite pull square from the last one used in the pull route. We could allow only one such "door" per extra corral, or allow an unlimited number (perhaps setting up significant problems in later matching the reverse and forward search). The idea is that we block off an area by closing the "door", but it is possible to open the door by pulling in the opposite direction later to make the arrangement accessible again. Perhaps this idea should be combined with reversibility : we require the door box to be pullable back to the goal square it started from (at the time of setting up the door, and regardless of which pull square we use to return it beyond it being an accessible one).

A third way is to introduce some exhaustive searching for small rooms where we calculate all the different arrangements. We'd have to check the arrangements were achievable somehow, perhaps by successively peeling off "doors" until we had a single corral, and then put the "best" achievable arrangement (most boxes) into RPR.

A possible way to improve free space arrangements is via "deformation" of corrals. Consider boxes on the boundary of the given man corral with a push square in a neighbouring external corral. If pushing such a box away from the man corral is possible without increasing the # of corrals (or a detectable deadlock) we can do this to open up more space in the man corral. We can do this for all suitable boxes on the boundary. Of course, this extra space comes at the cost of restricting free space in neighbouring corrals. Suppose we've applied RPRT in the forward direction and only have partial success in occupying free space and reducing external corrals. Wherever we had the most success in occupying free space, we try deforming outward (pushing) those corral(s) at the cost of neighbours with less occupied free space. I don't have a detailed algorithm but envisage adding deformation pushes as an intermediate between push routes to free squares and exhaustive search. Hopefully we can add to free space in the successful corral(s) and again hopefully that will increase the occupied free space.

Another problem with the the RPR(T) free space is it is calculated at one time for a single position (any solved position in reverse and SP in forward). As variations are applied occupying some of these a priori free space squares, the position will change dramatically with respect to the SBL boxes, some of which may have moved by a sequence of single pulls. This may open up new dynamic free space squares not calculated a priori. For example, in a classic puzzle with many goals packed tightly into a single goal area, pulling out a number of boxes might open previously occupied space in the goal area, which is unnoticed by RPR(T), but might be needed for a solution. We could simply tack on a new free space calculation at the end of RPR(T) and run it again at the end of the previous variation to utilise the goal area space. That may work in some cases but doesn't address the issue of dynamic free space squares opening up fleetingly at any stage of the reverse variation, which may need to be used in more complex puzzles. Trying to deal with this in general by adding all dynamic free space squares in with separate subtasks risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Note that the search is in the following fixed sequence (sketch of algorithm SOLVE): (a) open up to single corral with both forward and reverse variations. (b) apply reverse stacking order to find dynamic reversible region box # restrictions and relocate to match #s. (c) relocate to match forward and reverse variations into a solution. Obviously not all puzzles will divide up into this sequence neatly. Apart from one-way systems requiring external corrals, in practice the puzzle may require partial opening up, relocation, closing up, stacking and other variations in arbitrary sequence. This risks subtasks with partial success multiplying without limit and defeating the purpose of limited search.

Another problem is unmatchable forward and reverse variations (which may be related to more subtle dynamic restrictions than I have described). For example, as a human solver and setter I've often found forward and reverse variations which match exactly, except either the man is in the wrong corral or just one box is displaced one square, but requires a pull rather than a push in forward mode to match. Sometimes a little more work to the end of either forward or reverse variation produces a solution but often a completely new approach is necessary, suggesting the forward and reverse variations may be "unmatchable". Perhaps some form of machine learning is required to deal with more difficult puzzles, or perhaps progress can be made with a relatively simple parallel search of multiple subtasks with some connecting heuristics.

Personal tools