Heiner's solver
From Sokoban Wiki
I'm Heiner, and have written a "brute force" sokoban solver. Here I want to tell about it.
Contents |
How it started
The "initial ignition" was an idea about "deadlocks", i.e. their detection. Somewhere I had seen a hand written list of small patterns, used for deadlock detection. That appeared to me to be the wrong way. And I had an idea how to compute a certain class of deadlocks from basic principles. So I started to write a simple sokoban solver in Tcl. That was in 2007.
Later (still in 2007) I changed to ANSI-C, since Tcl wasn't fast enough, any more, and also used too much memory... my method needs large amounts of memory.
But the Tcl prototype was a good exercise to prepare the C program.
Legend
The terms I currently use, may be a bit non-standard. I found this Wiki in late 2010, and start to recognize that others use different words. While I have not switched, this is what I used up to now:
- actor: the moving "man", the warehouse keeper, the "
@
" - packet: the moved objects (box), the "
$
" - harbor: the packet destinations (goal), the "
.
" - configuration: a complete description of a situation: location of actor and of all the packets. What cannot be changed by "playing the game", is not considered part of a configuration, but is stored elsewhere, e.g. walls and harbor locations.
The basic solver method
I use the most simple method, not specific to sokoban. We have a starting configuration, a terminal configuration, and a definition of legal moves, which allows writing a function which maps any configuration to the set of legal successors.
This setup is enough for writing a simple puzzle solver, that searches for the shortest solution path.
We have a collection of pairs (configuration, distance), e.g. implemented as a hash table. We enter the pair (start-config, 0). Then, for increasing distance d, from all entries with distance d we generate all immediate successors, and those, which are not yet known, are entered with distance d+1. Each new entry is checked, whether it is the seeked for terminal configuration.
Once the terminal configuration is entered, we can stop the above, and extract a shortest solution path, starting with the terminal configuration, and searching for an immediate predecessor with a distance 1 smaller, and so on.
What is a "move"
I choose a "move" to be an arbitrary actor walk followed by exactly 1 push action. That way I generate push-optimal solutions. I could have chosen to make every small step of the actor a "move", and would get move-optimal solutions. The drawback would be the need of much higher distances, and to generate many more configurations.
Counting pushes instead of moves appeared to me to be the most natural measure. Your mileage may vary.
Solving forward
The above description is the forward method. It is the most natural thing to do, but it is not the only way, not even the best one...
Solving backwards
The attentive reader already noted, that the solution path extraction already needs an inverse move generator, generating immediate predecessor configurations. For sokoban this is not really hard to write.
Why shouldn't we start with the terminal configuration, and generate their predecessors, until we find the initial configuration? Nothing can stop us, it is a perfect mirror image of the above method, except... we do not generate the exact same set of configurations.
I did some experiments, and learned, that often the backward method generated less configurations than the forward method, but not always.
While contemplating how to decide for the best of both, I found a third method...
Solving mixed
In a sense this is very simple. We start solving forward and backwards, in separate collections (hash tables), one distance step at a time. And we check the new entries in one collection also against the other collection, and when we find a match, we have a solution, of minimal total distance.
Since these search trees tend to have much more entries for the larger distances, we now have used considerably less memory. That is quite important, since our method is clearly memory bound.
We still can improve this: instead of using a fixed alternation between forward and backward steps, we can do the next distance increasing step on that collection, where the front (the entries with the maximal distance) is smaller. The size of the front is a good (relative) estimate of the next front.
The extraction of a shortest solution path is not exactly trivial anymore,
but it is not really hard, either.
Details are left as an exercise ;-)
That is what I currently use (Dec-2010). Maybe everybody knew this already, but for me it was a genuine invention.
Computing deadlocks
Up to now, we have used very little knowledge about sokoban. We have used the rules by writing 2 functions, computing immediate successor and predecessor configurations.
Now we use another fact / observation: when we remove a packet from a configuration, and generate successor or predecessor configurations, we
- loose all those configurations, which moved the removed packet
- but we do still have all other configurations (now with one packet less)
- and we may get some more configurations, which were not legal before removing that packet.
The important point is 2. In the game Atomix e.g. it would not be valid, but in Sokoban the other packets only can hurt the movement of a packet, they cannot help to find more legal movements.
Computing with a reduced set of packets we get a kind of upper bound for what is possible.
Let us make an example: say, we start with any packet from the starting configuration, but only with that single packet, and generate all successors, and their successors and so on, until we get no more configurations, then we have a collection of all possible packet locations. The full game will never be able to push a packet to any location not found in this collection.
So what? Read on...
Let us do the same in the opposite direction, backwards, starting with single packets on any of the harbors.
Now we have a collection of reachable, and a collection of solvable configurations. Only those, which are in both these collections have a chance to be part of a solution. All others are bad and should be avoided.
If we remember the results of this computation, we can check every generated new configuration, whether it contains any packet on a location, which is "bad", and immediately forget about it.
Dead locations
What I have described above, is a way to recognize all "dead locations", i.e. locations which are immediate traps for a packet... and some not so obvious ones.
Instead of hard coding some special rules to find "dead locations" we have computed them from basic principles, what esthetically pleases me.
Generalizing to more packets
The above method can be generalized to using k of the packets, instead of just 1. I named this the k-packet-game or k-PG for short.
- Generate all subsets of cardinality k of the initial packets (actor as initial)
- Completely "solve" this forwards, do not stop until no more legal configurations can be generated
- Generate all subsets of cardinality k of the harbors as packets (all possible actor locations)
- Completely "solve" this backwards, do not stop until no more legal configurations can be generated
- Compute a compact representation of all configurations, which are in one of theses collections, but not in the other
Then, use this compact data to check every generated configuration with k or more packets, whether it contains such a subset of k packets, and ignore it, as it cannot possibly be part of any solution.
When playing the k+1-packet-game, we already use the results of the k-packet-game. That can help to reduce the total amount of memory, sometimes significantly.
Checking for subsets against a large amount of such candidates is a bit tricky to do efficiently.
With increasing k this becomes increasingly memory demanding. And for some levels the results are not so good, but for some levels this makes a huge difference.
What else?
Well, now you know most secrets about my sokoban solver. Of course, I have lots of ideas, what I still want to do. Some examples are:
- using bipartite matching over the reachability results from the 1-PG
- implementing less memory demanding encodings of the "collections" of (configuration,distance) pairs
- tunnel macros
- symmetry
- compute move-optimal solution inside push-optimal options