Sokoban solver "scribbles" by Florent Diedler about the Sokolution solver
From Sokoban Wiki
"Scribbles" about Sokoban solver programming by Florent Diedler, in particular about the Sokolution solver.
Sokolution overall structure
Sokolution is written in C++, uses STL library / C++14 concepts and is built with MinGW 5.1 under Windows 7 / 10. I chose C++ because it is a very fast low-level language really fast and we can have the full control of the memory thanks to the pointers.
The only limit of my solver is the number of floor squares for a given level. Sokolution can solve *only* levels with less than 256 floor squares (walls do not count). This implies that the maximum number of boxes is 255. Then, Sokolution can solve levels up to 100 for width and 100 for height. I choose this compromise because it is a limitation of bitsets, a binary structure in C++. It also allows all positions to be stored in only one byte which is great for the memory.
Sokolution is compatible with host programs like Sokoban++ or YASC. The plugin version is just a DLL file that we need to place into the "Plugin" folder of the host program. There are no external dependencies.
Sokolution is the best solver because everything is really optimized in memory and speed. I spent almost a year to profile and optimize all bottlenecks. All areas are "binarized" which means that each square is represented as a bit in memory. Because I use one byte to store the player position, the position can only vary from 0 to 255 included. This is the main drawback of my solver but it seems nonetheless worthwhile.
Search algorithms
Sokolution can use several algorithms according to your needs :
- Optimal algorithms : BFS, IDDFS, A* and IDA
- Non-optimal algorithms : Greedy, DFS
Each algorithm can use the FORWARD mode or the BACKWARD mode and some special modes can be used.
The forward mode consists to start from the initial state and try to find a sequence of pushes to put all boxes on goals. This is the natural mode to solve Sokoban problems.
The backward mode is the opposite of the forward mode. We start from the solution (all boxes are on goals) and we pull boxes in order to find the initial position. Note that the final player position should be able to reach its initial position.
The bidirectional mode is a combination of forward and backward modes. A thread will perform the forward search and another thread will perform the backward search. There is no check when the two searches overlap.
By default, Sokolution uses a bidirectional search with greedy algorithm in both side. This is the quickest way to find a solution.
If you want to find optimal solutions, use a bidirectional search with A* algorithm in both side.
Heuristic score
All algorithms that require a heuristic use the bipartite matching. It is a long computation which uses the Hungarian algorithm in O(n^4) in time complexity. The Hungarian one has been modified a bit in order to be optimized in O(n^3).
To solve sublevels, the heuristic uses the nearest goal for each box which is really fast.
Heuristic update
Assuming that the basic heuristic is sufficient to solve a Sokoban level is an error. Even if we use a really accurate method to find a perfect bipartite matching between boxes and goals, there are lots of situations where the basic heuristic is not sufficient. The basic heuristic is defined as the heuristic that does not take into account any other boxes.
During the main search, we will find new frozen boxes on goals (otherwise it is a deadlock). Sometimes, these areas may change the heuristic score and affect the reachability of unfilled goals. To solve this problem, I recompute the heuristic after finding a frozen area. It is a long computation and we have to be smart to optimize this computation.
With this enhancement, just computing the heuristic score can lead to detect new deadlocks due to a goal that is no more accessible. It helps a lot when we have no goal ordering available.
Transposition table and hash function
A custom transposition table is created especially for the solver. It is an open-addressing hashtable with linear probing. The memory is fully controlled in order to set a memory limit for this solver without the risk of having an overflow memory.
After having implemented plenty of hash functions, I finally use the default hash function for bitset and I do a XOR with the player normalized position.
Open queue
Depending of the algorithm chosen, the open queue can be :
- FIFO queue for BFS
- LIFO stack for DFS and IDDFS
- Priority queue for Greedy, A* and IDA
The tie-breaking for the priority queue is :
- If a F-cost is defined, sort by minimum F-cost
- If a H-cost is defined, sort by minimum H-cost
- For same F or H costs, sort by newest node first
NB : F-cost is defined as the sum of G-cost and H-cost (F = G + H)
Goal room packing
Contrary to YASS solver, Sokolution does not use a "goal room packing" method. So it is impossible to solve XSokoban #50, XSokoban #66 and Xsokoban #69 levels that needed those powerful algorithms. I think YASS shines in this domain.
However, Sokolution has an algorithm to find goal room when there is a single entrance (like many of XSokoban levels). If we are able to find an order of filling goal without creating deadlocks, Sokolution will create macro moves to respect the ordering of the goal area. This mechanism speeds up a lot levels with a "goal room theme".
Partial goal ordering
I think this is a real improvement in finding a non-optimal solution. It is quite impossible to guarantee optimality with a pre-calculated partial goal ordering. The aims is to find a subset of goals that should always be ordered before other goals. This subset of goals need also to be filled in a particular order.
Let's take the XSokoban #35 level as example :
First of all, there is no goal room in this level even if all goals are connected because there are two entrances. However, we know that we ought to fill the subset of goals (red rectangle) before any other goal. Otherwise we will create deadlocks due to inaccessible goals.
During the main search, we try to find macro moves in order to push a box into its final goal position which respects the partial ordering. When the subset of goals is filled, left goals are filled without any particular order.
The main drawback of this algorithm is that it does not work with a goal area that the player can cross like in XSokoban #74 because we need to park boxes or cross the goal area before pushing boxes to their final position. It also does not work for scattered goals so it is not a good candidate for backward search. For those kinds of levels, we have no choice but to disable the ordering engine.
Pre-calculated deadlock positions
There is a mechanism able to generate deadlock positions before starting the search. I called it a MPDB-X for Multiple Pattern DataBase for X boxes. The MPDB is a database that can store all deadlocks created by X boxes.
However, it is too costly to generate all deadlocks for X > 2. Moreover, nothing guarantees that these deadlocks are useful for the current level. So, it is better to find deadlocks dynamically during the search and store them in a database.
Pre-calculated penalty positions
There is also a mechanism able to generate positions that creates penalties in term of heuristic cost before starting the search. I call it a MPPDB-X for Multiple Pattern Penalty DataBase for X boxes. The MPPDB is a database that can store all penalties created by X boxes.
Like the MPPDB-X, it is too costly to generate all penalties for X > 2. Searching penalties is slower than searching deadlocks because we need to perform an optimal search for each pattern. For small patterns, we can use a simple BFS search that is quick enough.
There is no dynamic detection of penalties during the main search.
Note that a MPPDB-2 can get all linear conflicts created by 2 boxes.
Pre-calculated deadlock in the goal area
This feature is under development and I hope it will evolve some day.
I really want to create a MPGADB for Multiple Pattern Goal Area DataBase but it is quite difficult. After seeing most of deadlock situations near or inside the goal area, I have noticed that some goals become inaccessible due to frozen boxes on goals. The heuristic can find a few of these situations like in this example :
But in this example :
The heuristic does not see that the situation is already a deadlock even if you can assign all boxes on a goal with the Hungarian method. In fact, the two red goals cannot be filled at the same time. It is a kind of bipartite deadlock but applied to goals.
The idea to solve these situations is to pre-compute what I call "a deadlock in the goal area with conditions".
Let's take the previous example. Considering this position, the goals X and Y cannot be filled together because when the goal X will be filled , the goal Y will be inaccessible and in return if the goal Y is filled, the goal X will be inaccessible. As said before, it is a sort of "bipartite" deadlock.
To detect these situations, we consider all frozen boxes on goal as walls. We start from a situation where the goal area is full and we try to escape boxes in backward mode (pulling). All boxes that cannot escape the goal area have a trouble and created the famous "condition".
Now, we can store an entry in the MPGADB {goal area considered as wall, conditions}
When we check the database, we need to check if the current goal area matches with an entry and if all goals in condition are filled. If one of them is not filled, we have a deadlock !
The downside of this database is the cost of computation. As you can guess, it is really long to pre-compute the database.
An idea can be to detect these situations during the search as for deadlocks. But I don't really know for now...
Macro moves
There are 3 kinds of macro moves implemented in Sokolution :
- Macro Tunnel : Tunnels are precomputed before the search. We have tunnels called "one-way" and "two-way". One way tunnels are composed of articulation squares. An articulation square is a square that splits the maze into several parts if it is replaced by a wall. There are good candidate for PI-Corral pruning. Two way tunnels are tunnels where the player can go around it. These tunnels can be used to park temporary a box. When a box enters into a Tunnel, we can perform the macro move depending of the type and length of the tunnel. The macro tunnel does not affect the optimality of the solution.
- Macro Goal : Single goal room area are precomputed before the search. When a box enters into a goal room area by its single entrance, we can push this box to its final goal position. Note that we can miss the optimality of the solution with this macro move because sometimes we need to park a box inside the goal room area in order to have the best solution.
- Macro Goal Cut : When a partial ordering is available, we will generate macro moves in addition to other classical moves. The aim is to try to push a box directly to its final goal position according to the current ordering. Note that other boxes are considered as walls. These moves will have a better cost than regular moves, so they will be expanded first. For example, let's consider we have already ordered 2 goals. Before expanding children, we will check for macro moves in order to fill the third goal and so on.
Dynamic searching of deadlocks
After studying all the literature about Sokoban, this wiki and seeing how the solver ran, I figure out that detecting all kind of deadlocks is barely impossible (without having a Super Calculator, maybe...).
Sokolution can detect all *simple* deadlocks described in this wiki.
During the search, Sokolution will look about corrals that are good candidates for deadlocks. When a deadlock is found, a process of minimization is launched in order to find the smallest pattern that creates the deadlock. It will allow being more efficient when finding deadlocks during the main search. These operations are really long and when patterns have too many boxes, it can take several hours to identify the deadlock (or just saying that this pattern is not a deadlock). To avoid this, we relax the rules to detect deadlocks when analyzing a pattern. If a box is not surrounded by other boxes, we can remove it from the pattern (it probably does not belongs to the deadlock). We can miss some deadlocks but it speed up considerably the search. As another improvement, we use a very fast heuristic here : the nearest goal for each boxes.
PI-Corral pruning
First of all, thanks to Brian Damgaard and Matthias Meger to have explained this marvellous concept in the wiki.
I will not explain this concept and prefer giving the link for those of you who are interested.
In the first versions (version 1.X), Sokolution has only a basic combined PI-Corral engine. In Sokolution 2.X the PI-Corral engine has been improved a bit and can detect more PI-Corral than the strict definition.
Sokolution 3.X has a brand new PI-Corral engine. It was really hard to develop it but I am proud of expose its features :
- The engine finds all simple corrals
- The engine finds combined corrals formed by two simple corrals
- The engine finds combined corrals formed by the player non-reachable area
Then, we analyze all corrals in order to find the best PI-Corral. The best PI-Corral is defined as :
- First, PI-Corral with the more boxes on goal
- Then, PI-Corral with the less boxes on its frontier
- Then, PI-Corral with the smallest area
There is often PI-Corral in the goal area that creates a deadlock because one or more goals will be not accessible after resolving this PI-Corral. That is why, we prefer taking PI-Corral with the more boxes on goals against the fewer boxes on its frontier.
Note: The PI-Corral pruning is only available in forward mode. There is no PO-Corral pruning in backward mode. I tried a lot of things but no result for now.
Pi-Corral enhancement
Searching Pi-Corral allows finding some useful information about the disposition of boxes on the board. Sometimes, we discover a PI-Corral with all frontier boxes on goals and with no legal move available.
Let's take the XSokoban #41 level :
In this level, red boxes form a valid PI-Corral. All boxes on its frontier are reachable for the player but we cannot open the corral because there are two dead-squares inside. What can we conclude ? It is clear that frontier boxes are frozen (on goals otherwise it is a deadlock). Thus, I can improve the heuristic score with this new frozen area.
To say that frontier boxes are frozen, we should be sure that the corral cannot be opened. We can guess two simple conditions :
- all inside corral squares are dead
- there is less than 5 squares inside the corral
Perhaps, we can find other conditions but they are so many things to do with Sokoban that we cannot do all at the same time !
Brainstorming and ideas
To go further and try to solve more difficult levels, I thought about some ideas that can be good to investigate :
- Performing a bidirectional search with frontier merging : The aim of this search is to perform a forward search in a thread and a backward search in another thread. The backward search try to find a node that belongs to the forward search. If we can find such a node, we have found a solution because forward and backward frontiers merged.
- Finding new deadlocks in the goal area. Instead of finding deadlocks outside the goal area, try to find deadlocks that create one or more inaccessible goals. It is the idea developed in the Pre-calculated deadlock in the goal area section of this article.
- Improve the backward search by finding something similar to PI-Corral. I tried a lot of things with no success.
- Improve the backward search by finding new kind of deadlocks.
- Implement a reordering of the goal area when possible. By example, the XSokoban #87 is a good candidate.
- Split a difficult level into small parts (divide and conquer !). By example, the Kenyam #20 is a good candidate because the center area is composed of deadsquares, so we can split the maze into 4 small parts that are really easy to solve.
- Improve the penalty engine by finding complex penalties that involve more than 2 boxes.
Conclusion
I worked on Sokolution since January 2013, the project is almost finished because I don't want to have a burn-out. It was really hard to be focused on it during the last 5 years working sometimes until 3 a.m to debug a new feature or waiting for the end of a bench. It was a real pleasure to work on this subject and exchange information with Brian and Matthias.
Even if I have a lack of self-confidence, I really don't realize that I have created the best solver of the world in the Sokoban world. I think I am proud of me even if very few people know this domain.
That's all ! I revealed all features inside Sokolution. I hope it will help other solvers to be more powerful and beat Sokolution :)