Sokoban solver "scribbles" by Florent Diedler about the Sokolution solver

From Sokoban Wiki

(Difference between revisions)

Revision as of 17:28, 23 March 2019

"Scribbles" about Sokoban solver programming by Florent Diedler, in particular about the Sokolution solver.

1 Sokolution overall structure
2 Search algorithms
3 Heuristic score
4 Heuristic update
5 Transposition table and hash function
6 Open queue
7 Goal room packing
8 Partial goal ordering
9 Pre-calculated deadlock positions
10 Pre-calculated penalty positions
11 Pre-calculated deadlock in the goal area
12 Macro moves
13 Dynamic searching of deadlocks
14 PI-Corral pruning
15 Brainstorming and ideas
16 Conclusion

Sokolution overall structure

Sokolution is written in C++, use STL library / C++14 concepts and build with MinGW 5.1 under Windows 7 / 10. I chose C++ because it is a low-level language really fast and we can have the full control of the memory thanks to pointers.

The only limit of my solver is the number of floor squares for a given level. Sokolution can solve *only* levels with less than 256 floor squares (walls do not count). This implies that the maximum number of boxes is 255. Then, Sokolution can solve levels up to 100 for width and 100 for height. I choose this compromise because it is a limitation of bitsets, a binary structure in C++. It also allows storing all positions in only one byte which is great for the memory.

Sokolution is compatible with host programs like Sokoban++ or YASC. The plugin version is just a DLL file that we need to place in the "Plugin" folder of the host program. There is no external dependencies.

Sokolution is the best solver because everything is really optimized in memory and for speed. I spent almost a year to profile and optimize all bottlenecks. All areas are "binarized" which means that each square is represented as a bit in memory. Because I use one byte to store the player position, the position can only vary from 0 to 255 included. This is the main drawback of my solver but it seems nonetheless worthwhile.

Search algorithms

Sokolution can use several algorithms according to your needs :

Optimal algorithms : BFS, IDDFS, A* and IDA
Non-optimal algorithms : Greedy, DFS

Each algorithm can use the FORWARD mode or the BACKWARD mode and some special modes can be used.

The forward mode consists to start from the initial state and try to find a sequence of pushes to put all boxes on goals. This is the natural mode to solve Sokoban problems.

The backward mode is the opposite of the forward mode. We start from the solution (all boxes are on goals) and we pull boxes in order to find the initial position. Note that the final player position should be able to reach its initial position.

The bidirectional mode is a combination of forward and backward modes. A thread will perform a forward search and another thread will perform the backward search. There is no check when the two searches overlap.

By default, Sokolution uses a bidirectional search with greedy algorithm in both side. This is the quickest way to find a solution.

If you want to find optimal solutions, use a bidirectional search with A* algorithm in both side.

Heuristic score

All algorithms that require a heuristic use the bipartite matching. It is a long computation which uses the Hungarian algorithm in O(n^4) in time complexity. The Hungarian has been modified a bit in order to be optimized in O(n^3).

To solve sublevels, the heuristic uses the nearest goal for each box which is really fast.

Heuristic update

Assuming that the basic heuristic is sufficient to solve a Sokoban level is an error. Even if we use a really accurate method to find a perfect bi-partite matching between boxes and goals, there are lots of situations where the basic heuristic is not sufficient. The basic heuristic is defined as the heuristic that takes no other boxes into account.

During the main search, we will find new frozen boxes on goals (otherwise it is a deadlock). Sometimes, these areas may change the heuristic score and affect the reachability of unfilled goals. To solve this problem, I recompute the heuristic after finding a frozen area. It is a long computation and we have to be smart to optimize this computation.

With this enhancement, just computing the heuristic score can lead to detect new deadlocks due to a goal that is no more accessible. It helps a lot when we have no goal ordering available.

Transposition table and hash function

A custom transposition table is created especially for the solver. It is an open-addressing hashtable with linear probing. The memory is fully controlled in order to set a memory limit for this solver without the risk of having an overflow memory.

After having implemented plenty of hash functions, I finally use the default hash function for bitset and I do a XOR with the player normalized position.

Open queue

Depending of the algorithm chosen, the open queue can be :

FIFO queue for BFS
LIFO stack for DFS and IDDFS
Priority queue for Greedy, A* and IDA

The tie-breaking for the priority queue is :

If a F-cost is defined, sort by minimum F-cost
If a H-cost is defined, sort by minimum H-cost
For same F or H costs, sort by newest node first

NB : F-cost is defined as the sum of G-cost and H-cost (F = G + H)

Goal room packing

Contrary to YASS solver, Sokolution does not use a "goal room packing" method. So it is impossible to solve XSokoban #50, XSokoban #66 and Xsokoban #69 levels that needed thsoe powerful algorithms. I think YASS shines in this domain.

However, Sokolution has an algorithm to find goal room when there is a single entrance (like many of XSokoban levels). If we are able to find an order of filling goal without created deadlocks, Sokolution will create macro moves to respect the ordering of the goal area. This mechanism speed up a lot levels with a "goal room theme".

Partial goal ordering

I think it is a real improvement for finding a non-optimal solution. It is quite impossible to guarantee optimality with a pre-calculated partial goal ordering. The aims is to find a subset of goals that should always be ordered before other goals. This subset of goals need also to be filled in a particular order.

Let's take the XSokoban #35 level as example :

First of all, there is no goal room in this level even if all goals are connected because there are two entrances. However, we know that we ought to fill the subset of goals (red rectangle) before any other goal. Otherwise we will create deadlocks due to inaccessible goals.

During the main search, we try to find macro moves in order to push a box into its final goal position which respects the partial ordering. When the subset of goals is filled, left goals are filled without any particular order.

The main drawback of this algorithm is that it does not work with a goal area that the player can cross like in XSokoban #74 because we need to park boxes or cross the goal area before pushing boxes to their final position. It also does not work for scattered goals so it is not a good candidate for backward search. For those kinds of levels, we have no choice but to disable the ordering engine.

Pre-calculated deadlock positions

There is a mechanism able to generate deadlock positions before starting the search. I called it a MPDB-X for Multiple Pattern DataBase for X boxes. The MPDB is a database that can store all deadlocks created by X boxes.

In fact, it is too costly to generate all deadlocks for X > 2. Moreover, nothing guarantees that these deadlocks are useful for the current level. So, it is better to find deadlocks dynamically during the search and store them in a database.

Pre-calculated penalty positions

There is also a mechanism able to generate positions that create penalties in term of heuristic cost before starting the search. I call it a MPPDB-X for Multiple Pattern Penalty DataBase for X boxes. The MPPDB is a database that can store all penalties created by X boxes.

Like for MPPDB-X, it is too costly to generate all penalties for X > 2. Searching penalties is slower than searching deadlocks because we need to perform an optimal search for each pattern. For small patterns, we can use a simple BFS search that is quick enough.

There is no dynamic detection of penalties during the main search.

Note that a MPPDB-2 can get all linear conflicts created by 2 boxes.

Pre-calculated deadlock in the goal area

This feature is under development and I hope it will evolve some day.

I really want to create a MPGADB for Multiple Pattern Goal Area DataBase but it is quite difficult. After seeing most of deadlock situations near or inside the goal area, I have noticed that some goals become inaccessible due to frozen boxes on goals. The heuristic can find a few of these situations like in this example :

But in this example :

The heuristic does not see that the situation is already a deadlock even if you can assign all boxes on a goal with the Hungarian method. In fact, the two red goals cannot be filled at the same time. It is a kind of bipartite deadlock but applied to goals.

The idea to solve these situations is to pre-compute what I call "a deadlock in the goal area with conditions".

Let's take the previous example. Considering this position, the goals X and Y cannot be filled together because when the goal X will be filled , the goal Y will be inaccessible and in return if the goal Y is filled, the goal X will be inaccessible. As said before, it is a sort of "bipartite" deadlock.

To detect these situations, we consider all frozen boxes on goal as walls. We start from a situation where the goal area is full and we try to escape boxes in backward mode (pulling). All boxes that cannot escape the goal area have a trouble and created the famous "condition".

Now, we can store an entry in the MPGADB {goal area considered as wall, conditions}

When we check the database, we need to check if the current goal area matches with an entry and if all goals in condition are filled. If one of them is not filled, we have a deadlock !

The downside of this database is the cost of computation. As you can guess, it is really long to pre-compute the database.

An idea can be to detect these situations during the search as for deadlocks. But I don't really know for now...

Macro moves

There are 3 kinds of macro moves implemented in Sokolution :

Macro Tunnel : Tunnels are precomputed before the search. We have tunnels called "one-way" and "two-way". One way tunnels are composed of articulation squares. An articulation square is a square that splits the maze into several parts if it is replaced by a wall. There are good candidate for PI-Corral pruning. Two way tunnels are tunnels where the player can go around it. These tunnels can be used to park temporary a box. When a box enters into a Tunnel, we can perform the macro move depending of the type and length of the tunnel. The macro tunnel does not affect the optimality of the solution.

Macro Goal : Single goal room area are precomputed before the search. When a box enters into a goal room area by its single entrance, we can push this box to its final goal position. Note that we can miss the optimality of the solution with this macro move because sometimes we need to park a box inside the goal room area in order to have the best solution.

Macro Goal Cut : When a partial ordering is available, we will generate macro moves in addition to other classical moves. The aim is to try to push a box directly to its final goal position according to the current ordering. Note that other boxes are considered as walls. These moves will have a better cost than regular moves, so they will be expanded first. For example, let's consider we have already ordered 2 goals. Before expanding children, we will check for macro moves in order to fill the third goal and so on.

Dynamic searching of deadlocks

After studying all the literature about Sokoban, this wiki and seeing how the solver ran, I figure out that detecting all kind of deadlocks is barely impossible (without having a Super Calculator, maybe...).

Sokolution can detect all *simple* deadlocks described in this wiki.

During the search, Sokolution will look about corrals that are good candidates for deadlocks. When a deadlock is found, a process of minimization is launched in order to find the smallest pattern that creates the deadlock. It will allow being more efficient when finding deadlocks during the main search. These operations are really long and when patterns have too many boxes, it can take several hours to identify the deadlock (or just saying that this pattern is not a deadlock). To avoid this, we relax the rules to detect deadlocks when analyzing a pattern. If a box is not surrounded by other boxes, we can remove it from the pattern (it probably does not belongs to the deadlock). We can miss some deadlocks but it speed up considerably the search. As another improvement, we use a very fast heuristic here : the nearest goal for each boxes.

PI-Corral pruning

First of all, thanks to Brian Damgaard and Matthias Meger to have explained this marvellous concept in the wiki.

I will not explain this concept and prefer giving the link for those of you who are interested.

In the first versions (version 1.X), Sokolution has only a basic combined PI-Corral engine. In Sokolution 2.X the PI-Corral engine has been improved a bit and can detect more PI-Corral than the strict definition.

Sokolution 3.X has a brand new PI-Corral engine. It was really hard to develop it but I am proud of expose its features :

The engine finds all simple corrals
The engine finds combined corrals formed by two simple corrals
The engine finds combined corrals formed by the player non-reachable area

Then, we analyze all corrals in order to find the best PI-Corral. The best PI-Corral is defined as :

First, PI-Corral with the more boxes on goal
Then, PI-Corral with the less boxes on its frontier
Then, PI-Corral with the smallest area

There is often PI-Corral in the goal area that creates a deadlock because one or more goals will be not accessible after resolving this PI-Corral. That is why, we prefer taking PI-Corral with the more boxes on goals against the fewer boxes on its frontier.

Note: The PI-Corral pruning is only available in forward mode. There is no PO-Corral pruning in backward mode. I tried a lot of things but no result for now.

Pi-Corral enhancement

Searching Pi-Corral allows finding some useful information about the disposition of boxes on the board. Sometimes, we discover a PI-Corral with all frontier boxes on goals and with no legal move available.

Let's take the XSokoban #41 level :

In this level, red boxes form a valid PI-Corral. All boxes on its frontier are reachable for the player but we cannot open the corral because there are two dead-squares inside. What can we conclude ? It is clear that frontier boxes are frozen (on goals otherwise it is a deadlock). Thus, I can improve the heuristic score with this new frozen area.

To say that frontier boxes are frozen, we should be sure that the corral cannot be opened. We can guess two simple conditions :

all inside corral squares are dead
there is less than 5 squares inside the corral

Perhaps, we can find other conditions but they are so many things to do with Sokoban that we cannot do all at the same time !

Brainstorming and ideas

To go further and try to solve more difficult levels, I thought about some ideas that can be good to investigate :

Performing a bidirectional search with frontier merging : The aim of this search is to perform a forward search in a thread and a backward search in another thread. The backward search try to find a node that belongs to the forward search. If we can find such a node, we have found a solution because forward and backward frontiers merged.
Finding new deadlocks in the goal area. Instead of finding deadlocks outside the goal area, try to find deadlocks that create one or more inaccessible goals. It is the idea developed in the Pre-calculated deadlock in the goal area section of this article.
Improve the backward search by finding something similar to PI-Corral. I tried a lot of things with no success.
Improve the backward search by finding new kind of deadlocks.
Implement a reordering of the goal area when possible. By example, the XSokoban #87 is a good candidate.
Split a difficult level into small parts (divide and conquer !). By example, the Kenyam #20 is a good candidate because the center area is composed of deadsquares, so we can split the maze into 4 small parts that are really easy to solve.
Improve the penalty engine by finding complex penalties that involve more than 2 boxes.

Conclusion

I worked on Sokolution since January 2013, the project is almost finished because I don't want to have a burn-out. It was really hard to be focused on it during the last 5 years working sometimes until 3 a.m to debug a new feature or waiting for the end of a bench. It was a real pleasure to work on this subject and exchange information with Brian and Matthias.

Even if I have a lack of self-confidence, I really don't realize that I have created the best solver of the world in the Sokoban world. I think I am proud of me even if very few people know this domain.

That's all ! I revealed all features inside Sokolution. I hope it will help other solvers to be more powerful and beat Sokolution :)