Sokoban solver "scribbles" by Florent Diedler about the Sokolution solver

From Sokoban Wiki

(Difference between revisions)

Revision as of 22:45, 12 March 2019

"Scribbles" about Sokoban solver programming by Florent Diedler, in particular about the Sokolution solver.

1 Sokolution overall structure
2 Search algorithms
3 Heuristic score
4 Heuristic update
5 Transposition table and hash function
6 Open queue
7 Goal room packing
8 Goal partial ordering
9 Pre-calculated deadlock positions
10 Pre-calculated penalty positions
11 Pre-calculated deadlock in the goal area
12 Macro moves
13 Dynamic searching for deadlocks
14 PI-Corral pruning
15 Conclusion

Sokolution overall structure

Sokolution is written in C++, use STL library / C++14 concepts and build with MinGW 5.1 under WIndows 7 / 10. I choose C++ because it is a low-level language really fast and we can have the full control of the memory thanks to pointers.

The only limit for my solver is the number of floor squares for a given level. Sokolution can solve *only* levels wih less than 256 floor squares (walls does not count). This implies that the maximum number of boxes is 255. Then, Sokolution can solve levels up to 100 for width and 100 for height. I choose this compromise because it is a limitation of bitsets, a binary structure in C++. It also allows to store all positions in only one byte and that is great for the memory.

Sokolution is compatible with host programs like Sokoban++ or YASC. The plugin version is just a DLL file that we need to place in the "Plugin" folder of the host program. There is no external dependencies.

Sokolution is the best solver because everything is really optimized in memory and for speed. It took for me almost one year to profile and optimize all bottlenecks. All areas are "binarized" that means each square is represented as a bit in memory. Because I use one byte to store the player position, the position can only vary from 0 to 255 included. This is the main drawback of my solver but that seems worthwhile.

Search algorithms

Sokolution can use several algorithms according to your needs :

Optimal algorithms : BFS, IDDFS, A* and IDA
Non-optimal algorithms : Greedy, DFS

Each algorithm can use the FORWARD mode or the BACKWARD mode and some special modes can be used.

The forward mode consists to start from the initial state and try to find a sequence of pushes to put all boxes on goals. This is the natural mode for solving Sokoban problems.

The backward mode is the opposite of the forward mode. We start from the solution (all boxes are on goals) and we pull boxes in order to find the initial position. Note that the final player position should be able to reach its initial position.

The bidirectional mode is a combination of forward and backward modes. A thread will perform a forward search and another thread will perform the backward search. There is no check when the two searches overlaps.

By default, Sokolution uses a bidirectional search with greedy algorithm in both side. This is the best combination for finding a solution quickly.

If you want to find optimal solutions, use a bidirectional search with A* algorithm in both side.

Heuristic score

All algorithms that needs an heuristic use the bipartite matching. It is a long computation that uses the Hungarian algorithm in O(n^4) in time complexity. The Hunguarian has been modified a bit in order to be optimized in O(n^3).

For solving sublevels, the heuristic uses the nearest goal for each box and it is really fast.

Heuristic update

Assuming that the basic heuristic is enough for solving a Sokoban level is an error. Even if we use a really accurate method the find a perfect bi-partite matching between boxes and goals, there are lots of situations where the basic heuristic is not sufficient. The basic heuristic is defined as the heuristic that takes no other boxes into account.

During the main search, we will find new frozen boxes on goals (otherwise it is a deadlock). Sometimes, these areas may change the heuristic score and affect the reachability of unfilled goals. To solve this problem, I recompute the heuristic after finding a frozen area. It is a long computation and we have to be smart to optimize this computation.

With this enhancement, just computing the heuristic score can lead to detect new deadlocks due to a goal that is no more accessible. It helps a lot when we have no goal ordering available.

Transposition table and hash function

A custom transposition table is created specialy for the solver. It is an open-addressing hashtable with linear probing. The memory is fully controled in order to set a memory limit for this solver without the risk of having an overflow memory.

After having implement plenty of hash functions, I finally use the default hash function for bitset and I do a XOR with the player normalized position.

Open queue

Depending of the algorithm choosen, the open queue can be :

FIFO queue for BFS
LIFO stack for DFS and IDDFS
Priority queue for Greedy, A* and IDA

The tie-breaking for the priority queue is :

If a F-cost is defined, sort by minimum F-cost
If a H-cost is defined, sort by minimum H-cost
For same F or H costs, sort by newest node first

NB : F-cost is defined as the sum of G-cost and H-cost (F = G + H)

Goal room packing

Contrary to YASS solver, Sokolution does not use a "goal room packing" method. So it is impossible to solve XSokoban #50, XSokoban #66 and Xsokoban #69 levels that needed these powerful algorithms. I think YASS shines in this domain.

However, Sokolution has an algorithm to find goal room when there is a single entrance (like many of XSokoban levels). If we are able to find an order of filling goal without created deadlocks, Sokolution will create macro moves to respect the ordering of the goal area. This mechanism speed up a lot levels with a "goal room theme".

Goal partial ordering

I think it is a real improvement for finding a non-optimal solution. It is quite impossible to guarantee optimality with a pre-calculated partial goal ordering. The aims is to find a subset of goals that we should always ordered before other goals. This subset of goals need also to be filled in a particular order.

Let's take the XSokoban #35 level :

First of all, there is no goal room in this level even if all goals are connected because there are two entrances. However, we know that we ought to fill the subset of goals (red rectangle) before other goals, otherwise we will create deadlocks due to inaccessible goals.

During the main search, we will try to find macro moves in order to push a box into its final goal position which respects the partial ordering. When the subset of goals is filled, left goals are filled without any particular order.

The main drawback of this algorithm is that it does not work with a goal area that the player can cross like in XSokoban #74 because we need to park boxes or cross the goal area before pushing boxes to their final position. It also does not work for scattered goals so it is not a good candidate for backward search. For these kind of levels, we have no choice and we should disabled the ordering engine.

Pre-calculated deadlock positions

There is a mechanism able to generate deadlock positions before starting the search. I called it a MPDB-X for Multiple Pattern DataBase for X boxes. The MPDB is a database that can store all deadlocks created by X boxes.

In fact, it will be too cosly to generate all deadlocks for X > 2. Moreover, nothing guarantees that these deadlocks are useful for the current level. So, it is better to find deadlocks dynamically during the search and store them in a database.

Pre-calculated penalty positions

There is also a mechanism able to generate positions that create penalties in term of heuristic cost before starting the search. I call it a MPPDB-X for Multiple Pattern Penalty DataBase for X boxes. The MPPDB is a database that can store all penalties created by X boxes.

Like for MPPDB-X, it is too cosly to generate all penalties for X > 2. Searching penalties are slower than searching deadlocks because we need to perform an optimal search for each pattern. For small patterns, we can use a simple BFS search that is quick enough.

There is no dynamic detection of penalties during the main search.

Note that a MPPDB-2 can get all linear conflict created by 2 boxes.

Pre-calculated deadlock in the goal area

This feature is under development and I hope will evolve some day.

I really want to create a MPGADB for Multiple Pattern Goal Area DataBase but it is quite difficult. After seeing most of deadlock situations near or inside the goal area, I have noticed some goals become inaccessible due to frozen boxes on goals. The heuristic can find a few of these situations like in this example :

But in this example :

The heuristic does not see that the situation is already a deadlock even if you can assign all boxes on a goal with the Hungarian method. In fact, the two red goals cannot be filled at the same time. It is a kind of bipartite deadlock but applied to goals.

The idea to solve these situations is to pre-compute that I called "a deadlock in the goal area with conditions".

Let's take the previous example and considering this position, the goals X and Y cannot be filled together because when the goal X will be filled , the goal Y will be inaccessible and in return if the goal Y is filled, the goal X will be inaccessible. As sais before, it is a sort of "bipartite" deadlock.

To detect these situations, we consider all frozen boxes on goal as walls. We start from a situation where the goal area is full and we try to escape boxes in backward mode (pulling). All boxes that cannot escape the goal area have a trouble and created the famous "condition".

Now, we can store an entry in the MPGADB {goal area considered as wall, conditions}

When we check the database, we need to check if the current goal area match with an entry and if all goals in condition are filled. If one of them is not filled, we have a deadlock !

The downside of this database is the cost of computation. As you can guess, it is very very long to pre-compute the database.

An idea can be to detect these situation during the search as for deadlocks. But I don't really know for now...

Macro moves

There are 3 kind of macro moves implemented in Sokolution :

Macro Tunnel : Tunnels are precomputed before the search. We have tunnels called "one way" and "two way". One way tunnels are composed of articulation squares. An articulation square is a square that splits the maze into several parts if it is remplaced by a wall. There are good candidate for PI-Corral pruning. Two way tunnels are tunnels where the player can go around it. These tunnels can be used to park temporary a box. When a box enters into a Tunnel, we can perform the macro move depending of the type and length of the tunnel. Macro tunnels does not affect the optimality of the solution.

Macro Goal : Single goal room area are precomputed before the search. When a box enters into a goal room area by its single entrance, we can push this box to its final goal position. Note that we can miss the optimality of the solution with this macro move because sometimes we need to park a box inside the goal room area in order to have the best solution.

Macro Goal Cut : When a partial ordering is available, we will generate macro moves in addition to other classical moves. The aim is to try to push a box directly to its final goal position according to the current ordering. Note that other boxes are considered as walls. These moves will have a better cost than regular moves, so they will be expanded first. For example, let's consider we have already ordered 2 goals. Before expanding childs, we will check for macro moves in order to fill the third goal and so on.

Sokoban solver "scribbles" by Florent Diedler about the Sokolution solver

From Sokoban Wiki

Revision as of 22:45, 12 March 2019

Contents

Sokolution overall structure

Search algorithms

Heuristic score

Heuristic update

Transposition table and hash function

Open queue

Goal room packing

Goal partial ordering

Pre-calculated deadlock positions

Pre-calculated penalty positions

Pre-calculated deadlock in the goal area

Macro moves

Dynamic searching for deadlocks

PI-Corral pruning

Conclusion

Views

Personal tools

Navigation

Search

Toolbox

@@ Line 73: / Line 73: @@
 == Goal partial ordering ==
-I think it is a real improvement for finding a non-optimal solution. It is quite impossible to guarantee optimality with a pre-calculated partial goal ordering.
+I think it is a real improvement for finding a non-optimal solution. It is quite impossible to guarantee optimality with a pre-calculated partial goal ordering. The aims is to find a subset of goals that we should always ordered before other goals. This subset of goals need also to be filled in a particular order.
-The aims is to find a subset of goals that we should always ordered before other goals. This subset of goals need also to be filled in a particular order.
+Let's take the XSokoban #35 level :
+[[ File:Xsokoban 35.png|200px ]]
+First of all, there is no goal room in this level even if all goals are connected because there are two entrances. However, we know that we ought to fill the subset of goals (red rectangle) before other goals, otherwise we will create deadlocks due to inaccessible goals.
+[[ File:Xsokoban 35 partial ordering.png|200px ]]
+During the main search, we will try to find macro moves in order to push a box into its final goal position '''which respects the partial ordering'''. When the subset of goals is filled, left goals are filled without any particular order.
+The main drawback of this algorithm is that it does not work with a goal area that the player can cross like in XSokoban #74 because we need to park boxes or cross the goal area before pushing boxes to their final position. It also does not work for scattered goals so it is not a good candidate for backward search.  For these kind of levels, we have no choice and we should disabled the ordering engine.
 == Pre-calculated deadlock positions ==
+There is a mechanism able to generate deadlock positions before starting the search. I called it a MPDB-X for Multiple Pattern DataBase for X boxes. The MPDB is a database that can store all deadlocks created by X boxes.
+In fact, it will be too cosly to generate all deadlocks for X > 2. Moreover, nothing guarantees that these deadlocks are useful for the current level. So, it is better to find deadlocks dynamically during the search and store them in a database.
 == Pre-calculated penalty positions ==
+There is also a mechanism able to generate positions that create penalties in term of heuristic cost before starting the search. I call it a MPPDB-X for Multiple Pattern Penalty DataBase for X boxes. The MPPDB is a database that can store all penalties created by X boxes.
+Like for MPPDB-X, it is too cosly to generate all penalties for X > 2. Searching penalties are slower than searching deadlocks because we need to perform an optimal search for each pattern. For small patterns, we can use a simple BFS search that is quick enough.
+There is no dynamic detection of penalties during the main search.
+Note that a MPPDB-2 can get all linear conflict created by 2 boxes.
 == Pre-calculated deadlock in the goal area ==
+This feature is under development and I hope will evolve some day.
+I really want to create a MPGADB for Multiple Pattern Goal Area DataBase but it is quite difficult. After seeing most of deadlock situations near or inside the goal area, I have noticed some goals become inaccessible due to frozen boxes on goals. The heuristic can find a few of these situations like in this example :
+[[ File:Xsokoban 16 frozen heuristic.png|200px ]]
+But in this example :
+[[ File:Xsokoban 45 deadlock.png|200px ]]
+The heuristic does not see that the situation is already a deadlock even if you can assign all boxes on a goal with the Hungarian method. In fact, the two red goals cannot be filled at the same time. It is a kind of bipartite deadlock but applied to goals.
+The idea to solve these situations is to pre-compute that I called "a deadlock in the goal area with conditions".
+Let's take the previous example and considering this position, the goals X and Y cannot be filled together because when the goal X will be filled , the goal Y will be inaccessible and in return if the goal Y is filled, the goal X will be inaccessible. As sais before, it is a sort of "bipartite" deadlock.
+To detect these situations, we consider all frozen boxes on goal as walls. We start from a situation where the goal area is full and we try to escape boxes in backward mode (pulling). All boxes that cannot escape the goal area have a trouble and created the famous "condition".
+Now, we can store an entry in the MPGADB {goal area considered as wall, conditions}
+When we check the database, we need to check if the current goal area match with an entry and if all goals in condition are filled. If one of them is not filled, we have a deadlock !
+The downside of this database is the cost of computation. As you can guess, it is very very long to pre-compute the database.
+An idea can be to detect these situation during the search as for deadlocks. But I don't really know for now...
 == Macro moves ==
+There are 3 kind of macro moves implemented in Sokolution :
+* Macro Tunnel : Tunnels are precomputed before the search. We have tunnels called "one way" and "two way". One way tunnels are composed of articulation squares. An articulation square is a square that splits the maze into several parts if it is remplaced by a wall. There are good candidate for PI-Corral pruning. Two way tunnels are tunnels where the player can go around it. These tunnels can be used to park temporary a box. When a box enters into a Tunnel, we can perform the macro move depending of the type and length of the tunnel. Macro tunnels does not affect the optimality of the solution.
+* Macro Goal : Single goal room area are precomputed before the search. When a box enters into a goal room area by its single entrance, we can push this box to its final goal position. Note that we can miss the optimality of the solution with this macro move because sometimes we need to park a box inside the goal room area in order to have the best solution.
+* Macro Goal Cut : When a partial ordering is available, we will generate macro moves in addition to other classical moves. The aim is to try to push a box directly to its final goal position according to the current ordering. Note that other boxes are considered as walls. These moves will have a better cost than regular moves, so they will be expanded first. For example, let's consider we have already ordered 2 goals. Before expanding childs, we will check for macro moves in order to fill the third goal and so on.
 == Dynamic searching for deadlocks ==