SolvePOMDP is a Java program that solves Partially Observable Markov Decision Processes (POMDPs). The program executes value iteration to find a POMDP solution and writes the solution to a file. The program contains both exact and approximate methods. This webpage contains all the information that is required to use SolvePOMDP, and it provides further references to related scientific literature and online resources.

  Features

The SolvePOMDP software contains the following key features:

  Download

We recommend to use one of the SolvePOMDP packages below, which already include an executable and all necessary files to start the solver. The current version of SolvePOMDP is 0.0.3.

  

If you would like to compile the program from source, please have a look at our repository.

  Getting started

The software can be executed on multiple operating systems. Please select your operating system below and follow the steps in order to solve your first POMDP model using SolvePOMDP.

The zip file contains all the files that are required to get started. The steps below will guide you through the setup and it explains how to run SolvePOMDP.

1. Directory structure

After extracting the zip archive you will find a domains directory, which includes POMDP descriptions based on Tony's POMDP file format. Your own POMDP domains can be added to this directory. When solving a POMDP model, the solver generates files in the output directory. This can be either in value function format, or in policy graph format. For more details about the file formats and example domains we refer to pomdp.org.

2. Configuration file

The solver.config file allows you to modify a few parameters of the POMDP solver. The documentation in this file explains the parameters and how to set them properly before executing SolvePOMDP. The default configuration should be appropriate in most cases. An important parameter is the algorithm that should be used, which can be either exact or approximate.

3. Linear programming solvers

POMDP algorithms require a linear programming solver during their execution. SolvePOMDP comes with two built-in linear programming solvers that can be used without any additional configuration. It also supports a commercial linear programming solver.

  • LPSolve (built-in) – A recent version of LPSolve has been included in SolvePOMDP. By default it uses the 32 bits version. If you want to use the 64 bits version, then it is required to modify the run script prior to executing it using a text editor (e.g., Notepad++). The architecture should match the architecture of your Java installation, rather than the operating system.
  • JOptimizer (built-in, experimental) – This solver does not require additional configuration and works out of the box, but should be considered experimental. It can be expected that other solvers run much faster. Moreover, due to minor numerical stability problems the resulting output may be slightly different.
  • Gurobi – Follow the official installation guide to install the solver. If you use a 32 bits Java version, make sure to install the 32 bits Gurobi version (the same applies to a 64 bits architecture). Gurobi includes a .jar library file in the lib subdirectory of the Gurobi installation, which you should copy to the libs directory of SolvePOMDP.

The linear programming solver of your choice can be defined in the configuration file.

4. Running SolvePOMDP

You can start the POMDP solver by executing run partpainting.POMDP, in which the filename refers to a domain description file in the domains directory. You may need to change the permissions of the files before executing the commands. The output should be similar to the following example:

SolvePOMDP v0.0.3
Author: Erwin Walraven
Web: erwinwalraven.nl/solvepomdp
Delft University of Technology

=== SOLVER PARAMETERS ===
Epsilon: 1.0E-6
Value function tolerance: 1.0E-6
Accelerated LP threshold: 200
Accelerated LP tolerance: 1.0E-4
LP coefficient threshold: 1.0E-9
Time limit: 100.0
Belief sampling runs: 200
Belief sampling steps: 100
Dump policy graph: true
Dump action labels: false
Algorithm: gip
LP solver: LPSolve
Output directory: output
Domain directory: domains

=== READ POMDP FILE ===
File: domains/partpainting.POMDP
PARSER: Parsing preamble...
PARSER: Summary -> states 4
                -> observations 2
                -> actions 4
PARSER: Parsing starting state/belief...
PARSER: Parsing parameters...
PARSER: [DONE]

=== RUN POMDP SOLVER ===
Algorithm: Generalized incremental pruning with accelerated pruning

Stage 1: 4 vectors
Stage 2: 7 vectors, diff 0.8549999999999999, time elapsed 0.012 sec
Stage 3: 16 vectors, diff 0.3096130087209302, time elapsed 0.032 sec
...
Stage 235: 9 vectors, diff 1.0332681084079745E-6, time elapsed 4.695 sec
Stage 236: 9 vectors, diff 0.0, time elapsed 4.708 sec
Stage 237: 9 vectors, diff 0.0, time elapsed 4.723 sec

=== RESULTS ===
Expected value: 3.293579366886754
Alpha vectors: output/partpainting.alpha
Policy graph: output/partpainting.pg
Running time: 4.723 sec

The output files can be found in the output directory, which represent the resulting solution. They can be used to select actions which maximize the expected value.

5. Important remarks

  • The exact algorithm can be used to create a policy graph which represents the optimal solution. This option should be used after the solver has fully converged to an optimal solution. It can be enabled manually in the configuration file.
  • The solver contains a state-of-the-art pruning algorithm (Walraven and Spaan 2017), which delivers improved performance in larger domains. This feature has been enabled by default. If necessary it can be disabled in the configuration file.
  • SolvePOMDP should run flawlessly on Java 7 or higher. For other versions, it is recommended to compile from source in order to construct a working jar file.

The zip file contains all the files that are required to get started. The steps below will guide you through the setup and it explains how to run SolvePOMDP.

1. Directory structure

After extracting the zip archive you will find a domains directory, which includes POMDP descriptions based on Tony's POMDP file format. Your own POMDP domains can be added to this directory. When solving a POMDP model, the solver generates files in the output directory. This can be either in value function format, or in policy graph format. For more details about the file formats and example domains we refer to pomdp.org.

2. Configuration file

The solver.config file allows you to modify a few parameters of the POMDP solver. The documentation in this file explains the parameters and how to set them properly before executing SolvePOMDP. The default configuration should be appropriate in most cases. An important parameter is the algorithm that should be used, which can be either exact or approximate.

3. Linear programming solvers

POMDP algorithms require a linear programming solver during their execution. SolvePOMDP comes with two built-in linear programming solvers that can be used without any additional configuration. It also supports a commercial linear programming solver.

  • LPSolve (built-in) – A recent version of LPSolve has been included in SolvePOMDP. By default it uses the 32 bits version. If you want to use the 64 bits version, then it is required to modify the run script prior to executing it using a text editor. The architecture should match the architecture of your Java installation, rather than the operating system.
  • JOptimizer (built-in, experimental) – This solver does not require additional configuration and works out of the box, but should be considered experimental. It can be expected that other solvers run much faster. Moreover, due to minor numerical stability problems the resulting output may be slightly different.
  • Gurobi – Follow the official installation guide to install the solver. Next, open setup_gurobi in a text editor and make sure that the path to the Gurobi directory is correct. Gurobi includes a .jar library file in the lib subdirectory of the Gurobi installation, which you should copy to the libs directory of SolvePOMDP. Before starting the program, execute source setup_gurobi in order to set the required environment variable.

The linear programming solver of your choice can be defined in the configuration file.

4. Running SolvePOMDP

You can start the POMDP solver by executing ./run partpainting.POMDP, in which the filename refers to a domain description file in the domains directory. You may need to change the permissions of the script before executing the command. The output should be similar to the following example:

SolvePOMDP v0.0.3
Author: Erwin Walraven
Web: erwinwalraven.nl/solvepomdp
Delft University of Technology

=== SOLVER PARAMETERS ===
Epsilon: 1.0E-6
Value function tolerance: 1.0E-6
Accelerated LP threshold: 200
Accelerated LP tolerance: 1.0E-4
LP coefficient threshold: 1.0E-9
Time limit: 100.0
Belief sampling runs: 200
Belief sampling steps: 100
Dump policy graph: true
Dump action labels: false
Algorithm: gip
LP solver: LPSolve
Output directory: output
Domain directory: domains

=== READ POMDP FILE ===
File: domains/partpainting.POMDP
PARSER: Parsing preamble...
PARSER: Summary -> states 4
                -> observations 2
                -> actions 4
PARSER: Parsing starting state/belief...
PARSER: Parsing parameters...
PARSER: [DONE]

=== RUN POMDP SOLVER ===
Algorithm: Generalized incremental pruning with accelerated pruning

Stage 1: 4 vectors
Stage 2: 7 vectors, diff 0.8549999999999999, time elapsed 0.012 sec
Stage 3: 16 vectors, diff 0.3096130087209302, time elapsed 0.032 sec
...
Stage 235: 9 vectors, diff 1.0332681084079745E-6, time elapsed 4.695 sec
Stage 236: 9 vectors, diff 0.0, time elapsed 4.708 sec
Stage 237: 9 vectors, diff 0.0, time elapsed 4.723 sec

=== RESULTS ===
Expected value: 3.293579366886754
Alpha vectors: output/partpainting.alpha
Policy graph: output/partpainting.pg
Running time: 4.723 sec

The output files can be found in the output directory, which represent the resulting solution. They can be used to select actions which maximize the expected value.

5. Important remarks

  • The exact algorithm can be used to create a policy graph which represents the optimal solution. This option should be used after the solver has fully converged to an optimal solution. It can be enabled manually in the configuration file.
  • The solver contains a state-of-the-art pruning algorithm (Walraven and Spaan 2017), which delivers improved performance in larger domains. This feature has been enabled by default. It necessary it can be disabled in the configuration file.
  • SolvePOMDP should run flawlessly on Java 7 or higher. For other versions, it is recommended to compile from source in order to construct a working jar file.

The zip file contains all the files that are required to get started. The steps below will guide you through the setup and it explains how to run SolvePOMDP.

1. Directory structure

After extracting the zip archive you will find a domains directory, which includes POMDP descriptions based on Tony's POMDP file format. Your own POMDP domains can be added to this directory. When solving a POMDP model, the solver generates files in the output directory. This can be either in value function format, or in policy graph format. For more details about the file formats and example domains we refer to pomdp.org.

2. Configuration file

The solver.config file allows you to modify a few parameters of the POMDP solver. The documentation in this file explains the parameters and how to set them properly before executing SolvePOMDP. The default configuration should be appropriate in most cases. An important parameter is the algorithm that should be used, which can be either exact or approximate.

3. Linear programming solvers

POMDP algorithms require a linear programming solver during their execution. SolvePOMDP comes with two built-in linear programming solvers that can be used without any additional configuration. It also supports a commercial linear programming solver.

  • LPSolve – A recent version of LPSolve has been included in SolvePOMDP. However, LPSolve needs to be compiled separately, for which a manual is provided in the lpsolve_src directory.
  • JOptimizer (built-in, experimental) – This solver does not require additional configuration and works out of the box, but should be considered experimental. It can be expected that other solvers run much faster. Moreover, due to minor numerical stability problems the resulting output may be slightly different.
  • Gurobi – Follow the official installation guide to install the solver. Gurobi includes a .jar library file in the lib subdirectory of the Gurobi installation, which you should copy to the libs directory of SolvePOMDP.

The linear programming solver of your choice can be defined in the configuration file.

4. Running SolvePOMDP

You can start the POMDP solver by executing either ./run_gurobi partpainting.POMDP, ./run_lpsolve partpainting.POMDP or ./run_joptimizer partpainting.POMDP in which the filename refers to a domain description file in the domains directory. You may need to change the permissions of the files before executing the commands. The linear programming solver selected in the configuration file should match the script that you use. The output should be similar to the following example:

SolvePOMDP v0.0.3
Author: Erwin Walraven
Web: erwinwalraven.nl/solvepomdp
Delft University of Technology

=== SOLVER PARAMETERS ===
Epsilon: 1.0E-6
Value function tolerance: 1.0E-6
Accelerated LP threshold: 200
Accelerated LP tolerance: 1.0E-4
LP coefficient threshold: 1.0E-9
Time limit: 100.0
Belief sampling runs: 200
Belief sampling steps: 100
Dump policy graph: true
Dump action labels: false
Algorithm: gip
LP solver: LPSolve
Output directory: output
Domain directory: domains

=== READ POMDP FILE ===
File: domains/partpainting.POMDP
PARSER: Parsing preamble...
PARSER: Summary -> states 4
                -> observations 2
                -> actions 4
PARSER: Parsing starting state/belief...
PARSER: Parsing parameters...
PARSER: [DONE]

=== RUN POMDP SOLVER ===
Algorithm: Generalized incremental pruning with accelerated pruning

Stage 1: 4 vectors
Stage 2: 7 vectors, diff 0.8549999999999999, time elapsed 0.012 sec
Stage 3: 16 vectors, diff 0.3096130087209302, time elapsed 0.032 sec
...
Stage 235: 9 vectors, diff 1.0332681084079745E-6, time elapsed 4.695 sec
Stage 236: 9 vectors, diff 0.0, time elapsed 4.708 sec
Stage 237: 9 vectors, diff 0.0, time elapsed 4.723 sec

=== RESULTS ===
Expected value: 3.293579366886754
Alpha vectors: output/partpainting.alpha
Policy graph: output/partpainting.pg
Running time: 4.723 sec

The output files can be found in the output directory, which represent the resulting solution. They can be used to select actions which maximize the expected value.

5. Important remarks

  • The exact algorithm can be used to create a policy graph which represents the optimal solution. This option should be used after the solver has fully converged to an optimal solution. It can be enabled manually in the configuration file.
  • The solver contains a state-of-the-art pruning algorithm (Walraven and Spaan 2017), which delivers improved performance in larger domains. This feature has been enabled by default. It necessary it can be disabled in the configuration file.
  • SolvePOMDP should run flawlessly on Java 7 or higher. For other versions, it is recommended to compile from source in order to construct a working jar file.

  Literature

More information about POMDPs and solution algorithms can be found in the literature mentioned below.

Accelerated Vector Pruning for Optimal POMDP Solvers

Erwin Walraven and Matthijs T. J. Spaan
Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017.

Perseus: Randomized Point-based Value Iteration for POMDPs

Matthijs T. J. Spaan and Nikos Vlassis
Journal of Artificial Intelligence Research, 24, pp. 195–220, 2005.

Partially Observable Markov Decision Processes

Matthijs T. J. Spaan
Reinforcement Learning: State of the Art, pp. 387–414, Springer Verlag, 2012.

Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes

Anthony Cassandra, Michael L. Littman and Nevin L. Zhang
Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence, pp. 54–61, 1997.