SMT-RAT
24.02
Toolbox for Strategic and Parallel Satisfiability-Modulo-Theories Solving
|
Benchmax is a tool for automated benchmarking. Though it was developed and is primary used for testing SMT solvers, it aims to be agnostic of the tools and formats as good as possible.
Its fundamental model is to load a list of tools and a list of benchmarks, run every tool with every benchmark, obtain the results and output all the results. While the benchmarks are fixed to a single file, we allow to choose between different tool interfaces, execution backends and output formats.
Benchmax could be called as follows:
This will run benchmax with the SMT-RAT tool at /path/to/smtrat-static
on the /path/to/benchmark/set
with a time limit of 1 minute and a memory limit of 4 gigabyte. The resulst will be written to out.xml
. The execution is done on an SSH backend on localhost with 9 connections and executing 5 parallel jobs per connection.
It is recommended to use static builds to avoid issues with missing libraries.
For more information, run ./benchmax --help
.
Some tools like SMT-RAT or Z3 can provide statistics about the solving process for each individual benchmark. By using the -s
respectively --statistics
, statistics are collected and stored in the output file.
It is recommended to aggregate statistics as much as possible inside SMT-RAT. However, if the output might get large, you might want to use --use-tmp
to prevent benchmax running out of memory.
In the SMT-RAT repository, two utilites for converting the result XML file are included:
utilities/benchmax/xml2ods.py
for converting it to a Flat XML LibreOffice Calc Sheet.utilities/benchmax/OOCImporter.xsl
for converting it to a Flat XML LibreOffice Calc Sheet.utilities/benchmax/evaluation
is a small python library for importing the results into Python (or a Jupyter notebook), inspecting the results and preparing plots. For more information, see Benchmax python utility.To pass options to the tested tool, use a string consisting of the path to the tool and the required options instead of only the path. In the example above, if we wanted to use the SMT-RAT tool at /path/to/smtrat-static
with some option --op
and argument arg
, we would call
(...
being the same as before).
A tool represents a binary that can be executed on some input files. It is responsible for deciding whether it can be applied to some given file, building a command line to execute it and retrieve additional results from stdout
or stderr
.
The generic tool class benchmax::Tool
can be used as a default. It will run the given binary on any input file and benchmax records the exit code as well as the runtime.
If more information is needed a new tool class can be derived that overrides or extends the default behaviour. Some premade tools are available (and new ones should be easy to create):
A backend offers the means to run the tasks in a specific manner. A number of different backends are implemented that allow for using benchmax in various scenarios.
The LocalBackend can be used to execute benchmarks on a local machine easily. It does not allow for any parallelization but simply executes all tasks sequentially.
The SlurmBackend employs the Slurm workload manager to run benchmarks on a cluster. It essentially collects all tasks that shall be run in a file and creates an appropriate slurm submit file. It then submits this job, waits for it to finish and collects the results from the output files.
The Slurm backend supports starting (--mode execute
) and collecting results (--mode collect
) separately, thus avoiding the need for running benchmax in a screen.
The SSHBackend can be used if you have multiple workers that can be reached via ssh (essentially a cluster without a batch job system). It allows to configure one or more computing nodes and manually schedules all the jobs on the different computing nodes.
Using --ssh.node
, a node is specified in the format <user>[:<password>]@<hostname>[:<port = 22>][@<cores = 1>][#<connections = 1>]
; benchmax will then connect to the specified server with given credentials connections
times and will run cores
threads per connection. Thus, the number of overall parallel threads equals connections
* cores
(note that enough memory should be available on the node).
By setting the flag --ssh.resolvedeps
, dependencies (i.e. dynamic libraries) are resolved and uploaded with the solver.
The CondorBackend is aimed at the HTCondor batch system and works similar to the SlurmBackend. Note however that this backend is not well tested and we therefore consider it unstable.
LOGGING=OFF
.