--------------------------------------------------------------------------------------------------
Multi-Language Evaluation of Exact Solvers in Graphical Model Discrete Optimization
Barry Hurley, Barry O'sullivan (Insight Centre for Data Analytics, UCC, Ireland)
David Allouche, George Katsirelos, Thomas Schiex, Matthias Zytnicki, Simon de Givry (INRA, France)
In Proc. of CPAIOR'16, Banff, Canada, 2016
--------------------------------------------------------------------------------------------------

Benchmarks
----------
-MRF: Probabilistic Inference Challenge 2011
http://www.cs.huji.ac.il/project/PASCAL/realBoard.php
-MRF: UAI 2008 Evaluation 
http://graphmod.ics.uci.edu/uai08/Evaluation/Report/Benchmarks
-CVPR: Computer Vision and Pattern Recognition OpenGM2 Benchmark
http://hci.iwr.uni-heidelberg.de/opengm2/
-CFN: CFLib, library of Cost Function Networks
http://costfunction.org/benchmark
-MaxCSP: CSP and Max-CSP 2008 Competition
http://www.cril.univ-artois.fr/CPAI08/
http://www.cril.univ-artois.fr/~lecoutre/benchmarks.html
-WPMS: Weighted Partial Max-SAT Evaluation 2013
http://maxsat.ia.udl.cat:81/13/benchmarks
-CP: MiniZinc Challenge 2012 and 2013
http://www.minizinc.org/challenge2012/results2012.html
http://www.minizinc.org/challenge2013/results2013.html

Download benchmarks in all formats (uai, wcnf, wcsp, lp, mzn&dzn)
-----------------------------------------------------------------
wget -r -nH --no-parent -e robots=off --cut-dirs=1 http://genoweb.toulouse.inra.fr/~degivry/evalgm

or only a specific category, e.g. cost function networks:
wget -r -nH --no-parent -e robots=off --cut-dirs=2 http://genoweb.toulouse.inra.fr/~degivry/evalgm/CFN

or only a specific format, e.g. uai format:
wget -r -nH --no-parent -e robots=off --cut-dirs=1 -A '*.uai' http://genoweb.toulouse.inra.fr/~degivry/evalgm
wget http://genoweb.toulouse.inra.fr/~degivry/evalgm/CVPR/CVPR_uai.tgz

Notice that instances with the tuple encoding have extension "_support" in their filenames.

Solvers
-------
-daoopt  version 1.1.2
https://github.com/lotten/daoopt
-toulbar2 version 0.9.8
http://www.inra.fr/mia/T/toulbar2
-cplex version 12.6.0.0 (Academic License)
-maxhs version 2.51
http://www.maxhs.org
-gecode version 4.4.0
http://www.gecode.org

Parameters
----------
For all solvers, 1 core of AMD Operon 6176 at 2.3 GHz with 3600 seconds CPU-time and 8 GB RAM limits.

Same limits used to generate off-line flatzinc files using MiniZinc 2 tools (MZN reported if limits exceeded).

-daoopt version 1.1.2
daooptW -y -i 35 --slsX=20 --slsT=10 --lds 1 -m 4096 -t 30000 --orderTime=180
-toulbar2 version 0.9.8
 -hbfs -dee -V -A
-cplex version 12.6.0.0
CPX_PARAM_THREADS                1
CPX_PARAM_EPAGAP                 0.00000000000000e+00   
CPX_PARAM_EPGAP                  0.00000000000000e+00   
CPX_PARAM_EPINT                  0.00000000000000e+00   
CPX_PARAM_EPRHS			1e-9
-maxhs version 2.51
no parameter
-gecode version 4.4.0
fzn-gecode -f -a -s

Legends for results.csv columns
-------------------------------
- Problem: instance name
- nbvar: number of variables
- max_dom: maximum domain size
- nbconstr: number of potentials/cost functions/constraints
- max_arity: maximum arity of potentials/cost functions/constraints

Followed by, for each solver in this ordering:
daoopt, toulbar2, cplex (direct encoding), cplex_t (tuple encoding), maxhs (direct encoding), maxhs_t (tuple encoding), gecode
- bestsol: cost of the best solution found (or "?" if no solution found)
- time: CPU time to end the search (or "mem" if memory error occured during search or during the translation process, or "MZN" if flatzinc translation error, or "32-bit" if sum of costs not representable on 32-bit integers for CP domains, or "ARITY" if translation to MRF/potentials not possible due to large clause arity, or "N/A" if the corresponding encoding was not available)
- status: S (solution found), C (optimum found and proved), UNK (no solution found within the time limit)

Note that translation times between formats are not taken into account in the results.

Known problem issues and corrections made
-----------------------------------------
- Instances CP/Warehouse/capmo1-5.dzn, CP/Warehouse/capmp1-5.dzn, CP/Warehouse/capmq1-5.dzn (resp. CP/Warehouse/capa-b-c.dzn) have all their cost values divided by 100 (resp. multiplied by 10) compared to wcsp/wcnf/lp instances. Reported solution costs for gecode in results.csv have been corrected accordingly.
- Instance CVPR/MatchingStereo/ven-gm cannot be translated in lp format (file too big).
- Instance WPMS/MIPLib/normalized-mps-v2-20-10-mod008.opb.msat cannot be translated in uai format (clause arity too large (93)). Tuple encoding was not done for lp and wcnf formats on the whole WPMS benchmark due to large arity clauses.
- CVPR and WPMS/Upgradeability instances have costs greater than 32-bit integer limit, no dzn (MiniZinc) files have been generated for them.
- Problem sizes (nbvar,maxdom,nbconstr,maxarity) reported for CP instances are extracted from wcsp format and not from original mzn format. Wrong numbers are reported when the corresponding wcsp file was not available due to memory space issues in the mzn to wcsp translation (Fastfood ff3-63, Golomb 10-12, OnCallRostering 10s-150d & 20s-200d). Also no uai, wcnf, nor lp files could be generated for these instances.
- Different problem optimum reported bewteen MRF solver daoopt and the other solvers due to double/integer conversions (ImageAlignment/fileforGal_400markers, Auction/cat_sched_60_170_0001, GeomSurf-7, and SceneDecomp). First, MRF instances (except CVPR) were translated into CFN instances using 2-digit precision. Then, the resulting wcsp instances were translated back to uai instances (_digit2 extension) in order to optimize the same objective function.
- All CVPR instances are using 8-digit precision in wcnf, wcsp and lp formats (except some instances already using integer potentials), in uai format are using floating-point numbers, and no dzn (MiniZinc) format have been generated (due to 32-bit integer cost limit).
- Most CVPR (ColorSeg, GeomSurf,ObjectSeg,SceneDecomp) and some WPMS benchmarks have very large optima possibly making optima reported by cplex slightly inaccurate (floating-point 1e-9 precision limit).