Institut[e] f(o|ü)r Informati(cs|k), [Universität] Osnabrück [University]
Here, you can find the benchmark sets used in our study of CSP and CSSP. Detailed descriptions can be found in the paper:
Type | Description | Filename | Solutions |
---|---|---|---|
CSP | Random | csp_rnd.tar.bz2 (135 MB) | results_csp_rnd.csv |
CSP | McClure 1) | mcclure.tar.bz2 (2.7 KB) | results_mcclure.csv |
CSP | Hufsky 2) | hufsky.tar.bz2 (175 KB) | results_hufsky.csv |
CSSP | Random | cssp_rnd.tar.bz2 (119 KB) | results_cssp_rnd.csv |
The tar.bz2-files contain simple text files where each text file contains a single instance. The naming scheme for these files is as follows:
Random instances: |Σ|-k-n-r-i.csp McClure instances: McClure-p-|Σ|-k-n.csp Hufsky instances: Hufsky-k-n-i.csp
with
Symbol | Description |
---|---|
p | page of McClure paper |
|Σ| | size of alphabet Σ |
k | number of strings |
n | length of strings |
r | distance ratio such that α=n/r |
i | instance number (10 instances are generated for each parameter combination for random instances, 100 for Hufsky instances) |
The format of a text file is as follows:
<Alphabet-size (|Σ|)> <Number of Strings (k)> <Length of Strings (n)> <1st Character of the Alphabet Σ> ... <|Σ|-th Character of the Alphabet Σ> <1st String> ... <k-th String>
The tar.bz2-files contain simple text files where each text file contains a single instance. The naming scheme for these files is as follows:
{i,r}-|Σ|-k-n-l-d-{l,e}-i.cssp
with
Symbol | Description |
---|---|
{i,r} | i stands for implanted motif, r represents random instances |
|Σ| | size of alphabet Σ |
k | number of strings |
n | length of strings |
l | length of target string |
d | distance |
{l,e} | e stands for instances with E(l,d)≈1, l stands for instances with E(l,d)≈0.01, omitted in case of random instances |
i | instance number (5 instances are generated for each parameter combination) |
For implanted motif instances there are also files ending with .sup. These files contain supplemental information about the instances (see below).
The format of a .cssp file is as follows:
<Alphabet Size (|Σ|)> <Number of Strings (k)> <Length of Strings (n)> <Length of Target String (l)> <1st Character of the Alphabet Σ> ... <|Σ|-th Character of the Alphabet Σ> <1st String> ... <k-th String>
The format of a .sup file is as follows:
<Implanted Master Substring> <Distance (d) of the Implanted Master Substring> <Position of Implanted Master Substring in the 1st String> ... <Position of Implanted Master Substring in the k-th String>
The solution files are plain text files in the CSV format with a semicolon as delimiter. The first line of a file serves as a column header. Each further line represents a specific instance. The column headers are as follows:
Header | Description |
---|---|
filename | the filename of the specific instance |
formulation | only for CSSP instances: formulation used to solve the instance where d stands for δ |
lb | lower bound of the solution value |
ub | upper bound of the solution value |
time | running time in seconds |