GTSP Instances Library

This is a home page of the GTSP Instances Library. It is created and maintained by Daniel Karapetyan. If you notice any mistake, obtain an improvement over the provided solution or just have a question or a suggestion regarding the library, please contact me at DANIEL dot KARAPETYAN at GMAIL dot COM .

If you use this library in your research, please mention it as "GTSP Instances Library (Daniel Karapetyan, http://www.cs.nott.ac.uk/~dxk/gtsp.html)" and cite [1].

Introduction

The Generalized Traveling Salesman Problem (GTSP) is an extension of the Traveling Salesman Problem (TSP), where the node set is partitioned into clusters, and the objective is to find the shortest cycle that visits exactly (or, in some variations) at least) one node in each cluster. The problem is NP-hard.

In 1997, Fischetti, Salazar Gonzalez and Toth [2] introduced a simple clustering procedure that can be used for creating GTSP instances out of TSP instances. The TSP instances were taken from a well known TSP instances library, TSPLIB, created by Gerhard Reinelt [3]. Since then, such a testbed became a de facto standard that was used in virtually all the GTSP literature. On this website, you will find these instances in several formats. You will also find the best known solutions for each of these instances.

Notation

In what follows, we use N for the number of nodes, M for the number of clusters, C_i for the number of nodes (the size) of Cluster i and w(x,y) for the weight of the edge (x,y).

Each instance may have any combination of the following two attributes: symmetric and triangle. The symmetric attribute means that the weight matrix of the instance is symmetric, i.e., w(x,y) = w(y,x) for each pair of nodes x and y. The triangle attribute means that the weight matrix satisfies the triangle inequality, i.e., w(x,y) + w(y,z) ≥ w(x,z) for each triple of nodes x, y and z.

Instance Naming

Each instance in the GTSP Instances Library is named as follows: <M><type><N>, where <type> is derived from the TSPLIB instance. For example, the 89rbg443 GTSP instance has 89 clusters, 443 nodes and is of the "rbg" class.

In most cases, the <M><type><N> GTSP instance is obtained from the <type><N> TSP instance. For example, the 89rbg443 GTSP instance is obtained from the rbg443 TSP instance. The only exception is the 8ftv36 instance which is obtained from the ftv35 TSP instance.

Optimal and Best Solutions

For a number of instances, the optimal solutions are proven, for details see [2]. For the rest of the instances, we provide the best known so far solutions. As for now, all these solutions were obtained by Gutin and Karapetyan, see [1]. If you find a solution of any of these instance better than the one reported in this library, please send it to me ( DANIEL dot KARAPETYAN at GMAIL dot COM ).

Downloads

In this section you can find the links to download the GTSP instances and the best known solution in different formats described below. As for now, the largest instance used in the GTSP literature is 217vm1084 and, thus, the files below contain all the instances, both symmetric and asymmetric, of size N ≤ 1084.

Text Format (Instances)

This format is designed to simplify the instance reading procedure. It contains only the most necessary information. It starts from the : the instance attributes, the number of nodes, the number of clusters, the nodes within each cluster and the explicit weight matrix.

The file in this format consists of three sections: the header, the clusters and the weight matrix.

The header section takes 4 lines. Each line starts from the name of the values ("N", "M", "Symmetric" or "Triangle") and then specifies the corresponding value. The order of the lines is fixed.

The clusters section contains exactly M lines. The ith line corresponds to the Cluster i. It starts from the size C_i of Cluster i and then lists all the nodes in this cluster.

The weight matrix section contains exactly N lines. The ith line corresponds to the weights from node i. Each line contains exactly N integers. The jth integer corresponds to the weight of the edge from Node i to Node j.

N: <Number of nodes N>
M: <Number of clusters M>
Symmetric: <"true" or "false">
Triangle: <"true" or "false">
<C₁> <Node 1 in Cluster 1> <Node 2 in Cluster 1> ... <Node C₁ in Cluster 1>
<C₂> <Node 1 in Cluster 2> <Node 2 in Cluster 2> ... <Node C₂ in Cluster 2>
...
<C_M> <Node 1 in Cluster M> <Node 2 in Cluster M> ... <Node C_M in Cluster M>
<w(1,1)> <w(1,2)> ... <w(1,N)>
<w(2,1)> <w(2,2)> ... <w(2,N)>
...
<w(N,1)> <w(N,2)> ... <w(N,N)>

Note that the nodes in this format are numbered starting from 1.

Binary Format (Instances)

This is a binary format for an instance file. It includes exactly the same information as the text format above but due to the binary coding provides a significantly higher performance of the reading procedure.

The file in this format is a sequence of 3 + M + N + N² integer values, each 4 bytes long. The values are as follows:

1 integer. The number of nodes N.
1 integer. The number of clusters M.
1 integer. The instance attributes: 3 if symmetric and triangle, 2 if asymmetric and triangle, 1 if symmetric and non-triangle and 0 otherwise.
M + N integers. The sizes and the contents of the clusters:
<C₁> <Node 1 in Cluster 1> <Node 2 in Cluster 1> ... <Node C₁ in Cluster 1>
<C₂> <Node 1 in Cluster 2> <Node 2 in Cluster 2> ... <Node C₂ in Cluster 2>
...
<C_M> <Node 1 in Cluster M> <Node 2 in Cluster M> ... <Node C_M in Cluster M>.
N² integers. The weight matrix:
<w(1,1)> <w(1,2)> ... <w(1,N)>
<w(2,1)> <w(2,2)> ... <w(2,N)>
...
<w(N,1)> <w(N,2)> ... <w(N,N)>.

Note that the nodes in this format are numbered starting from 0.

TSPLIB Text Format (Instances)

This format for the GTSP instances was proposed by Alexey Zverovich. It is an extension of the TSPLIB format and, thus, is flexible but more difficult for instance reading then the format described above. The main advantage of this format is that it provides more information about the instance. For example, for a Euclidean instance, instead of the explicit weight matrix, it provides the nodes coordinates that can be used by some algorithm.

The format follows all the rules defined by the TSPLIB documentation. It defined a new section called "GTSP_SET_SECTION" containing the information about the clusters.

There are M entries separated by white spaces symbols. The ith entry corresponds to Cluster i. It starts from the cluster index i, then lists all the nodes in this cluster and terminates with the -1 value:

1 <Node 1 in Cluster 1> <Node 2 in Cluster 1> ... < Node C₁ in Cluster 1> -1
2 <Node 1 in Cluster 2> <Node 2 in Cluster 2> ... < Node C₂ in Cluster 2> -1
...
3 <Node 1 in Cluster M> <Node 2 in Cluster M> ... < Node C_M in Cluster M> -1

The file in this format consists of three sections: the header, the clusters and the weight matrix.

The header section takes 4 lines. Each line starts from the name of the values ("N", "M", "Symmetric" or "Triangle") and then specifies the corresponding value. The order of the lines is fixed.

The clusters section contains exactly M lines. The ith line corresponds to the Cluster i. It starts from the size C_i of Cluster i and then lists all the nodes in this cluster.

N: <Number of nodes N>
M: <Number of clusters M>
Symmetric: <"true" or "false">
Triangle: <"true" or "false">
<C₁> <Node 1 in Cluster 1> <Node 2 in Cluster 1> ... <Node C₁ in Cluster 1>
<C₂> <Node 1 in Cluster 2> <Node 2 in Cluster 2> ... <Node C₂ in Cluster 2>
...
<C_M> <Node 1 in Cluster M> <Node 2 in Cluster M> ... <Node C_M in Cluster M>
<w(1,1)> <w(1,2)> ... <w(1,N)>
<w(2,1)> <w(2,2)> ... <w(2,N)>
...
<w(N,1)> <w(N,2)> ... <w(N,N)>

Note that the nodes in this format are numbered starting from 1.

Text Format (Solutions)

A file in this format consists of two sections: the header and the tour.

The header section takes 2 lines. The first line contains exactly one integer corresponding to the number M of the nodes in the tour. The second line contains exactly one integer corresponding to the weight of the tour.

The tour section takes M lines. The ith line contains exactly one integer corresponding to the ith node in the tour.

<Tour size M>
<Tour weight>
<Node 1 in the tour>
<Node 2 in the tour>
...
<Node M in the tour>

Note that the nodes in this format are numbered starting from 1.

Binary Format (Solutions)

This is a binary format for a solution file. It includes exactly the same information as the text format above but due to the binary coding provides a significantly higher performance of the reading and writing procedures.

The file in this format is a sequence of 2 + M integer values, each 4 bytes long. The values are as follows:

1 integer. The number of clusters M.
1 integer. The weight of the solution.
M integers. The tour:
<Node 1 in the tour>
<Node 2 in the tour>
...
<Node M in the tour>

Note that the nodes in this format are numbered starting from 0.

References

[1] Gutin, G., Karapetyan, D. (2009). A memetic algorithm for the generalized traveling salesman problem. Natural Computing, 9(1), 47-60.

[2] Fischetti, M., Salazar Gonzalez, J. J., Toth, P. (1997). A branch-and-cut algorithm for the symmetric generalized traveling salesman problem. Operations Research, 45(3), 378-394.

[3] Reinelt, G. (1991). TSPLIB--A Traveling Salesman Problem Library. INFORMS Journal on Computing, 3(4), 376-384.

Contacts

If you have any questions or suggestions, please email me at DANIEL dot KARAPETYAN at GMAIL dot COM .

Last modified July 4, 2012.