
After talking with my GSoC mentors about what we all believe to be the most difficult part of the Asadpour algorithm, the Held-Karp relaxation, we came to several conclusions:
- The Asadpour paper recommends using the ellipsoid method so that their algorithm runs in polynomial time. We do not need a polynomial time, just an algorithm with reasonable execution time. An example of this would be the ellipsoid algorithm versus the simplex algorithm. While the simplex algorithm is exponential, in practice it is almost always faster than the ellipsoid algorithm.
- Our interest in the ellipsoid algorithm was not based on performance, but rather the ability for the ellipsoid algorithm to be able to handle a linear program with an exponential number of constraints. This was done with a separation oracle, see my post here for more information about the oracle.
- Implementing a robust ellipsoid algorithm solver (something notable missing from the scientific python ecosystem) was a GSoC project onto itself and beyond the scope of this project for NetworkX.
Thus, alternative methods for solving the Held-Karp relaxation needed to be investigated. To this end, we turned to the original 1970 paper by Held and Karp, The Traveling Salesman Problem and Minimum Spanning Trees to see how they proposed solving the relaxation (Note that this paper was published before the ellipsoid algorithm was applied to linear programming in 1979). The Held and Karp paper discusses three methods for solving the relaxation:
- Column Generating: An older method of solving very large linear programs where only the variables that influence the optimal solution need to be examined.
- Ascent Method: A method based around maximizing the dual of the linear program which is best described as seeking the direction of ascent for the objective function in a similar way to the notion of a gradient in multivariate calculus.
- Branch and Bound: This method has the most theoretical benefits and seeks to augment the ascent method to avoid the introduction of fractional weights which are the largest contributors to a slow convergence rate.
But before we explore the methods that Held and Karp discuss, we need to ensure that these methods still apply to solving the Held-Karp relaxation within the context of the Asadpour paper. The definition of the Held-Karp relaxation that I have been using on this blog comes from the Asadpour paper, section 3 and is listed below.
The closest match to this program in the Held Karp paper is their linear program 3, which is a linear programming representation of the entire traveling salesman problem, not solely the relaxed version. Note that Held and Karp were dealing with the symmetric TSP (STSP) while Asadpour is addressing the asymmetric or directed TSP (ATSP).
The last two constraints on the second linear program is correctly bounded and fits within the scope of the original problem while the first two constraints do most of the work in finding a TSP tour.
Additionally, changing the last two constraints to be
Held and Karp also state that
In this section, we show that minimizing the gap
is equivalent to solving this program without the integer constraints.
on page 1141, so it would appear that solving one of the equivalent programs that Held and Karp forumalate should work here.
Column Generation Technique#
The Column Generation technique seeks to solve linear program 2 from the Held and Karp paper, stated as
Where
The rest of this method uses a simplex algorithm to solve the linear program. We only focus on the edges which are in each of the 1-Trees, giving each column the form
and the column which enters the solution in the 1-Tree for which
I am already familiar with the simplex method, so I will not detail it’s implementation here.
Performance of the Column Generation Technique#
This technique is slow to converge.
Held and Karp programmed in on an IBM/360 and where able to solve problems consestinal for up to
They also talk about a heuristic procedure in which a vertex is eliminated from the program whenever the choice of its adjacent vertices was ’evident’. Technical details for the heuristic where essentially non-existent, but
The procedure showed promise on examples up to
, but was not explored systematically
Ascent Method#
This paper from Held and Karp is about minimizing
This method uses the set of indices of 1-Trees that are of minimum weight with respect to the weights
If
Now for a sufficiently small
So in other words,
If we can find
- Set
equal to the zero -vector. - Find a 1-tree
such that . - If
STOP. for- GO TO 2.
There are two things which must be refined about this procedure in order to make it implementable in Python.
- How do we find the 1-Tree mentioned in step 2?
- How do we know when there is no direction of ascent? (i.e. how do we know when we are at the maximal value of
?)
Held and Karp have provided guidance on both of these points.
In section 6 on matroids, we are told to use a method developed by Dijkstra in A Note on Two Problems in Connexion with Graphs, but in this particular case that is not the most helpful.
I have found this document, but there is a function called minimum_spanning_arborescence
already within NetworkX which we can use to create a minimum 1-Arborescence.
That process would be to find a minimum spanning arborescence on only the vertices in
Finally, at the maximum value of
Thus, when failure to terminate is suspected, it is necessary to check whether no direction of ascent exists; by the Minkowski-Farkas lemma this is equivalent to the existence of nonnegative coefficients
such that
This can be checked by linear programming.
While it is nice that they gave that summation, the rest of the linear program would have been useful too. The entire linear program would be written as follows
This linear program is not in standard form, but it is not difficult to convert it. First, change the maximization to a minimization by minimizing the negative.
While the constraint is not intuitively in standard form, a closer look reveals that it is.
Each column in the matrix form will be for one entry of
Because all of the summations must equal zero, no stack and surplus variables are required, so the constraint matrix for this program is linprog
function in the SciPy library.
As an implementation note, to start with I would probably check the terminating condition every iteration, but eventually we can find a number of iterations it has to execute before it starts to check for the terminating condition to save computational power.
One possible difficulty with the terminating condition is that we need to run the linear program with data from every minimum 1-Trees or 1-Arborescences, which means that we need to be able to generate all of the minimum 1-Trees. There does not seem to be an easy way to do this within NetworkX at the moment. Looking through the tree algorithms here they seem exclusively focused on finding one minimum branching of the required type and not all of those branchings.
Now we have to find
Let
be any element of , where is a direction of ascent at . Then
The first step then is to determine if
From that equation, we can derive a formula for
So we can now find find_cycle
within the NetworkX library.
Below is a pseudocode procedure that uses Theorem 4 to find
# Input: An element k of K(pi, d), the vector pi and the vector d.
# Output: epsilon(pi, d) using Theorem 4 on page 1150.
for each edge e in the graph G
if e is in k:
continue
else:
add e to k
let v be the terminating end of e
c = find_cycle(k, v)
for each edge a in c not e:
if a[cost] = e[cost] and d[i] + d[j] = d[r] + d[s]:
continue
epsilon = (a[cost] - e[cost])/(d[i] + d[j] - d[r] - d[s])
min_epsilon = min(min_epsilon, epsilon)
remove e from k
return min_epsilon
Performance of the Ascent Method#
The ascent method is also slow, but would be better on a modern computer.
When Held and Karp programmed it, they tested it on some small problems up to 25 vertices and while the time per iteration was small, the number of iterations grew quickly.
They do not comment on if this is a better method than the Column Generation technique, but do point up that they did not determine if this method always converges to a maximum point of
Branch and Bound Method#
After talking with my GSoC mentors, we believe that this is the best method we can implement for the Held-Karp relaxation as needed by the Asadpour algorithm. The ascent method is embedded within this method, so the in depth exploration of the previous method is required to implement this one. Most of the notation in this method is reused from the ascent method.
The branch and bound method utilizes the concept that a vertex can be out-of-kilter.
A vertex
Similarly, vertex
Remember that
If we know that a vertex is out-of-kilter in either direction, we know the direction of ascent and that direction is a unit vector.
Let
Corollaries 3 and 4 from page 1151 also show that finding
Corollary 3. Assume vertex
is out-of-kilter low and let be an element of . Then such that is a substitute for in and .
Corollary 4. Assume vertex
is out-of-kilter high. Then such that is a substitute for in and .
These corollaries can be implemented with a modified version of the pseudocode listing above for finding
Once there are no more out-of-kilter vertices, the direction of ascent is not a unit vector and fractional weights are introduced. This is the cause of a major slow down in the convergence of the ascent method to the optimal solution, so it should be avoided if possible.
Before we can discuss implementation details, there are still some more primaries to be reviewed.
Let
From these functions, a revised definition of out-of-kilter high and low arise, allowing a vertex to be out-of-kilter relative to
During the completion of the branch and bound method, the branches are tracking in a list where each entry has the following format.
Where
At each iteration of the method, we consider the list entry with the minimum bound and try to find an out-of-kilter vertex. If we find one, we apply one iteration of the ascent method using the simplified unit vector as the direction of ascent. Here we can take advantage of integral weights if they exist. Perhaps the documentation for the Asadpour implementation in NetworkX should state that integral edge weights will perform better but that claim will have to be supported by our testing.
If there is not an out-of-kilter vertex, we still need to find the direction of ascent in order to determine if we are at the maximum of
The branching process is as follows.
From entry
An example of the branch and bound method is given on pages 1153 through 1156 in the Held and Karp paper.
In order to implement this method, we need to be able to determine in addition to modifying some of the details of the ascent method.
- If a vertex is either out-of-kilter in either direction with respect to
and . - Search
for a tour.
The Held and Karp paper states that in order to find an out-of-kilter vertex, all we need to do is test the unit vectors.
If for arbitrary member
Searching
The most promising research paper I have been able to find on this problem is this 2005 paper by Sörensen and Janssens titled An Algorithm to Generate all Spanning Trees of a Graph in Order of Increasing Cost.
From here we generate spanning trees or arborescences until the cost moves upward at which point we have found all elements of
Performance of the Branch and Bound Method#
Held and Karp did not program this method.
We have some reason to believe that the performance of this method will be the best due to the fact that it is designed to be an improvement over the ascent method which was tested (somewhat) until
References#
A. Asadpour, M. X. Goemans, A. Mardry, S. O. Ghran, and A. Saberi, An o(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem, Operations Research, 65 (2017), pp. 1043-1061, https://homes.cs.washington.edu/~shayan/atsp.pdf.
Held, M., Karp, R.M. The traveling-salesman problem and minimum spanning trees. Operations research, 1970-11-01, Vol.18 (6), p.1138-1162. https://www.jstor.org/stable/169411