This is the second part of a blog series where I talk about my experience during my Outreachy internship at NetworkX. If you haven’t read the first part you can find it here.
As you advance through the contribution phase, you may wonder what your internship will be like if you are selected. Here is my experience as a NetworkX intern, along with some tips that could help you through the internship.

What’s Outreachy? Outreachy is a paid, remote internship program for underrepresented groups in tech. All internships are in Open Source and Open Science. To be selected as an intern, you first need to:
Fill out an initial application: You’ll need to answer some questions about how you are affected by systemic bias and how being underrepresented in your local tech industry has impacted your development. You may not know how to answer some of these questions, especially if you are not yet looking for a job, so it’s important to do some research first.

The previous (and final) post discussing the VF2++ helpers can be found here. Now that we’ve figured out how to solve all the sub-problems that VF2++ consists of, we are ready to combine our implemented functionalities to create the final solver for the Graph Isomorphism problem.
Introduction We should quickly review the individual functionalities used in the VF2++ algorithm:
Node ordering, which finds an optimal order in which to access the nodes, such that those more likely to match are placed first in the order.
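The ordering heuristic can be sketched roughly as follows. This is my own simplified illustration (rare labels first, then high degree, BFS layer by layer), not the actual NetworkX implementation:

```python
import networkx as nx
from collections import Counter

def rough_node_order(G1, G2, label="label"):
    """A simplified sketch of the ordering idea (not the actual
    NetworkX implementation): visit G1 BFS-layer by BFS-layer,
    preferring rare labels and high degree so that unlikely
    matches fail as early as possible."""
    rarity = Counter(nx.get_node_attributes(G2, label).values())
    key = lambda n: (rarity[G1.nodes[n].get(label)], -G1.degree[n])
    order = []
    start = min(G1.nodes, key=key)  # rarest label, then highest degree
    for layer in nx.bfs_layers(G1, start):
        order.extend(sorted(layer, key=key))
    order.extend([n for n in G1.nodes if n not in set(order)])  # other components
    return order

G1 = nx.path_graph(3)
nx.set_node_attributes(G1, {0: "a", 1: "a", 2: "b"}, "label")
order = rough_node_order(G1, G1)
print(order)  # [2, 1, 0] -- the node with the rare label "b" comes first
```
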

The previous post can be found here, be sure to check it out so you can follow the process step by step. Since then, another two very significant features of the algorithm have been implemented and tested: node pair candidate selection and feasibility checks.
Introduction As previously described, in the ISO problem we are basically trying to create a mapping such that every node from the first graph is matched to a node from the second graph.
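To make that mapping concrete, here is a small example using NetworkX’s existing VF2 matcher (the algorithm VF2++ aims to improve), which exposes one such node-to-node mapping:

```python
import networkx as nx
from networkx.algorithms import isomorphism

# Two copies of the same 4-cycle with different node names.
G1 = nx.cycle_graph(4)
G2 = nx.relabel_nodes(G1, {0: "a", 1: "b", 2: "c", 3: "d"})

gm = isomorphism.GraphMatcher(G1, G2)
print(gm.is_isomorphic())  # True
print(gm.mapping)          # one valid mapping from G1 nodes to G2 nodes
```
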

This post includes all the major updates since the last post about VF2++. Each section is dedicated to a different sub-problem and presents the progress on it so far. General progress, milestones and related issues can be found here.
Node ordering The node ordering is one major modification that VF2++ proposes. Basically, the nodes are examined in an order that speeds up the matching by first visiting those that are more likely to match.

Intro I got accepted as a GSoC contributor, and I am so excited to spend the summer working on such an incredibly interesting project. The mentors are very welcoming, communicative, fun to be around, and I really look forward to collaborating with them. My application for GSoC 2022 can be found here.
About me My name is Konstantinos Petridis, and I am an Electrical Engineering student at the Aristotle University of Thessaloniki.

Welcome! This post is not going to discuss technical implementation details or theoretical work for my Google Summer of Code project, but rather serve as a summary and recap of the work that I did this summer.
I am very happy with the work I was able to accomplish and believe that I successfully completed my project.
Overview My project was titled NetworkX: Implementing the Asadpour Asymmetric Traveling Salesman Problem Algorithm.

My implementation of asadpour_atsp is now working! Recall that my pseudo code for this function from my last post was
    def asadpour_tsp(G)
        Input: A complete graph G with weight being the attribute key for the edge weights.
        Output: A list of edges which form the approximate ATSP solution.

        z_star = held_karp(G)

        # test to see if z_star is a graph or dict
        if type(z_star) is nx.DiGraph
            return z_star.edges

        z_support = nx.

Well, we’re finally at the point in this GSoC project where the end is glimmering on the horizon. I have completed the Held Karp relaxation, generating a spanning tree distribution and now sampling from that distribution. That means that it is time to start thinking about how to link these separate components into one algorithm.
Recall that from the Asadpour paper the overview of the algorithm is
Algorithm 1 An \(O(\log n / \log \log n)\)-approximation algorithm for the ATSP

The heavy lifting I did in the preliminary post certainly paid off here! In just one day I was able to implement sample_spanning_tree and its two helper functions.
krichhoffs This was a very easy function to implement. It followed exactly from the pseudo code and was working with spanning_tree_distribution before I started on sample_spanning_tree.
sample_spanning_tree This function was more difficult than I originally anticipated. The code for the main body of the function only needed minor tweaks to work with the specifics of Python, such as shuffle working in place and returning None, along with some details about how sets behave.
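For context, the matrix tree theorem that krichhoffs revolves around can be sketched in its unweighted form as follows. This toy counter is my own illustration of the theorem, not the actual helper:

```python
import numpy as np
import networkx as nx

def count_spanning_trees(G):
    """Kirchhoff's matrix tree theorem: the number of spanning trees of a
    connected graph equals any cofactor of its Laplacian matrix.
    (Unweighted sketch of the idea, not the actual krichhoffs helper.)"""
    L = nx.laplacian_matrix(G).toarray()
    # delete the first row and column, then take the determinant
    return round(np.linalg.det(L[1:, 1:]))

print(count_spanning_trees(nx.cycle_graph(3)))     # 3
print(count_spanning_trees(nx.complete_graph(4)))  # 16, matching Cayley's formula 4^2
```
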

In order to test the exponential distribution that I generate using spanning_tree_distribution, I need to be able to sample a tree from the distribution. The primary citation used in the Asadpour paper is Generating Random Combinatorial Objects by V. G. Kulkarni (1989). While I was not able to find an online copy of this article, the Michigan Tech library did have a copy that I was able to read.
Does the Kulkarni Algorithm work with Asadpour?

Implementing spanning_tree_distribution proved to have some NetworkX difficulties and one algorithmic difficulty. Recall that the algorithm for creating the distribution is given in the Asadpour paper as
Set \(\gamma = \vec{0}\). While there exists an edge \(e\) with \(q_e(\gamma) > (1 + \epsilon) z_e\): Compute \(\delta\) such that if we define \(\gamma'\) as \(\gamma'_e = \gamma_e - \delta\), and \(\gamma'_f = \gamma_f\) for all \(f \in E \setminus \{e\}\), then \(q_e(\gamma') = (1 + \epsilon/2) z_e\).
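As a side note on how \(\delta\) can be found: assuming \(q_e(\gamma)\) is the probability that \(e\) appears in a tree sampled with weight proportional to \(\prod_{f \in T} e^{\gamma_f}\), lowering \(\gamma_e\) by \(\delta\) scales only that edge's weight by \(e^{-\delta}\), so

\[ q_e(\gamma') = \frac{e^{-\delta} q_e(\gamma)}{e^{-\delta} q_e(\gamma) + 1 - q_e(\gamma)}. \]

Setting this equal to \((1 + \epsilon/2) z_e\) and solving gives the closed form

\[ \delta = \ln \frac{q_e(\gamma) \left( 1 - (1 + \epsilon/2) z_e \right)}{\left( 1 - q_e(\gamma) \right) (1 + \epsilon/2) z_e}, \]

which is positive precisely when \(q_e(\gamma) > (1 + \epsilon/2) z_e\), consistent with the update \(\gamma'_e = \gamma_e - \delta\) shrinking the edge weight. (This is a derivation sketch, not necessarily the exact formula used in the implementation.)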

Finally moving on from the Held Karp relaxation, we arrive at the second step of the Asadpour asymmetric traveling salesman problem algorithm. Referencing Algorithm 1 from the Asadpour paper, we are now on step two.
Algorithm 1 An \(O(\log n / \log \log n)\)-approximation algorithm for the ATSP
Input: A set \(V\) consisting of \(n\) points and a cost function \(c\ :\ V \times V \rightarrow \mathbb{R}^+\) satisfying the triangle inequality.

This should be my final post about the Held-Karp relaxation! Since my last post titled Implementing The Held Karp Relaxation, I have been testing both the ascent method as well as the branch and bound method.
My first test was to use a truly asymmetric graph rather than a directed graph where the cost in each direction happened to be the same. In order to create such a test, I needed to know the solution to any such proposed graphs.
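For illustration (not the actual graphs from my tests), one simple way to build an asymmetric instance with a known optimum is to plant a cheap tour and give every other arc a uniform higher weight, which keeps the triangle inequality intact:

```python
import itertools
import networkx as nx

n = 4
planted = [0, 1, 2, 3]  # the tour we want to be optimal

G = nx.DiGraph()
for u in range(n):
    for v in range(n):
        if u != v:
            G.add_edge(u, v, weight=2)  # baseline cost; triangle inequality holds
for a, b in zip(planted, planted[1:] + planted[:1]):
    G[a][b]["weight"] = 1               # cheap arcs along the planted tour only

def tour_cost(perm):
    return sum(G[a][b]["weight"] for a, b in zip(perm, perm[1:] + perm[:1]))

best = min(tour_cost(list(p)) for p in itertools.permutations(range(n)))
print(best)  # 4 -- only the planted tour uses all four cheap arcs
```

Note that the instance is truly asymmetric: the arc 0 → 1 costs 1 while 1 → 0 costs 2.
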

I have now completed my implementation of the ascent and branch and bound methods detailed in the 1970 paper The Traveling-Salesman Problem and Minimum Spanning Trees by Michael Held and Richard M. Karp. In my last post, titled Understanding the Ascent Method, I completed the first iteration of the ascent method, found an important bug in the find_epsilon() method, and found a more efficient way to determine substitutes in the graph.

It has been far longer than I would have preferred since I wrote a blog post. As I expected in my original GSoC proposal, the Held-Karp relaxation is proving to be quite difficult to implement.
My mentors and I agreed to implement the branch and bound method discussed in Held and Karp’s 1970 paper The Traveling-Salesman Problem and Minimum Spanning Trees, which first required implementing the ascent method because it is used within the branch and bound method.

We are coming to the end of the first week of coding for the Summer of Code, and I have implemented two new, but related, features in NetworkX. In this post, I will discuss how I implemented them, some of the challenges, and how I tested them. Those two new features are a spanning tree iterator and a spanning arborescence iterator.
The arborescence iterator is the feature that I will be using directly in my GSoC project, but I thought that it was a good idea to implement the spanning tree iterator first, as it would be easier and I could directly refer back to the research paper as needed.
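As a usage sketch (assuming the API as it was merged into NetworkX), the spanning tree iterator can be driven like this on a weighted triangle, which has exactly three spanning trees:

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1), (1, 2, 2), (0, 2, 3)])

# Trees are yielded in order of non-decreasing total weight.
for T in nx.SpanningTreeIterator(G):
    print(sorted(T.edges), sum(G[u][v]["weight"] for u, v in T.edges))
```
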

There is only one thing that I need to figure out before the first coding period for GSoC starts on Monday: how to find all of the minimum arborescences of a graph. This is the set \(K(\pi)\) in the Held and Karp paper from 1970 which can be refined down to \(K(\pi, d)\) or \(K_{X, Y}(\pi)\) as needed. For more information as to why I need to do this, please see my last post here.

After talking with my GSoC mentors about what we all believe to be the most difficult part of the Asadpour algorithm, the Held-Karp relaxation, we came to several conclusions:
The Asadpour paper recommends using the ellipsoid method so that their algorithm runs in polynomial time. We do not need polynomial time, just an algorithm with reasonable execution time. An example of this would be the ellipsoid algorithm versus the simplex algorithm.

Now that my proposal was accepted by NetworkX for the 2021 Google Summer of Code (GSoC), I can get more into the technical details of how I plan to implement the Asadpour algorithm within NetworkX.
In this post I am going to outline my thought process for the control scheme of my implementation and create function stubs according to my GSoC proposal. Most of the work for this project will happen in networkx.

Continuing the theme of my last post, we know that the Held-Karp relaxation in the Asadpour Asymmetric Traveling Salesman Problem cannot be practically written into the standard matrix form of a linear program. Thus, we need a different method to solve the relaxation, which is where the ellipsoid method comes into play. The ellipsoid method can be used to solve semi-infinite linear programs, which is what the Held-Karp relaxation is.
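To illustrate why a separation oracle is all the ellipsoid method needs, here is a toy sketch (my own helper name, not NetworkX API or my final implementation): given candidate fractional arc values \(x\), a violated cut constraint \(x(\delta^+(U)) \geq 1\) can be found with minimum cuts.

```python
import itertools
import networkx as nx

def violated_cut(x, tol=1e-9):
    """Toy separation oracle (hypothetical helper): given arc values
    x[(u, v)], return a node set U with x(delta^+(U)) < 1 if one
    exists, else None."""
    H = nx.DiGraph()
    for (u, v), val in x.items():
        H.add_edge(u, v, capacity=val)
    # the cheapest directed cut is the cheapest s-t cut over all ordered pairs
    for s, t in itertools.permutations(H.nodes, 2):
        cut_value, (U, _) = nx.minimum_cut(H, s, t)
        if cut_value < 1 - tol:
            return U
    return None

print(violated_cut({(0, 1): 0.4, (1, 0): 0.4}))  # {0}: x(delta^+({0})) = 0.4 < 1
print(violated_cut({(0, 1): 1.0, (1, 0): 1.0}))  # None: every cut has value >= 1
```
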

In linear programming, we sometimes need to take what would be an integer program and ‘relax’ it, or unbound the values of the variables so that they are continuous. One particular application of this process is the Held-Karp relaxation used in the first part of the Asadpour algorithm for the Asymmetric Traveling Salesman Problem, where we find the lower bound of the approximation. Normally the relaxation is written as follows.
\[
\begin{array}{c l l}
\text{min} & \sum_{a} c(a)x_a \\
\text{s.t.} & x\left(\delta^+(U)\right) \geqslant 1 & \forall\ U \subsetneq V,\ U \not= \emptyset \\
& x\left(\delta^+(v)\right) = x\left(\delta^-(v)\right) = 1 & \forall\ v \in V \\
& x_a \geqslant 0 & \forall\ a
\end{array}
\]