The Traveling Salesman Problem (TSP) is among the most popular and most studied combinatorial optimization problems, starting with von Neumann in 1951. It has driven the discovery of several optimization techniques such as cutting planes, branch-and-bound, local search, Lagrangian relaxation, and simulated annealing.
The last five years have seen the emergence of promising techniques in which (graph) neural networks learn new combinatorial algorithms. The central question is whether deep learning can learn better heuristics from data, i.e., replace human-engineered heuristics. This is appealing because developing algorithms to tackle NP-hard problems efficiently may require years of research, and many industrial problems are combinatorial by nature.
In this work, we propose to adapt the recent successful Transformer architecture, originally developed for natural language processing, to the combinatorial TSP. Training is done by reinforcement learning, hence without TSP training solutions, and decoding uses beam search. We report improved performance over recent learned heuristics, with an optimality gap of 0.004% for TSP50 and 0.50% for TSP100.
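To make the decoding step concrete, the following is a minimal sketch of beam-search decoding for a learned TSP policy. It is not the paper's implementation: the `log_probs` callable is a hypothetical stand-in for a trained Transformer decoder that, given the partial tours built so far, returns per-city log-probabilities for the next step; the toy `uniform_logp` function used in the demo is likewise an assumption for illustration only.

```python
import torch


def beam_search_tsp(coords, log_probs, beam_size=5):
    """Decode a tour for one TSP instance with beam search.

    coords   : (n, 2) tensor of city coordinates.
    log_probs: callable(partial_tours, coords) -> (B, n) log-probabilities
               over the next city for each partial tour (placeholder for a
               trained Transformer decoder).
    """
    n = coords.size(0)
    tours = torch.zeros(1, 1, dtype=torch.long)   # start every beam at city 0, shape (B, t)
    scores = torch.zeros(1)                       # cumulative log-probability per beam
    for _ in range(n - 1):
        logp = log_probs(tours, coords)           # (B, n) next-city log-probs per beam
        logp = logp.scatter(1, tours, float("-inf"))   # mask cities already visited
        cand = scores.unsqueeze(1) + logp              # total score of each (beam, city) pair
        flat = cand.view(-1)
        k = min(beam_size, int((flat > float("-inf")).sum()))
        scores, idx = flat.topk(k)                # keep the k best expansions
        beam_idx, city_idx = idx // n, idx % n
        tours = torch.cat([tours[beam_idx], city_idx.unsqueeze(1)], dim=1)
    return tours[scores.argmax()]                 # highest-scoring complete tour


if __name__ == "__main__":
    torch.manual_seed(0)
    coords = torch.rand(10, 2)                    # random TSP10 instance

    def uniform_logp(tours, coords):              # toy stand-in: equal score for every city
        return torch.zeros(tours.size(0), coords.size(0))

    print(beam_search_tsp(coords, uniform_logp, beam_size=3))
```

In this sketch the beam keeps the `beam_size` highest-probability partial tours at every step instead of committing greedily to a single city, which is why beam search typically narrows the optimality gap relative to greedy decoding at a modest extra cost.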