A classy approach to solving Traveling Salesman Problems effectively with Python

Here we won’t start from scratch. As mentioned earlier, we have already developed the code that builds a Pyomo model of the TSP and solves it in sprint 3. And trust me, that was the most difficult part. Now, our task is to organize what we have done in a way that makes it more general, hiding the details while keeping the essential elements visible. In other words, we want the optimizer to appear as a “magic box” that even users who are not familiar with math modeling can use to solve their TSP problems intuitively.

4.1. TravelingSalesmanOptimizer design

Our optimizer class will have both “core” methods, which do the majority of the work, and “superficial” methods, which serve as the high-level interface of the class and invoke the core methods underneath.

The following steps form the core logic of the optimizer:

1. Create a Pyomo model using a distance matrix. This is accomplished by the _create_model method, which wraps the code we have already developed. It takes a dataframe of a distance matrix as input and builds a Pyomo model based on it. The only significant difference between what we have done and what we are doing now is that the initial site is no longer hard-coded as “hotel”. Instead, it is assumed to be the site of the first row in the df_distances dataframe. In the general case, the initial site is taken to be the first site in the coordinates dataframe (df_sites). This generalization allows the optimizer to solve any instance of the TSP.

2. Attempt to solve the model. This is done in the _optimize method, which is inherited from the BaseOptimizer class. It returns True only if a solution is found.

3. Extract the solution from the model and parse it in a way that is easy to interpret and use. This is done inside the _store_solution_from_model method, which inspects the solved model and extracts the values of the decision variables (tour_) and the value of the objective function (tour_distance_) to create the corresponding attributes. This method is only invoked if a solution exists. If no solution is found, the “solution attributes” tour_ and tour_distance_ are never created. This is beneficial because the presence of these attributes after fitting informs the user of the existence of a solution. Additionally, the optimal values of the variables and objective can be conveniently retrieved at any point, not just at the moment of fitting.

The last two steps (finding and extracting the solution) are wrapped inside the final core method, _fit_to_distances.

“But wait,” you might think, “doesn’t _fit_to_distances require distances as input? Isn’t our goal to solve TSP problems using only coordinates, not distances?” Yes, that’s where the fit method comes in. We pass coordinates to it, and it uses the GeoAnalyzer to construct the distance matrix. This distance matrix is then processed by _fit_to_distances. This way, if the user doesn’t want to calculate the distances themselves, they can delegate the task by using the fit method. If they prefer to use custom data, they can assemble it in a df_distances dataframe and pass it to _fit_to_distances instead.

4.2. TravelingSalesmanOptimizer implementation

Let’s follow the design outlined above to gradually build the optimizer. First, we’ll create a minimalist version that only builds a model and solves it, without any solution parsing yet. The __repr__ method allows us to see the name and number of sites the optimizer contains when we print it.

“`html
from typing import Tuple, List


class TravelingSalesmanOptimizer(BaseOptimizer):

    """

    Implements the Miller–Tucker–Zemlin formulation [1] of the Traveling Salesman Problem (TSP) as a linear integer program.

    The TSP can be stated like: "Given a set of locations (and usually their pair-wise distances), find the tour of minimal distance that traverses all of them exactly once and ends at the same location it started from.

    For a derivation of the mathematical model, see [2].
    Parameters

    ----------

    name : str

        Optional name to give to a particular TSP instance
    Attributes

    ----------

    tour_ : list

        List of locations sorted in visit order, obtained after fitting. To avoid duplicity, the last site in the list is not the initial one, but the last one before closing the tour.

    tour_distance_ : float

        Total distance of the optimal tour, obtained after fitting.
    Example

    --------

    >>> tsp = TravelingSalesmanOptimizer()

    >>> tsp.fit(df_sites)  # fit to a dataframe of geo-coordinates

    >>> tsp.tour_  # list of sites sorted by visit order
    References

    ----------

    [1] https://en.wikipedia.org/wiki/Travelling_salesman_problem

    [2] https://towardsdatascience.com/plan-optimal-trips-automatically-with-python-and-operations-research-models-part-2-fc7ee8198b6c

    """

    def __init__(self, name=""):

        super().__init__()

        self.name = name
    def _create_model(self, df_distances: pd.DataFrame) -> pyo.ConcreteModel:

        """

        Given a pandas dataframe of a distance matrix, create a Pyomo model of the TSP and populate it with that distance data

        """

        model = pyo.ConcreteModel(self.name)

        # a site has to be picked as the "initial" one, doesn't matter which really; by lack of better criteria, take first site in dataframe as the initial one

        model.initial_site = df_distances.iloc[0].name
        #=========== sets declaration ===========

        # list_of_sites = df_distances.index.tolist()

        model.sites = pyo.Set(initialize=list_of_sites, domain=pyo.Any, doc="set of all sites to be visited (𝕊)")

...
“`

Let’s quickly check how the optimizer behaves. When instantiated, the optimizer doesn’t contain any sites, as shown by the representation string, and it doesn’t have an internal model. It is not fitted yet.

“`html
tsp = TravelingSalesmanOptimizer("trial 1") print(tsp) # Output: TravelingSalesmanOptimizer(trial 1, n=0) print(tsp.is_model_created, tsp.is_fitted) # Output: (False, False)
“`

We can fit it to the distance data and if we don’t get a warning, it means everything went well. Now, the representation string tells us that we provided 9 sites, there’s an internal model, and

Source link