Walking Path Optimization

  • by

During the pandemic, I took the opportunity to take regular walks in my neighborhood. I found this a great time to unwind and also explore the area I live in. After a few weeks, the routes began to be repetitive. I had a certain range of distances I liked to walk (1 to 2 miles at a time), and I liked to walk in loops so that I wasn’t retracing my steps the whole time. I was looking for a way to plan these loops ahead of time and none of the tools I could find online met my requirements. Therefore, I wrote my own walking path generator.


Right off the bat, I identified this as a graph network problem. A street map is at its core, a set of nodes (intersections) and edges (the roads that connect them). A solution path would have the following requirements:

  • A cycle
  • Total path length within a given minimum and maximum
  • Start and end at my house (since it’s a cycle, just has to intersect with the node nearest my house)

To download the graph, I used the osmnx python library. I limited the size to returning only nodes that were with in the max_path / 2 diameter, since that should be expansive enough to include even paths that went straight away from the starting point and then straight back. Anything further away would not have a possible cycle that met the required conditions. For privacy sake, I used the address of MA France, a bakery I like to visit as the starting location, instead of my actual home address. The initial resulting graph looks like this, with the starting point shown in red.

Finding Cycles through the Origin Node

Since I knew that the origin node (closest to the starting address) would need to be included in each of the output paths, I started there. Creating a tree of connected nodes with a breadth-first search. Now I can keep track of the shortest path from the origin point to any node in the graph. Finally, I can loop over all nodes, lookup the distance to that node, and see the distance to any of its neighboring nodes plus the path to get there falls within my cycle length requirement. In pseudocode it looks something like:

For all nodes N:
   Look up distance from origin to node N: dist_to_N.
   Loop over neighbors of node N, M:
       Loop up distance from origin to node M: dist_to_M
           cycle_length = dist_to_N + dist_to_M + dist_N_to_M
           If cycle_length meets path length requirements, save.

This operation does indeed generate cycles that fall within the distance requirements! However, it generates a lot of very low quality results.

For example, there is no constraint to find cycles that minimized overlap. So a number of paths looked like this, with long skinny paths that double back on themselves.

It’s also worth noting that this generated just over a thousand possible routes between 1 and 2 miles long. Technically they aren’t unique routes because you could find the same path going one direction and going the other direction, so in reality there are half as many undirected routes as were generated, so about 500 possible low-quality cycles within the distance constraints.

Finding cycles that don’t double back on themselves

A simple solution to avoiding switchbacks would be to set a hard requirement that the two paths that are being joined together to make the cycle do not have any nodes in common. The problem with this solution is that there may exist no cycles that do not have at least some overlap. Imagine the scenario where you live in a cul-de-sac at the dead end of a street. Necessarily there will be some overlap in any cycles. So we will need to use an overlap heuristic and look for the paths with the minimum or near-minimum overlap.

Implementation detail: It’s at this point where I changed from keeping track of a possible solution as a list of nodes, and switched to using a small class. This means I can easily add properties, such as overall path length and percentage of self-overlap, which I can then use to rank routes to keep only the best ones.

This results in much better paths! Here are a handful of the routes with the least amount of overlap:

Now we are starting to be in business!

These routes are great if you just want to find a single loop to take. If you want to pick a collection of routes, doing a new one everyday (like me), then it starts to matter if the routes are pretty redundant. For example take a look at these routes which all look very similar, but are in fact slightly different:

Finding a set of cycles that look different from each other

To get the best collection of routes, we need another heuristic which will measure overlap between two different routes. Then the best set of N routes would be the collection which minimizes the total sum of pairwise overlaps over the N routes.

Since picking the right N number of routes depends on the geometry of the roads, I opted to instead use a threshold which set the maximum amount of overlap that would be allowed with another route in the collection. So if the house was in an area with a bunch of very different possible routes, they would all show up.

With this last addition, we are now ready to generate the best walking routes. The specifications I used are as follows:

  • Total path length between 1 and 2 miles long
  • Only 30% of the path may loop back on itself
  • Each route can only overlap up to 50% of its edges with all other routes (i.e. at least half of the route must be on previously unseen roads)

These parameters resulted in the following 11 routes:

Note: The routes that look like they double back on themselves are because there is a divided road, so it’s showing going along one side and then crossing the street and coming back on the other. The algorithm is performing as expected.

Time to go take a walk!