Zip codes are funny things. There are close to 42,000 of them, and when you start plotting them on a map, you realize there's a certain method to how the numbers are set up, such that you can slice the country into ever-smaller chunks depending on how you group Zip codes together.
Computer scientists and cartography nerds have had some fun with Zip codes over the years. Ben Fry's ZipDecode project zooms you in on different regions of the country depending on what numbers you type in, for instance.
Some years ago, Robert Kosara, a research scientist at data visualization software company Tableau, had a question: What would it look like if you drew a single line through all Zip codes in the lower 48 in numeric order? Kosara wrote some code and let it rip, and what he ended up with was a map that clearly delineated state boundaries and gave a reasonable approximation of population density to boot. Since it looked as though it were created by scribbling in arbitrary regions of a U.S. map, he dubbed it the ZIPScribble map.
Kosara ran some calculations and discovered that if you started at the lowest-numbered Zip code (00544, Holtsville, NY) and walked through every Zip code in the continental U.S. in numeric order, all the way up to the highest-numbered Zip code (99403, Clarkston, WA), the path you'd need to take would be roughly 1,155,268 miles long. That naturally brought up a second question: What would be the shortest route you could take through all 37,000 of those Zip codes?
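For a sense of how such a total could be computed, here is a minimal Python sketch. It assumes each Zip code is reduced to a single latitude/longitude point and that each hop is measured as a great-circle distance; the article doesn't say how Kosara measured distance, so both are assumptions. The three sample coordinates are rough, illustrative values, and the function names are made up for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def path_length_in_zip_order(zip_coords):
    """Sum the hops when visiting Zip codes in ascending numeric order.

    Keys are five-digit strings, so plain string sorting matches numeric order.
    """
    ordered = [zip_coords[z] for z in sorted(zip_coords)]
    return sum(haversine_miles(p, q) for p, q in zip(ordered, ordered[1:]))

# Approximate, illustrative coordinates only (not a real Zip code database).
sample = {
    "99403": (46.40, -117.08),  # Clarkston, WA
    "00544": (40.81, -73.05),   # Holtsville, NY
    "10001": (40.75, -73.99),   # Manhattan
}
print(round(path_length_in_zip_order(sample)), "miles")
```

Run over a full table of Zip code coordinates, the same sum would give a figure comparable to the 1,155,268 miles quoted above, though the exact number depends on the data and distance formula used.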
This type of problem actually has a storied history in computer science. It's known as the Traveling Salesman Problem: Say a salesman has a bunch of cities on his route. What's the shortest trip he can take through all of them? This type of computation is used as a benchmark in computer science because it has a lot of applications, from route-finding to the creation of circuit boards, and because it gets complicated really, really quickly. For instance, a network of only 20 points allows roughly 1.2 quintillion (1,200,000,000,000,000,000) possible routes, and the shortest is hiding somewhere among them. That's on the order of the number of grains of sand on Earth.
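The 1.2 quintillion figure lines up with 20!/2, which is the number of one-way routes through 20 points if a route and its exact reverse count as the same trip. That reading is an assumption, since the article doesn't spell out how the routes were counted. A quick check in Python (the function names are just for illustration):

```python
import math

def count_open_routes(n):
    """One-way routes through n points, counting a route and its reverse once."""
    return math.factorial(n) // 2

def count_round_trips(n):
    """Closed loops through n points: fix the starting point, ignore direction."""
    return math.factorial(n - 1) // 2

print(f"{count_open_routes(20):,}")  # 1,216,451,004,088,320,000
print(f"{count_round_trips(20):,}")  # 60,822,550,204,416,000
```

Counted as round trips that return to the start, the total is smaller, about 61 quadrillion, but either way every added point multiplies the count, which is why brute force stops being an option almost immediately.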
What, then, of a traveling salesman problem with more than 37,000 points?
Kosara took a crack at it. He called it the Traveling Presidential Candidate Problem, after a hypothetical presidential candidate who wanted to visit all 37,000 contiguous Zip codes to clinch the nomination.
Rather than try to solve the problem directly, he tried to approximate a "pretty good" solution: not the absolute best path among all Zip codes, but one that was pretty close. He ended up with a solution that was about 84 percent optimal, or roughly 18 percent longer than the absolute shortest possible route would be. Here's what that looked like:
That cut the total distance traveled between all points down from 1,155,268 miles to 254,886 miles, a 4.5-fold reduction, which is pretty good!
But not great. Computing has advanced quite a bit in the 15 years since Kosara first took on the Traveling Presidential Candidate Problem, and some researchers have come up with algorithms capable of solving Traveling Salesman problems with close to 100,000 points.
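To give a flavor of how route-shortening heuristics work in general, here is a small Python sketch of two textbook techniques: a greedy "nearest neighbor" pass to build a first route, followed by "2-opt," which keeps reversing stretches of the route whenever that makes it shorter. This is not Kosara's code and not the newer algorithm he used; it is a generic illustration that assumes flat x/y coordinates and random made-up points.

```python
import math
import random

def tour_length(points, order):
    """Length of the one-way path that visits points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

def nearest_neighbor(points, start=0):
    """Greedy construction: always hop to the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        here = points[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(here, points[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

def two_opt(points, order):
    """Reverse interior segments for as long as doing so shortens the path."""
    order = order[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(order) - 2):
            for j in range(i + 1, len(order) - 1):
                a, b = points[order[i - 1]], points[order[i]]
                c, d = points[order[j]], points[order[j + 1]]
                # Reversing order[i..j] swaps edges a-b and c-d for a-c and b-d.
                old = math.dist(a, b) + math.dist(c, d)
                new = math.dist(a, c) + math.dist(b, d)
                if new < old - 1e-12:
                    order[i:j + 1] = reversed(order[i:j + 1])
                    improved = True
    return order

random.seed(0)  # made-up points, reproducible
points = [(random.random(), random.random()) for _ in range(100)]
greedy = nearest_neighbor(points)
better = two_opt(points, greedy)
print(f"in listed order:  {tour_length(points, list(range(len(points)))):.2f}")
print(f"nearest neighbor: {tour_length(points, greedy):.2f}")
print(f"after 2-opt:      {tour_length(points, better):.2f}")
```

Simple methods like these get stuck in "good enough" routes; solvers that come within a fraction of a percent of the optimum rely on considerably more sophisticated machinery.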
So this week Kosara redid his Traveling Presidential Candidate map using one of these newer algorithms (and lucky for us, just in time for summer campaign season). Shaving some 40,000 miles off his previous best for a total of 214,916 miles, the new solution is only 0.006 percent longer than a theoretical optimum solution would be. Here's what it looks like:
While the old version optimized the path by forcing it through a rather chunky grid, the new version follows a more fluid line. It hews more closely to the denser Zip codes near population centers. Here, for instance, is the Acela corridor in the Northeast:
And here's what Los Angeles looks like — note the jog out to Catalina Island:
You can read more about Kosara's process to create the map, play with an interactive version of it and even download all the code for yourself. In the meantime, don't hold your breath for either of this year's actual presidential candidates to visit all 37,000 Zip codes in the lower 48.