Population centers

Let's say you're an employee at a major theme park company, and you're about to launch a new kind of theme park that everyone in the US is going to want to visit. Where should you build it to minimize the root mean squared distance to every person in the US? What if you can build two or three copies of the theme park?

Here's a map of the US where every county has a circle whose area is proportional to its population:

To minimize the squared distance to each person with only one location, you should build in Hodgeman County, Kansas, a county with just under 2,000 people. Unsurprisingly, it splits the difference between the West Coast and the East Coast, and it is not too far from Texas and Chicago.

If you can build two identical theme parks, you're going to want to build in Nye County, Nevada and White County, Tennessee. The Tennessee location kind of makes sense (sort of the center of New York City, Chicago, Texas, and Florida?), but I was surprised the Nevada location wasn't closer to Los Angeles. Maybe Seattle and San Francisco pull it further north? Interestingly, if you look at how many people would be closer to each location, the Tennessee location wins by a wide margin, around 243 million to 77 million!

If you have the cash for three identical theme parks, one should still be in Nye County, Nevada (neat!), and the other two should be in Trumbull County, Ohio and St. Bernard Parish, Louisiana. The Ohio location is basically halfway between New York City and Chicago, and the Louisiana one does a decent job of covering Texas and Florida. Here the Ohio location is closer to the most people, around 156 million as compared to 89 million for the Louisiana location and 76 million for the Nevada location.
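Here's a minimal sketch of how the multi-park case can be scored, assuming each person goes to whichever park is nearest. It tries every combination of candidate centroids, then tallies how many people end up closest to each park. The function names and the toy data are made up for illustration, and this is not necessarily how the project's code searches. With a few thousand real counties, exhaustive search over triples gets expensive, so a real run would likely need something smarter.

```python
import math
from itertools import combinations

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def total_cost(parks, counties):
    """Population-weighted squared distance from each county to its nearest park."""
    return sum(
        pop * min(haversine_miles(p[1], p[2], lat, lon) for p in parks) ** 2
        for _, lat, lon, pop in counties
    )

def best_k_locations(counties, k):
    """Exhaustively try every k-subset of county centroids (fine for toy data only)."""
    return min(combinations(counties, k), key=lambda parks: total_cost(parks, counties))

def population_by_nearest_park(parks, counties):
    """Sum the population of the counties that are closest to each park."""
    totals = {park[0]: 0 for park in parks}
    for _, lat, lon, pop in counties:
        nearest = min(parks, key=lambda p: haversine_miles(p[1], p[2], lat, lon))
        totals[nearest[0]] += pop
    return totals

# Hypothetical counties as (name, lat, lon, population); not real census data.
toy = [
    ("A", 34.0, -118.0, 5),
    ("B", 37.0, -122.0, 3),
    ("C", 40.0, -75.0, 8),
    ("D", 42.0, -88.0, 4),
]

parks = best_k_locations(toy, 2)
print([p[0] for p in parks])                    # ['A', 'C']
print(population_by_nearest_park(parks, toy))   # {'A': 8, 'C': 12}
```

In the toy data, the two western counties share one park and the two eastern counties share the other, which is the same kind of split behind the "how many people are closer to each location" numbers above.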

About this project: Find the code (with more technical discussion) at GitHub, and see some non-technical discussion here.
There are a lot of assumptions here. For one thing, we're assuming every person in a county lives at the geographic center of the county. That probably doesn't have a huge effect on the numbers, but who knows. For another thing, we're only considering these geographic centers as candidate locations.
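Under those two assumptions, the single-park problem becomes a small brute-force search: score each county centroid by the population-weighted sum of squared distances to every other centroid and keep the lowest. The sketch below assumes great-circle (haversine) distance and a list of `(name, lat, lon, population)` tuples. Both are illustrative choices, the data is invented, and the actual code on GitHub may do this differently. Minimizing the weighted sum of squares picks the same site as minimizing the root mean squared distance, since the square root and the division by total population don't change the ordering.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def best_single_location(counties):
    """Return the county whose centroid minimizes population-weighted squared distance.

    Everyone in a county is treated as living at its centroid, and only
    centroids are considered as candidate sites.
    """
    def cost(candidate):
        _, clat, clon, _ = candidate
        return sum(
            pop * haversine_miles(clat, clon, lat, lon) ** 2
            for _, lat, lon, pop in counties
        )
    return min(counties, key=cost)

# Hypothetical toy data as (name, lat, lon, population); not real census figures.
toy = [
    ("West", 34.0, -118.0, 10),
    ("Middle", 38.0, -99.0, 1),
    ("East", 40.0, -75.0, 10),
]

print(best_single_location(toy)[0])  # Middle
```

Even though "Middle" has almost nobody living in it, it wins because it splits the difference between two equally large coasts, much like the Kansas result.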

I considered using some kind of custom distance function that acknowledged that once you get more than, say, 200 miles away, it's a big jump because it's probably an all-day trip to get there. Once you get over, say, 300 miles, most people are probably traveling by plane, so it's not too much harder to go a bit farther.
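To make that idea concrete, here is one shape such a function could take. This is only a sketch and wasn't used for any of the results on this page. The 200- and 300-mile thresholds come from the paragraph above, but the size of the jump (`all_day_penalty`) and the reduced per-mile rate for flying (`flight_rate`) are numbers I made up for illustration.

```python
def travel_cost(miles, day_trip_limit=200.0, flight_threshold=300.0,
                all_day_penalty=150.0, flight_rate=0.25):
    """Hypothetical 'effort' of a trip, in mile-equivalents.

    - Up to day_trip_limit: cost is just the distance.
    - Past day_trip_limit: a one-time jump, because it becomes an all-day trip.
    - Past flight_threshold: each extra mile costs only flight_rate,
      because most people fly and going a bit farther is not much harder.
    """
    if miles <= day_trip_limit:
        return miles
    if miles <= flight_threshold:
        return miles + all_day_penalty
    return flight_threshold + all_day_penalty + flight_rate * (miles - flight_threshold)

for d in (100, 250, 300, 700):
    print(d, travel_cost(d))
# 100 100
# 250 400.0
# 300 450.0
# 700 550.0
```

A function like this could replace the squared distance in the cost being minimized. The answers would then depend heavily on the made-up constants.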

Obviously, if you were doing this for real, you'd care a lot about available land, nearby highways, and big airports!


Here are the results if you minimize plain distance instead of squared distance - you can see that not much changes:

For one location, the result is Stevens County, Kansas.

For two locations, the results are San Bernardino County, California and Russell County, Kentucky. The interesting part is that the Nevada location (in the squared case) has moved to be right next to Los Angeles. My guess is that there are so many people there that it's now worth letting Seattle be a bit farther away, since the distance is no longer squared.

For three locations, the results are San Bernardino County, California, Livingston Parish, Louisiana, and Lawrence County, Pennsylvania. Again, very similar to the squared case except for the Nevada location moving to California.
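A tiny one-dimensional example shows why dropping the square can pull a site toward the biggest city. The "coast" below is entirely hypothetical: a big city at mile 0, a nearly empty county at mile 400, and a smaller city at mile 1000. The `power` parameter switches between plain distance (1) and squared distance (2).

```python
def best_site(points, power):
    """Pick the position that minimizes population-weighted distance**power.

    `points` is a list of (position_in_miles, population); only the listed
    positions are considered as candidate sites.
    """
    def cost(site):
        return sum(pop * abs(site - pos) ** power for pos, pop in points)
    return min((pos for pos, _ in points), key=cost)

# Hypothetical coastline: (mile marker, population in arbitrary units).
coast = [(0, 10), (400, 1), (1000, 4)]

print(best_site(coast, power=1))  # 0   (plain distance: sit on the big city)
print(best_site(coast, power=2))  # 400 (squared distance: compromise in between)
```

With squared distance, the far-away smaller city is penalized so heavily that the best site drifts toward it. With plain distance, the site just sits where most of the people are, which matches the guess above about Los Angeles and Seattle.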