Working with maps-based applications is interesting on a number of levels. One of the more interesting aspects is the notion of Location versus the notion of Locality.
Location is easy and straightforward. Most things you can put on a map occupy a single, discrete point. If you are skeptical at all about the sheer quantity of things that fall into this category, go read up on the GoogleMapsMania blog.
Anyway, lots of things have location. If you have an address but don't know the latitude and longitude, you can find them easily through the process of geocoding. Geocoding (at least in the United States and Canada) is fast, reliable, and free. Once you've geocoded an address, you have a single, discrete latitude and longitude which you can then plot on your map.
Location in your schema
As you design your schema for a map-based app, the location aspect will typically manifest itself as a couple of columns in your database, for example in your places table:
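As a rough sketch of what those columns might look like, here is a places table built with Python's built-in sqlite3 module. The column names (address, lat, lng) and the sample row are assumptions for illustration, not taken from any particular app, and the coordinates are approximate.

```python
import sqlite3

# In-memory database so the sketch runs without touching disk.
conn = sqlite3.connect(":memory:")

# Hypothetical places table: the column names are assumptions.
conn.execute(
    """
    CREATE TABLE places (
        address TEXT NOT NULL,   -- the human-readable address that was geocoded
        lat     REAL NOT NULL,   -- latitude in decimal degrees
        lng     REAL NOT NULL    -- longitude in decimal degrees
    )
    """
)

# Sample row; the coordinates are approximate and for illustration only.
conn.execute(
    "INSERT INTO places (address, lat, lng) VALUES (?, ?, ?)",
    ("1 Ferry Building, San Francisco, CA", 37.7955, -122.3937),
)

row = conn.execute("SELECT address, lat, lng FROM places").fetchone()
print(row)
```

Everything here is pure location: one address in, one latitude/longitude pair out, with no opinion yet about how places are grouped.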
What about locality?
Locality is a little more complex. Locality deals with how you will organize the geo-spatial information in your app, and how you will allow users to navigate through it. Here's an example: I live in San Francisco. Geographically, we think of our area as the San Francisco Bay Area, which comprises a lot of cities -- Oakland and Berkeley on the east, Tiburon and Sausalito to the north, San Mateo and Palo Alto to the south, and so on. Going a little more granular, our area is broken up into the "North Bay," "Peninsula," "East Bay," etc. So, if we are looking for ways to let people navigate through our spatial data in terms that are familiar and useful, these are the labels we would like to give people: "North Bay," "Peninsula," and so on.
There's the problem, and the reason that locality starts to look complicated. There's no good programmatic way to place locations into these familiar, human-oriented notions of locality. When you geocode an address, Google's Geocoder never returns the result, "Yeah, it's smack dab in the middle of the peninsula."
And that's the crux of the difference between Location and Locality: Location is a precise, computer-oriented thing. Whether it's represented by an address or a latitude/longitude, there's ultimately no ambiguity about where it is. Locality, on the other hand, is a human-oriented notion. It's based on how people in a region think about the geographic areas around them, and it's not easy to deal with in a programmatic way.
Furthermore, Locality changes with the scope of your data. If you only have data in the city of San Francisco, then individual neighborhoods may be the relevant language of locality. If you have data throughout the Bay Area, the language is different -- North Bay, Peninsula, etc. If your data extends throughout California, or the whole United States, then the notions (and terminology) of Locality change again.
Locality in your schema
When you're designing your schema, it's easy to mix up location and locality. For example, you could add a city_name (string) to your places table, and it feels like you're just extracting an existing piece of data (the city name from the geocoding process, which you already have in the address field anyway) for easier reference. In fact, you're probably stepping towards locality at this point: the next step is to normalize the city_name into its own table, and refer to instances of city through a city_id in your places table. In doing so, you just created a city-centric navigational paradigm for your app.
Now there's nothing wrong with a city-centric navigational paradigm. It works fine for lots of apps. I've found, however, that being aware of location vs. locality as separate concepts has cleared up a lot of confusion I had with my own location-based schemas. So, if I have two tables, places and cities:
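A minimal sketch of what such a pair of tables might look like, again using Python's sqlite3 module. The column names, the sample rows, and the places_in_city helper are all assumptions made for illustration; only the city_id reference from places to cities comes from the description here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: three location columns plus a city_id for locality.
conn.executescript(
    """
    CREATE TABLE cities (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE places (
        address TEXT NOT NULL,
        lat     REAL NOT NULL,
        lng     REAL NOT NULL,
        city_id INTEGER NOT NULL REFERENCES cities(id)
    );
    """
)

# Made-up sample data; coordinates are approximate.
conn.execute("INSERT INTO cities (id, name) VALUES (1, 'San Francisco')")
conn.execute("INSERT INTO cities (id, name) VALUES (2, 'Oakland')")
conn.executemany(
    "INSERT INTO places (address, lat, lng, city_id) VALUES (?, ?, ?, ?)",
    [
        ("Example address A, San Francisco, CA", 37.7955, -122.3937, 1),
        ("Example address B, Oakland, CA", 37.8044, -122.2712, 2),
    ],
)


def places_in_city(conn, city_name):
    """City-centric navigation: return the addresses of places in one city."""
    rows = conn.execute(
        """
        SELECT p.address
        FROM places AS p
        JOIN cities AS c ON c.id = p.city_id
        WHERE c.name = ?
        ORDER BY p.address
        """,
        (city_name,),
    )
    return [r[0] for r in rows]


print(places_in_city(conn, "Oakland"))
```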
The first three columns in places represent location. The city_id column, and the cities table to which it refers, represent locality.
Locality beyond city-centric
One of the problems with city-centric navigation is that if two cities are butted right up against one another (as is common in metro areas), a search for places in one city might not show a place in a nearby city, even though the actual geographic distance is small (say, half a mile).
My approach to the locality problem has been to provide city-based navigation, but also provide links to nearby cities (ordered by distance) to let users navigate in a semi-spatial manner when needed.
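One way the "nearby cities, ordered by distance" idea could be sketched is with the haversine great-circle formula over one representative point per city. The CITIES dictionary, its approximate coordinates, and the function names below are assumptions for illustration; a real app would read city centers from its own cities table.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two lat/lng points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical city centers (approximate coordinates).
CITIES = {
    "San Francisco": (37.7749, -122.4194),
    "Oakland": (37.8044, -122.2712),
    "Berkeley": (37.8716, -122.2727),
    "San Mateo": (37.5630, -122.3255),
    "Palo Alto": (37.4419, -122.1430),
}


def nearby_cities(name, limit=3):
    """Return up to `limit` other city names, closest first."""
    lat, lng = CITIES[name]
    by_distance = sorted(
        (haversine_miles(lat, lng, other_lat, other_lng), other)
        for other, (other_lat, other_lng) in CITIES.items()
        if other != name
    )
    return [other for _, other in by_distance[:limit]]


print(nearby_cities("San Francisco"))
```

Measuring from city centers is a simplification: two places on either side of a city boundary can be much closer to each other than their city centers are.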
Some other approaches are:
- organize cities into more human-centric groups where appropriate. An example is to group San Jose, Santa Clara, and Sunnyvale together as "South Bay" cities. If you only have a few cities, you might be able to do this manually. If you have more than a few cities, or if users can create cities on the fly, you'll have to figure out some other way of establishing such groupings. You might be able to do it programmatically (through a clustering algorithm), or in Web 2.0 style (by asking users to tag according to metro area). Both approaches have challenges, and if you come up with a good way of doing this, let me know.
- punt on locality by ONLY providing a maps-based interface, and having users enter a zip code or address to jump to different areas on the map. A lot of mashups use this approach, but you'll run into limitations pretty quickly.
- dig into the "official" data available for this. The U.S. Office of Management and Budget (OMB) defines "Metropolitan and micropolitan statistical areas," and makes the data available for download: http://www.census.gov/population/www/estimates/metrodef.html. Can this be parsed into something which is useful for navigation through a geo-spatial dataset? I don't know if the data corresponds well enough to the way people think about areas to make it worthwhile. If you've worked with this data, drop me a note.
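To make the clustering idea from the first option a bit more concrete, here is one possible sketch: link any two cities whose centers are within a fixed distance, then treat each connected set as a group. This is simple single-linkage grouping with a union-find structure, chosen only because it is short; it is not a recommendation of a particular algorithm. The city coordinates (approximate), the 6-mile threshold, and the function names are all assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lng) pairs in degrees."""
    (lat1, lng1), (lat2, lng2) = a, b
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def group_cities(cities, max_miles):
    """Group cities so that each one is within max_miles of another in its group."""
    names = sorted(cities)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the groups of every pair of cities that are close enough.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if haversine_miles(cities[a], cities[b]) <= max_miles:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical city centers (approximate coordinates).
CITIES = {
    "San Jose": (37.3382, -121.8863),
    "Santa Clara": (37.3541, -121.9552),
    "Sunnyvale": (37.3688, -122.0363),
    "Oakland": (37.8044, -122.2712),
    "Berkeley": (37.8716, -122.2727),
}

for group in group_cities(CITIES, max_miles=6):
    print(group)
```

A sketch like this only produces the groups. Attaching a human label such as "South Bay" to each one still needs a person, and the result is sensitive to the threshold: too small and nothing merges, too large and a chain of neighboring cities collapses the whole region into one group.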
Summary
- location and locality are two different things. Location is precise, locality is human-oriented.
- cities are a natural starting point for working with locality.
- the notion of locality in your application will vary depending upon the breadth and density of your data. Your scale of locality might be neighborhoods in New York, or it might be countries in Europe.
- there are several options for providing navigation through geo-spatial data, ranging from simple (provide links to nearby cities, ordered by distance) to complex (dig into government data on metropolitan areas).