Hey, want to sort your query by geographic distance?
Nov 14 by
Andre
Sure you do. Here's a sample Rails action which finds the ten closest cities to the latitude/longitude you provide. Obviously, this presumes your City model has the fields latitude and longitude. These examples are for MySQL 5.
def nearby_citiesThe key thing is the distance_sql, which does all the heavy lifting. You can take this basic formula and use it verbatim for any model with latitude/longitude columns -- just make sure you change the table_name to match your model.
lat,lng=params[:ll].split(',').collect{|e|e.to_f}
#convert to radians
lat_radians=(lat/180) * Math::PI
lng_radians=(lng/180) * Math::PI
# 3963 is the earth's radius, more or less, in miles.
# Which means that the distances you get out of this are in miles.
# Want a different measure? Kilometers=6378. Nautical miles=3444. Furlongs=29544
earths_radius = 3963
table_name=City.table_name
distance_sql=<<-SQL_END (acos(cos(#{lat_radians})*cos(#{lng_radians})*
cos(radians(#{table_name}.latitude))*cos(radians(#{table_name}.longitude)) +
cos(#{lat_radians})*sin(#{lng_radians})*cos(radians(#{table_name}.latitude))*
sin(radians(#{table_name}.longitude)) +
sin(#{lat_radians})*sin(radians(#{table_name}.latitude))) * #{earths_radius})
SQL_END
cities = City.find(:all,
:select=>"*, #{distance_sql} as distance",
:order => 'distance asc',
:limit => 10)
render :text=>cities.collect{|c|c.attributes}.to_json
end
Find within Radius
So, what if you want to constrain the results to a given radius? No problem. This finds all cities within a 25 mile radius:cities = City.find(:all,Note that when you use distance in :conditions, you have to repeat distance_sql, whereas if you use distance in your order clause, you can simply reference the alias ("distance") which you gave the computed column in the select clause.
:select=>"*, #{distance_sql} as distance",
:conditions=>"#{distance_sql} <= 25",
:order => 'distance asc')

Comments
Jon on Nov 14
Andre,
Are your lat/lon columns floats? How many cities do you have in your database? I'm interested in using this on a table that would have 10's of thousands of records, and wonder about how (in)efficient this query would be on that many records.
BTW, briefly previewing a comment, and then posting it triggers your flood control on the blog. That might be just a little too agressive. ;)
Andre on Nov 14
Jon, I've used this technique with the columns as either floats or numeric with a longer precision. My production environment only has ~1,100 points though.
With a large dataset, you'll want to restrict the query by a bounding box, so the DB has a smaller set of points to do the heavy lifting on. The initial bounding operation works directly on indexes you've put on the lat/lng columns, so it's fast (in theory).
The next step (which I haven't done yet) is to get some query analyzer tools to ensure that MySQL is optimizing the query execution correctly in this scenario -- i.e., it's first restricting the result set to the bounding box, and then doing the more expensive distance calculations on the smaller set of points.
BTW, you can also check out MySQL's geospatial plugin -- http://dev.mysql.com/doc/refman/5.0/en/gis-introduction.html -- but I found this to be a dead end, with some of the key documented features actually missing in implementation!
Sean on Nov 15
Sorry, my trig is rusty. Does this take into account the fact that distances along latitudes become smaller the further away they are from the equator?
Andre on Nov 15
Sean, yes it does. Otherwise it would be straight-up pythagorean all the way.
Jon on Nov 18
FYI, I set it up and ran some tests.
For my test the lat/lon columns are decimals with :precision => 10, :scale => 7, and an index was added on these two columns.
On a 2Ghz 64bit CPU with 1M cache and over 150,000 test points distributed all over the globe in the table, querying the nearest 100 points to any arbitrary point takes less than a second. Note: that this is *without* any bounding box. When I increase the limit to 10,000 the query slows to a crawl, taking a whopping one and a half.... seconds... or so to complete. ;) Since I'll only every be intested in the nearest 20 or so, I think I can live with that performance.
In case anyone's interested in an extreme example, when I set the limit to 145,000, the query takes a little over 14 seconds. Again, no bounding box used. Clearly, the number of points requested in the query has more effect on the performace than the number of points in the table being queried.
Andre on Nov 20
Jon: Awesome, thanks for posting the results of your tests. If you want, drop me an email and let me know what application you're working on!
Johannes de Jong on Dec 08
Thanks for the code Andre. I will use it in a application I'm building.
Bill Eisenhauer on Jan 07
FYI, the Earth's radius as reported in the post has a transposition. It should be 3963.
David Yeo on Jul 24
I am just wondering if search results will be more efficiently handled by having Javascript perform the "lat/lng tests within a specified radius" after loading the set of coordinates from the table. In doing so, the computation will be shared by each individual browser using local computing power as opposed to letting single server (mysql app) service the requests. The situation may run dire if say there are growing number of visitors to the site.
What do you think?
w on Apr 14
these days, mysql geospatial is probably a better idea since it includes caching and most 'display geographic data on the web' applications are going to be always sending similar queries ('all points in x city').