GeoFile is a series dedicated to looking at geographical data, its features and uses. In this article, we build upon our last discussion of getting the distance between two points, and now move to getting all data points within a specified radius. On this journey, we take you from indexing your PostgreSQL database to setting up queries with
In the last GeoFile article, we looked at the differences between
ST_Distance in Redis and PostGIS and why you might use Redis over PostGIS depending on the type of data you’re using and the distance you’re covering between points.
However, if you’re not interested in finding the distance between two points, but want to find all the locations surrounding your point of origin, then there are various functions and commands in Redis and PostGIS that will help you. We’ve covered Redis’s
GEORADIUS here, so in this installment of GeoFile we’ll focus on PostGIS.
ST_DWithin in PostGIS
PostGIS allows you to search for locations within specified distances using the
ST_DWithin function. This function allows you to find locations near each other or to find out if a location is within a certain distance of others.
Using a Starbucks dataset, all of the Starbucks locations in the United States were inserted into a Compose PostgreSQL database called Starbucks and a table called stores was created. Since the data only has longitude and latitude coordinates, we’ll have to create a geometry column labelled
geom. To do this, we can use the PostGIS function
AddGeometryColumn like this:
SELECT AddGeometryColumn('stores', 'geom', 4326, 'POINT', 2);
All the function is doing is adding a geometry column to the stores table and calling it
geom. We set the column SRID to 4326 and indicate that we are setting coordinates to "POINTS", not other geometrical shapes, and those points are set to "2" meaning they are located in a 2D space. Now, we’ll add the geometry data to the column with the following SQL query that will create geometry data from the longitude and latitude points from our table and set them to SRID to 4326:
UPDATE stores SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);
We’ll use a geometry over a geography data type because we will confine our search to Seattle. If we were to get all the Starbucks stores from a larger space, such as a couple states, or over an entire continent, using a geography data point would be better. Here is a good discussion of the pros and cons of using geometry and geography data types.
Remember to also create an index on the
geom column for faster lookups like:
CREATE INDEX idx_starbucks_stores ON stores USING gist(geom);
After we’ve stored our data with the geometry data, we can created the VIEW below that narrowed our query to select only the Starbucks located in Seattle so we don’t have to continuously look up Seattle stores.
CREATE VIEW Seattle_Starbucks AS SELECT name, address, city, state, zip, lat, lon, id, geom FROM Stores WHERE city = ‘Seattle’;
Looking on a map using OpenJUMP or QGIS, we'll get the following points. For the map, I used the OpenStreetMaps Seattle shapefile from Mapzen.
That gave us 141 Starbucks locations throughout Seattle. Now, we can find the closest Starbucks to any location we want. I chose a random point (-122.325959 longitude, 47.625138 latitude) in a residential area. Now, to get the Starbucks locations surrounding that point, we’d use
SELECT id, name, address, geom FROM Seattle_Starbucks WHERE ST_DWithin(geom, ST_MakePoint(-122.325959,47.625138)::geography, 1000);
This query asks for the locations of Starbucks within 1 kilometer of our point. We select the
geom column from our
Seattle_Starbucks view to find stores from our point of origin. We use the
ST_MakePoint function to define the point of origin, and then pass in the number of meters we want our search to expand to. You'll notice that I've cast the point of origin to a geography type. That's because of how
ST_DWithin works with geometry types. If you use the geometry type, the function thinks that you want all of the locations within 1000 degrees of your origin. This makes no sense and you'll get all of your locations back. However, casting to a geography allows you to set the distance to 1000 meters, or one kilometer. The result of this query gives us the location but not in any particular order:
id | name | address -------+-------------------------------+------------------------------ 10481 | Metropolitan Park | 1730 Minor Avenue Suite 102 10493 | E. Olive Way | 1600 E Olive Wy 10497 | QFC-Seattle/Broadway Mkt #887 | 417 Broadway E 10499 | Capitol Hill | 434 Broadway Ave E 10506 | Roy St Coffee & Tea | 700 Broadway E
On our map below, we can see our point of origin as a red point and our results as green points.
To really visualize what a kilometer looks like, you can use the
ST_Buffer function that will surround a point or any geometry with a user defined buffer zone. Below, we are creating the buffer around our point of origin that’s one kilometer in diameter like:
SELECT ST_Buffer( ST_MakePoint(-122.325959,47.625138)::geography, 1000)::geometry;
The result of the query confirms to us what the previous query selected, which is visible below on the map.
To order these results by distance from the point of origin, we’d set up the query using
ST_DWithin and order the results with
ST_Distance, which we’ve covered in the previous GeoFile article on distances.
SELECT id, name, address, ST_Distance(geom, ref_geom) AS distance FROM Seattle_Starbucks CROSS JOIN (SELECT ST_MakePoint(-122.325959, 47.625138)::geography AS ref_geom) AS r WHERE ST_DWithin(geom, ref_geom, 1000) ORDER BY ST_Distance(geom, ref_geom);
This will give you the following Starbucks’s in order of nearest to farthest from your point of origin in meters.
id | name | address | distance -------+-------------------------------+------------------------------+--------------- 10503 | Roy St Coffee & Tea | 700 Broadway E | 368.879406565 10497 | Capitol Hill | 434 Broadway Ave E | 457.371780075 10495 | QFC-Seattle Broadway Mkt #887 | 417 Broadway E | 483.775129907 10491 | E. Olive Way | 1600 E Olive Wy | 644.775281102 10482 | Metropolitan Park | 1730 Minor Avenue Suite 102 | 965.297144038
Summing It Up
We've looked at the power of the
ST_DWithin function as well as a couple others that will allow you to find locations within a set diameter. You probably use a function like this every day in your GIS application when searching for restaurants or looking for entertainment near your location. There's a lot more you can do with PostGIS, and we'll expose you to much more in our GeoFile series.
Next up, we’ll turn to GeoJSON and look at some geospatial queries in Elasticsearch and MongoDB.