GeoFile: Everything in the Radius with PostGIS

GeoFile is a series dedicated to looking at geographical data, its features and uses. In this article, we build upon our last discussion of getting the distance between two points, and now move to getting all data points within a specified radius. On this journey, we take you from indexing your PostgreSQL database to setting up queries with ST_DWithin.

In the last GeoFile article, we looked at the differences between GEODIST and ST_Distance in Redis and PostGIS and why you might use Redis over PostGIS depending on the type of data you’re using and the distance you’re covering between points.

However, if you’re not interested in finding the distance between two points, but want to find all the locations surrounding your point of origin, then there are various functions and commands in Redis and PostGIS that will help you. We’ve covered Redis’s GEORADIUS here, so in this installment of GeoFile we’ll focus on PostGIS.

ST_DWithin in PostGIS

PostGIS allows you to search for locations within specified distances using the ST_DWithin function. This function allows you to find locations near each other or to find out if a location is within a certain distance of others.

Using a Starbucks dataset, all of the Starbucks locations in the United States were inserted into a Compose PostgreSQL database called Starbucks and a table called stores was created. Since the data only has longitude and latitude coordinates, we’ll have to create a geometry column labelled geom. To do this, we can use the PostGIS function AddGeometryColumn like this:

SELECT AddGeometryColumn('stores', 'geom', 4326, 'POINT', 2);  

All the function is doing is adding a geometry column to the stores table and calling it geom. We set the column SRID to 4326 and indicate that we are setting coordinates to "POINTS", not other geometrical shapes, and those points are set to "2" meaning they are located in a 2D space. Now, we’ll add the geometry data to the column with the following SQL query that will create geometry data from the longitude and latitude points from our table and set them to SRID to 4326:

UPDATE stores SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);  

We’ll use a geometry over a geography data type because we will confine our search to Seattle. If we were to get all the Starbucks stores from a larger space, such as a couple states, or over an entire continent, using a geography data point would be better. Here is a good discussion of the pros and cons of using geometry and geography data types.

Remember to also create an index on the geom column for faster lookups like:

CREATE INDEX idx_starbucks_stores ON stores USING gist(geom);  

After we’ve stored our data with the geometry data, we can created the VIEW below that narrowed our query to select only the Starbucks located in Seattle so we don’t have to continuously look up Seattle stores.

CREATE VIEW Seattle_Starbucks AS  
    SELECT name, address, city, state, zip, lat, lon, id, geom
    FROM Stores
    WHERE city = ‘Seattle’;

Looking on a map using OpenJUMP or QGIS, we'll get the following points. For the map, I used the OpenStreetMaps Seattle shapefile from Mapzen.

That gave us 141 Starbucks locations throughout Seattle. Now, we can find the closest Starbucks to any location we want. I chose a random point (-122.325959 longitude, 47.625138 latitude) in a residential area. Now, to get the Starbucks locations surrounding that point, we’d use ST_DWithin like:

SELECT id, name, address, geom  
FROM Seattle_Starbucks  
WHERE ST_DWithin(geom, ST_MakePoint(-122.325959,47.625138)::geography, 1000);  

This query asks for the locations of Starbucks within 1 kilometer of our point. We select the geom column from our Seattle_Starbucks view to find stores from our point of origin. We use the ST_MakePoint function to define the point of origin, and then pass in the number of meters we want our search to expand to. You'll notice that I've cast the point of origin to a geography type. That's because of how ST_DWithin works with geometry types. If you use the geometry type, the function thinks that you want all of the locations within 1000 degrees of your origin. This makes no sense and you'll get all of your locations back. However, casting to a geography allows you to set the distance to 1000 meters, or one kilometer. The result of this query gives us the location but not in any particular order:

 id   |             name              |           address            
-------+-------------------------------+------------------------------
 10481 | Metropolitan Park             | 1730 Minor Avenue  Suite 102
 10493 | E. Olive Way                  | 1600 E Olive Wy
 10497 | QFC-Seattle/Broadway Mkt #887 | 417 Broadway E
 10499 | Capitol Hill                  | 434 Broadway Ave E
 10506 | Roy St Coffee & Tea           | 700 Broadway E

On our map below, we can see our point of origin as a red point and our results as green points.

To really visualize what a kilometer looks like, you can use the ST_Buffer function that will surround a point or any geometry with a user defined buffer zone. Below, we are creating the buffer around our point of origin that’s one kilometer in diameter like:

SELECT ST_Buffer(  
    ST_MakePoint(-122.325959,47.625138)::geography, 
    1000)::geometry;

The result of the query confirms to us what the previous query selected, which is visible below on the map.

To order these results by distance from the point of origin, we’d set up the query using ST_DWithin and order the results with ST_Distance, which we’ve covered in the previous GeoFile article on distances.

SELECT id, name, address, ST_Distance(geom, ref_geom) AS distance  
FROM Seattle_Starbucks  
CROSS JOIN (SELECT ST_MakePoint(-122.325959, 47.625138)::geography AS ref_geom) AS r  
WHERE ST_DWithin(geom, ref_geom, 1000)  
ORDER BY ST_Distance(geom, ref_geom);  

This will give you the following Starbucks’s in order of nearest to farthest from your point of origin in meters.

id   |             name                |           address            |   distance  
-------+-------------------------------+------------------------------+---------------
 10503 | Roy St Coffee & Tea           | 700 Broadway E               | 368.879406565
 10497 | Capitol Hill                  | 434 Broadway Ave E           | 457.371780075
 10495 | QFC-Seattle Broadway Mkt #887 | 417 Broadway E               | 483.775129907
 10491 | E. Olive Way                  | 1600 E Olive Wy              | 644.775281102
 10482 | Metropolitan Park             | 1730 Minor Avenue Suite 102  | 965.297144038

Summing It Up

We've looked at the power of the ST_DWithin function as well as a couple others that will allow you to find locations within a set diameter. You probably use a function like this every day in your GIS application when searching for restaurants or looking for entertainment near your location. There's a lot more you can do with PostGIS, and we'll expose you to much more in our GeoFile series.

Next up, we’ll turn to GeoJSON and look at some geospatial queries in Elasticsearch and MongoDB.