GeoFile: Everything in the Radius with MongoDB Geospatial Queries

GeoFile is a series dedicated to looking at geographical data, its features, and uses. In this article, we’ll continue our discussion of getting locations within a radius, but apply it to GeoJSON documents and MongoDB. We’ll be setting up 2dsphere indices and using MongoDB’s built-in geospatial query operators.

If you’re using GeoJSON in your next project, consider using MongoDB to store your data. MongoDB provides many geospatial query operators that enable you to create queries using GeoJSON documents quickly and efficiently.

In this article, we'll be looking at three ways to write a geospatial query that gets documents from within a specified distance of a point of origin. We'll use three datasets from Fairfax County, Virginia, that are freely available on Data.gov, the United States government’s open data repository. Finally, we'll write a small ExpressJS app that serves up the data on a website using OpenStreetMap.

First, download the following datasets: Fairfax County Border, Fairfax County Libraries, and Fairfax Historical Sites.

After you've downloaded the datasets, import them into your MongoDB database, either from an application using your connection strings or from the command line using the command-line string provided in the console of your Compose deployment. To import a GeoJSON dataset into your Compose deployment, you'll want to use mongoimport like this:

mongoimport --db fairfax --collection sites --file "Historic_Sites.geojson" -u username -p password --ssl --sslCAFile cert.pem --host aws-us-west-2-portal.0.dblayer.com --port 15326 --jsonArray  

To show you the results of our geospatial queries on a map, we’ll be using LeafletJS, the JavaScript library for maps, and OpenStreetMap. The library is used to take our GeoJSON documents and project them onto OpenStreetMap. To help us get the GeoJSON documents to LeafletJS, we'll be using MongoDB's NodeJS library with ExpressJS.

All of the code for this article is included in https://github.com/compose-ex/mongodb-radius.

It’s all about the index

Before we do anything, let's start by creating indices. Indices are fundamental to geospatial querying and most of the geospatial query operations we’ll be running depend on having a geospatial index set up. But, we first have to figure out whether our dataset uses a flat or spherical projection system. To determine that, look at our GeoFile article on geospatial reference systems.

The MongoDB documentation on geospatial indexing goes into depth about when to use either a 2d or 2dsphere index. However, if we’re using GeoJSON and MongoDB 3.2.x, then we should be using the 2dsphere index since the 2d index does not support GeoJSON geometry objects.

To create a geospatial index, we can connect to our MongoDB deployment and run the following command:

db.collection.createIndex({'geometry': '2dsphere'});  

Replace collection with the collection you’re creating the index on.

Once you run that, you’ll receive a confirmation that the index has been successfully created like:

mongos> db.sites.createIndex({"geometry": "2dsphere"})  
{
    "raw" : {
        "set-5817b780a5342b0016000005/10.201.200.39:27017,10.201.200.40:27017" : {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 1,
            "numIndexesAfter" : 2,
            "ok" : 1,
            "$gleStats" : {
                "lastOpTime" : Timestamp(1479247707, 1),
                "electionId" : ObjectId("7fffffff0000000000000001")
            }
        }
    },
    "ok" : 1
}
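Since both the libraries and sites collections hold GeoJSON in a geometry field, the same index could be created on each in one pass. The following is a minimal sketch for the MongoDB shell, not part of the original code: the helper name ensureGeoIndexes is hypothetical, and it assumes only the shell's db.getCollection() and createIndex() calls.

```javascript
// Hypothetical helper: create a 2dsphere index on the "geometry" field
// of every collection named in collectionNames.
// `db` is the database handle that the MongoDB shell provides.
function ensureGeoIndexes(db, collectionNames) {
    return collectionNames.map(function (name) {
        // createIndex does nothing if an identical index already exists
        return db.getCollection(name).createIndex({ geometry: "2dsphere" });
    });
}

// In the shell, this might be called as:
// ensureGeoIndexes(db, ["libraries", "sites"]);
```

The function returns the confirmation document from each createIndex call, so the output can be inspected in the same way as the single-collection example above.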

Making geospatial queries

Now, let’s take a look at what some of the queries and results will look like via the MongoDB shell. If you’re just starting out with GeoJSON and writing geospatial queries, the MongoDB shell is a useful tool for running test queries.

There are a number of geospatial query operators included in MongoDB, and we will not cover them all. But familiarize yourself with the different queries, how to combine operators, and whether they require geospatial indices.

GeoJSON documents

To set up our first query, we'll have to know the structure of our data. This is done by using the findOne method with a collection:

db.libraries.findOne();  

This query will provide us with a result that'll look like:

{
    "_id" : ObjectId("582122d4a98c5c6f7b955ebf"),
    "type" : "Feature",
    "properties" : {
        "OBJECTID" : 9,
        "TYPE" : "LIB",
        "DESCRIPTION" : "GEORGE MASON REGIONAL LIBRARY",
        …
        "Editor" : "FairfaxCounty"
    },
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            -77.18621789486043,
            38.82741811639861
        ]
    }
}

We’ve named the collection containing historical locations sites; it has the same GeoJSON structure as the libraries collection.

You’ll notice that each document contains type, properties, and geometry. The type: "Feature" just indicates that the document includes geometry and properties JSON objects. The properties object includes information about the geometry, while geometry includes the type of geometry, such as Point, Polygon, MultiPolygon, etc., and coordinates that can be an array containing longitude and latitude coordinates, or an array of arrays of coordinates. An example of the MultiPolygon type geometry is our border collection which looks like:

{
    "_id" : ObjectId("582a09bca98c5c6f7b964181"),
    "type" : "Feature",
    "properties" : {
        "OBJECTID" : 4,
        "NAME" : "FAIRFAX COUNTY",
        "CREATED_USER" : "FairfaxCounty",
        "CREATED_DATE" : "2016-11-12T06:02:42.655Z",
        "LAST_EDITED_USER" : "FairfaxCounty",
        "LAST_EDITED_DATE" : "2016-11-12T06:02:42.655Z"
    },
    "geometry" : {
        "type" : "MultiPolygon",
        "coordinates" : [
            [
                [
                    [
                        -77.308791849374,
                        38.8428552213637
                    ],
                    [
                        -77.3103917301592,
                        38.843354741431
                    ],
                    …
                    [
                        -77.3031623200181,
                        38.8328038878956
                    ]
                ]
            ]
        ]
    }
}
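One detail that is easy to miss in the documents above is that GeoJSON stores each position as longitude first, then latitude. As a rough illustration, a small check like the one below could be run against Point geometries before they are imported. The function name isLngLatPoint is an assumption made for this sketch, and it only catches values that are out of range, not every swapped pair.

```javascript
// Hypothetical sanity check for a GeoJSON Point geometry.
// GeoJSON positions are [longitude, latitude], in that order.
function isLngLatPoint(geometry) {
    if (!geometry || geometry.type !== "Point" || !Array.isArray(geometry.coordinates)) {
        return false;
    }
    const [lng, lat] = geometry.coordinates;
    return Number.isFinite(lng) && Number.isFinite(lat) &&
        lng >= -180 && lng <= 180 &&
        lat >= -90 && lat <= 90;
}

// The library document shown earlier passes the check
console.log(isLngLatPoint({
    type: "Point",
    coordinates: [-77.18621789486043, 38.82741811639861]
})); // true

// A latitude outside -90..90 is rejected
console.log(isLngLatPoint({ type: "Point", coordinates: [-77.2, 95] })); // false
```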

For more information about GeoJSON, here’s an article that explains everything you need to know about it in a very accessible way. If you want to take a deep dive into GeoJSON, here’s the specification.

Point of origin

After setting up our indices, we'll need to select a point of origin for our radius query. To do this, we’ll select a library from our library dataset and assign the result to a variable, so that its geometry field can be used when building our geospatial query. The query will look something like this:

var library = db.libraries.findOne({"properties.DESCRIPTION": "THOMAS JEFFERSON LIBRARY"}, {_id: 0, "geometry": 1});  

For our first query, we’ve selected a library named “THOMAS JEFFERSON LIBRARY”. The query will look through our library collection for the name in the properties.DESCRIPTION field. Since we set up a projection, once the document is found, it will return only the geometry field for the library. This is stored in the library variable:

{
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            -77.20176584039915,
            38.86543182000657
        ]
    }
}

Three ways to get documents within a radius

Now that we have a point of origin, we can start searching for locations surrounding it. To make this a little more interesting, we'll cover three ways to search for historical sites surrounding the Thomas Jefferson Library in Fairfax, Virginia.

$geoWithin and $centerSphere

We’ll start by setting up a query that uses $geoWithin, which selects documents that are located within a specified geometry.

Since we are looking for various historical sites next to our chosen library, we’ll query over the sites collection. The historical sites' geometry field will be used in our query filter so that the historical sites' geometries are compared to our library's geometry.

db.sites.find({"geometry": ... })  

Next, we’ll ask the query to get all the documents within a five-mile radius.

{$geoWithin: {$centerSphere: [library.geometry.coordinates, 5/3963.2]}}

$geoWithin selects all the points within a geometry, while $centerSphere defines a circular geometry around the point of origin. $centerSphere accepts an array whose first item is the coordinates of the center, taken here from our library's geometry field, and whose second item is the radius of the circle in radians. Since we want historical sites that are at most five miles from the library, we divide 5 by the equatorial radius of the earth, 3963.2 miles, to get the radius in radians.
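To make the conversion explicit, here is a small sketch of how the radius and the filter could be built from a distance in miles. The helper names milesToRadians and withinMilesFilter are hypothetical; the 3963.2 figure is the same equatorial radius used above.

```javascript
// Hypothetical helper: convert a distance in miles to radians for $centerSphere,
// using the equatorial radius of the earth in miles (3963.2 by default).
function milesToRadians(miles, earthRadiusMiles = 3963.2) {
    return miles / earthRadiusMiles;
}

// Hypothetical helper: build a $geoWithin filter from a GeoJSON Point
// and a radius given in miles.
function withinMilesFilter(point, miles) {
    return {
        geometry: {
            $geoWithin: {
                $centerSphere: [point.coordinates, milesToRadians(miles)]
            }
        }
    };
}

console.log(milesToRadians(5)); // roughly 0.00126 radians
```

With the library variable from earlier, the filter could then be passed to find, for example db.sites.find(withinMilesFilter(library.geometry, 5)).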

The full query will look like the following:

db.sites.find({"geometry":  
   {$geoWithin: 
     {$centerSphere: [library.geometry.coordinates, 5/3963.2]}
   }
  }, {_id: 0, "properties.DESCRIPTION": 1});

This gives us the names of the historical sites surrounding our library, but not sorted in any order.

{ "properties" : { "DESCRIPTION" : "HUNTER HOUSE" } }
{ "properties" : { "DESCRIPTION" : "WAKEFIELD CHAPEL" } }
{ "properties" : { "DESCRIPTION" : "GREEN SPRING FARM" } }
{ "properties" : { "DESCRIPTION" : "CLARK HOUSE" } }

$nearSphere, $geometry and $maxDistance

The second way to make the same query is to use $nearSphere, which requires a geospatial index. This query operator will produce an ordered list of historical sites from nearest to farthest from the library:

db.sites.find({"geometry":  
   {$nearSphere: 
     {$geometry: library.geometry, $maxDistance: 8046.72}
   }
  }, {_id: 0, "properties.DESCRIPTION": 1});

In this query, we’re using the $geometry operator, which takes the entire GeoJSON geometry field from the library. $maxDistance is an optional value, specified in meters when the point of origin is a GeoJSON point, that limits the results to locations within that distance of the point of origin. Since we're looking for historical sites within five miles of the library, we'll enter five miles in meters, 8046.72. This will give us the same results, but ordered from nearest to farthest from our library.

{ "properties" : { "DESCRIPTION" : "CLARK HOUSE" } }
{ "properties" : { "DESCRIPTION" : "WAKEFIELD CHAPEL" } }
{ "properties" : { "DESCRIPTION" : "GREEN SPRING FARM" } }
{ "properties" : { "DESCRIPTION" : "HUNTER HOUSE" } }
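The 8046.72 figure comes from multiplying five miles by 1609.344 meters per mile. As a sketch, the conversion and the filter could be wrapped in helpers like these; milesToMeters and nearSphereFilter are names invented for this example.

```javascript
// Hypothetical helper: convert miles to meters (1 mile = 1609.344 meters)
function milesToMeters(miles) {
    return miles * 1609.344;
}

// Hypothetical helper: build a $nearSphere filter from a GeoJSON Point
// and a maximum distance given in miles.
function nearSphereFilter(point, miles) {
    return {
        geometry: {
            $nearSphere: {
                $geometry: point,
                $maxDistance: milesToMeters(miles)
            }
        }
    };
}

console.log(milesToMeters(5).toFixed(2)); // "8046.72"
```

Used with the library variable from earlier, db.sites.find(nearSphereFilter(library.geometry, 5)) would be equivalent to the query shown above.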

$geoNear aggregation

The last way to make the query is to use MongoDB’s $geoNear aggregation. Results from $geoNear will give you GeoJSON documents from nearest to farthest from a point of origin like $nearSphere, but you can also tell the query to provide you with the distance of each historical site from the library. This is how we set up the query:

db.sites.aggregate([  
    { $geoNear: 
        { 
            near: library.geometry, 
            distanceField: "dist.calculated", 
            maxDistance: 8046.72,  
            spherical: true, 
            distanceMultiplier: 1/1609.344
        }
    },
    { $project: 
        { 
            _id: 0, 
            "properties.DESCRIPTION": 1, 
            "dist.calculated": 1 
        }
    }
]);

The options we set up are similar to those in the other queries we've covered. near takes the geometry field from our library, and spherical should be set to true since we have a 2dsphere index. distanceField names the output field that will contain the calculated distance of each historical site from the library, while distanceMultiplier is a factor that every returned distance is multiplied by. Because near is a GeoJSON point, maxDistance is specified in meters and distances are calculated in meters, so we multiply by 1/1609.344 to convert the results to miles.

The final aggregation stage, $project, allows us to select the fields that are returned by $geoNear. In this case, we’ve selected the fields that give us the names of the historical sites within the radius and their distance from the library, which produces the following:

{ "properties" : { "DESCRIPTION" : "CLARK HOUSE" }, "dist" : { "calculated" : 3.1988623437242323 } }
{ "properties" : { "DESCRIPTION" : "WAKEFIELD CHAPEL" }, "dist" : { "calculated" : 3.3132885303175272 } }
{ "properties" : { "DESCRIPTION" : "GREEN SPRING FARM" }, "dist" : { "calculated" : 3.7410665986589016 } }
{ "properties" : { "DESCRIPTION" : "HUNTER HOUSE" }, "dist" : { "calculated" : 4.156427801047941 } }
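To get a feel for what the values in dist.calculated represent, a great-circle distance can be computed by hand. The sketch below uses the haversine formula with the same 3963.2-mile earth radius used earlier; it is an independent approximation, not MongoDB's own implementation, so its results should be close to but not necessarily identical with what $geoNear reports. The function name haversineMiles is hypothetical, and the two points are the library coordinates that appear earlier in this article.

```javascript
// Great-circle distance in miles between two GeoJSON Points (haversine formula).
// Coordinates are [longitude, latitude] in degrees.
function haversineMiles(a, b, earthRadiusMiles = 3963.2) {
    const toRad = (deg) => deg * Math.PI / 180;
    const [lng1, lat1] = a.coordinates;
    const [lng2, lat2] = b.coordinates;
    const sinHalfLat = Math.sin(toRad(lat2 - lat1) / 2);
    const sinHalfLng = Math.sin(toRad(lng2 - lng1) / 2);
    const h = sinHalfLat * sinHalfLat +
        Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * sinHalfLng * sinHalfLng;
    return 2 * earthRadiusMiles * Math.asin(Math.sqrt(h));
}

const thomasJefferson = { type: "Point", coordinates: [-77.20176584039915, 38.86543182000657] };
const georgeMason = { type: "Point", coordinates: [-77.18621789486043, 38.82741811639861] };

// Distance between the two libraries, which comes out at a little under three miles
console.log(haversineMiles(thomasJefferson, georgeMason).toFixed(2) + " miles");
```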

Now that we've got our queries working, we want to see how they look on a map. To do this, we'll use NodeJS.

Translating into NodeJS

Now that we’ve tested out the queries in the MongoDB shell, it’s a fairly straightforward process to add them, for example, to an ExpressJS application so that the data can be transported to LeafletJS and shown on a map in the browser.

First, make sure that NodeJS is installed, then install ExpressJS and MongoDB’s NodeJS driver and save them to your project's package.json file:

npm install express mongodb --save  

We'll then require express and mongodb and insert the connection strings provided in the Compose MongoDB console. Here's an example of how the code might look:

const express = require('express');  
const app = express();  
const client = require('mongodb').MongoClient;  
const URI = "mongodb://username:password@aws-us-west-2-portal.0.dblayer.com:15326/fairfax?ssl=true, mongodb://username:password@aws-us-west-2-portal.1.dblayer.com:15326/fairfax?ssl=true";  
const options = {  
    mongos: {
        ssl: true,
        sslValidate: false
    }
};
const port = 8080;  

Now, let’s set up a connection pool so that we can reuse the MongoDB connection when building URI endpoints to serve our GeoJSON data.

let db;  
client.connect(URI, options, (err, database) => {  
    if (err) throw err;
    db = database;
    app.listen(port, () => {
        console.log("App listening on port %s", port);
    });
});

Then set up the ExpressJS Router() middleware, which will make it easy to set up URI endpoints with HTTP method routes:

const router = express.Router();  
app.use('/api', router);  

Now that we have the basics of an ExpressJS application set up, we can start translating the MongoDB geospatial queries that we tested out into JavaScript functions. Each function below corresponds to one of the three methods we discussed to get historical sites within a 5-mile radius of our library.

What you'll notice is that the queries translate from the MongoDB shell to our Express application nicely. The only differences are how you select the collection, using collection(), and how you convert the returned documents to an array, using toArray().

$geoWithin and $centerSphere

function findNearPlacesGeoWithin(placeName, radius, collection1, collection2, res) {
    // look up the point of origin by name in collection1
    db.collection(collection1).find({"properties.DESCRIPTION": placeName}, {_id: 0, "properties.DESCRIPTION": 1, "geometry": 1})
    .toArray((err, places) => {
        if (err) throw err;
        db.collection(collection2).find({
            "geometry": {
                $geoWithin: {
                    $centerSphere: [places[0].geometry.coordinates, radius/3963.2] // convert the radius from miles to radians
                }
            }
        }, {_id: 0}).toArray((err, docs) => {
            if (err) throw err;
            res.json(docs);
        });
    });
}

$nearSphere, $geometry, and $maxDistance

function findNearPlacesGeoNear(placeName, distMeters, collection1, collection2, res) {
    // look up the point of origin by name in collection1
    db.collection(collection1).find({"properties.DESCRIPTION": placeName}, {_id: 0, "properties.DESCRIPTION": 1, "geometry": 1})
    .toArray((err, places) => {
        if (err) throw err;
        db.collection(collection2).find({
            "geometry": {
                $nearSphere: {
                    $geometry: places[0].geometry,
                    $maxDistance: distMeters // maximum distance in meters
                }
            }
        }, {_id: 0}).toArray((err, docs) => {
            if (err) throw err;
            res.json(docs);
        });
    });
}

$geoNear aggregation

function findNearPlacesAgg(placeName, distMeters, collection1, collection2, res) {
    // look up the point of origin by name in collection1
    db.collection(collection1).find({"properties.DESCRIPTION": placeName}, {_id: 0, "properties.DESCRIPTION": 1, "geometry": 1})
    .toArray((err, places) => {
        if (err) throw err;
        db.collection(collection2).aggregate([
            {$geoNear: {
                near: places[0].geometry,
                maxDistance: distMeters,
                spherical: true,
                distanceField: "dist.calculated",
                distanceMultiplier: 1/1609.344 // convert distances from meters to miles
            }},
            {$project: {
                _id: 0,
                "type": 1,
                "properties.DESCRIPTION": 1,
                "geometry": 1,
                "dist.calculated": 1 // keep the calculated distance in the output
            }}
        ]).toArray((err, docs) => {
            if (err) throw err;
            res.json(docs);
        });
    });
}

Using any one of these functions will give us the same results, except that $geoNear will also provide us with the distance of each historical site from the library. Once these functions have been made, just drop one into a custom route and get the returned GeoJSON documents that satisfy the query:

router.get('/radius', (req, res, next) => {  
    findNearPlacesAgg("THOMAS JEFFERSON LIBRARY", 8046.72, "libraries", "sites", res);
});
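The route above responds with a bare array of Feature documents. Some GeoJSON consumers expect a single FeatureCollection object instead, so as an optional sketch the array could be wrapped before it is sent. The helper name toFeatureCollection is hypothetical and is not part of the original application.

```javascript
// Hypothetical helper: wrap an array of Feature-like documents
// (as returned by toArray) in a GeoJSON FeatureCollection.
function toFeatureCollection(docs) {
    return {
        type: "FeatureCollection",
        features: docs.map((doc) => ({
            type: "Feature",
            properties: doc.properties || {},
            geometry: doc.geometry
        }))
    };
}

// Inside one of the query callbacks this could replace res.json(docs):
// res.json(toFeatureCollection(docs));
```

Note that this sketch copies only properties and geometry, so an extra field such as the distance added by $geoNear would need to be carried over separately if it is wanted on the map.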

Using LeafletJS and OpenStreetMap, you should see the historical sites returned by the query plotted on a map in your browser.

Summing it up

We’ve looked at the various ways to create a query using MongoDB’s geospatial query operators by finding the historical sites within a 5-mile radius of the Thomas Jefferson Library. The three ways that we covered will help you on your way to discovering how to use other MongoDB geospatial operators, and how to translate them into queries that can be used in your next NodeJS application.

