Keeping Elasticsearch Secure - the Compose Network

People have asked us "You run Elasticsearch databases, how do you secure them?" It's a good question to ask because Elasticsearch doesn't have any authentication mechanisms built in, nor users, roles or anything that resembles them. Elasticsearch was built on the assumption that it would not be exposed to a shared corporate network, let alone the internet. It was built to run on a private network, importing data and delivering search results to systems connected to that private network.

If you are running your own local Elasticsearch, you may be able to work around that requirement or be prepared to take on the risk. Unfortunately, too many developers carry that risk over onto their cloud deployments as they scale up, leaving themselves vulnerable. We engineer our systems to embrace best practice and the model that all databases should be run on private networks, especially when they live in the cloud. So how do we do it?

From the process up


It starts at the process level in our Elasticsearch clusters. Each Elasticsearch process runs in an isolated container on its host system and is connected to a dedicated VLAN. Our VLANs run on top of a software-defined network (SDN) that lets us create the tightly controlled, isolated network architecture we need. To implement that SDN, we use Open vSwitch, an open source, production-quality, multilayer virtual switch.

Each of the Elasticsearch containers has a private interface to a VLAN, which allows it to communicate with the other containers in the cluster. None of them has a public interface. It's not just Elasticsearch containers on the VLAN though. We also have "entry point" containers, which are designed purely to handle incoming access. These entry points have separate private and public interfaces and are responsible for managing the transit of communications between the two.
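To make that rule concrete, here is a small illustrative model of the layout described above. It is only a sketch: the `Container` class, the role names and the container names are hypothetical and are not part of the Compose platform. It simply encodes the invariant that every container sits on the VLAN and only entry points may also have a public interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    # All fields are hypothetical, chosen only to mirror the prose above.
    name: str
    role: str            # "elasticsearch" or "entry_point"
    private_iface: bool  # attached to the cluster's VLAN
    public_iface: bool   # reachable from outside the VLAN


def exposure_violations(containers):
    """Return the names of containers that break the isolation rule."""
    bad = []
    for c in containers:
        if not c.private_iface:
            bad.append(c.name)  # every container must sit on the VLAN
        elif c.public_iface and c.role != "entry_point":
            bad.append(c.name)  # only entry points may face outward
    return bad


cluster = [
    Container("es-node-1", "elasticsearch", True, False),
    Container("es-node-2", "elasticsearch", True, False),
    Container("haproxy-1", "entry_point", True, True),
]
print(exposure_violations(cluster))
```

A database container that was accidentally given a public interface would show up in the returned list.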

Currently we are in production with a HAProxy entry point container. More often used solely for load balancing, HAProxy can also enforce authentication on incoming HTTPS connections. That's why, when you get your Compose Elasticsearch set up, you can create users. Those users are administered by the HAProxy entry point. Once a request is authenticated there, it is forwarded to the appropriate node in the cluster. Your connection is secured with HTTPS, authenticated by HAProxy and then sent on into the already private network where the cluster lives. For many uses of Elasticsearch that gives sufficient access and control.
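From the client side, talking to a cluster behind such an entry point might look something like the sketch below. It assumes the proxy accepts HTTP Basic credentials over HTTPS, which the description above does not spell out. The host name, port, user, password and index are placeholders, and `build_search_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request


def build_search_request(base_url, user, password, index, match):
    """Build an authenticated search request (hypothetical helper)."""
    # Assumption: the entry point checks HTTP Basic credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"query": {"match": match}}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}/_search", data=body, method="POST"
    )
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    return req


# Placeholder values; substitute the details of a real deployment.
req = build_search_request(
    "https://example-cluster.invalid:10000",
    "search_user",
    "s3cret",
    "logs",
    {"message": "timeout"},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Because the URL is `https://`, the credentials travel inside the encrypted connection rather than in the clear.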

Beyond proxies

We aren't resting yet. One of the popular uses of Elasticsearch is with Logstash, the event and log management system. Logstash ships logs in from any source, then uses Elasticsearch to power its parsing, indexing, timestamping and, of course, search. But it connects using the Transport protocol, which is typically used for internal communications between the nodes of a cluster. It does this to be more efficient when pushing data into the search cluster. There's only one problem: that internal protocol can't pass through our HAProxy entry point, as that's built for REST protocol users.

That is why we currently have an SSH-Tunnel entry point in private beta. This allows customers to create a secure tunnel into the private network from their own network so they can have authenticated access for the Transport protocol. While this is being tested with Elasticsearch, we are also considering it as a potential entry point for our MongoDB deployments. We are able to do that because we built the Compose platform to be reusable.
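The details of the beta entry point aren't covered here, so the following is only a generic sketch of the technique: ordinary OpenSSH local port forwarding, assembled as a command line. The gateway host, user and private address are placeholders, and `build_tunnel_command` is a hypothetical helper. Port 9300 is Elasticsearch's default Transport port.

```python
import shlex


def build_tunnel_command(gateway_host, gateway_user, target_host,
                         target_port=9300, local_port=9300, ssh_port=22):
    """Build an ssh command that forwards a local port into a private network."""
    # -L local_port:target_host:target_port forwards connections made to the
    # local port, through the SSH gateway, on to the target host and port.
    forward = f"{local_port}:{target_host}:{target_port}"
    # -N tells ssh not to run a remote command; it only forwards ports.
    return ["ssh", "-N", "-p", str(ssh_port), "-L", forward,
            f"{gateway_user}@{gateway_host}"]


# Placeholder values; a real deployment would supply its own.
cmd = build_tunnel_command("tunnel.example.invalid", "compose", "10.0.0.5")
print(shlex.join(cmd))
```

Running the resulting command (for example with `subprocess.Popen(cmd)`) would open the tunnel. A Transport client such as Logstash could then be pointed at `localhost:9300`, with SSH providing the authentication and encryption.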

Forward and forwarding

The Compose architecture for Elasticsearch is the same architecture we have developed to run MongoDB safely and securely. Any database running in the cloud should be running on a clean, managed private network and access to that network should be tightly controlled through authenticating, encrypting, hardened entry points. It isn't a hand-configured deployment either; our networking deployment goes hand in hand with our automatic database deployment.

As we continue to develop the Compose architecture, with our current and future database offerings, we won't deviate from delivering that safe environment for your data to work in. We're already looking at future entry point containers, such as VPNs, and how we can give you more control over those entry points.