Elasticsearch Security Update - Groovy Scripting Dropped

Last month, Elasticsearch disclosed a security vulnerability in the database's Groovy dynamic scripting and in the sandbox designed to keep dynamically loaded remote scripts under control. The fix for this vulnerability involves permanently disabling Groovy dynamic scripting, which we know is a feature that some of our Elasticsearch users take advantage of.

Since Elasticsearch 1.3, sandboxed languages have been available for dynamic scripting. We allowed this scripting to run because Compose users' exposure to malicious exploitation of the vulnerability is reduced. All our deployments run on a VPN which requires either authentication over HTTPS or SSH credentials, so it was less likely that a malicious third-party query could make it into the system.

We decided to use the time that afforded us to see if we could provide enough added mitigation to give users more time to migrate away from Groovy scripting. But over the weekend, an exploit for the vulnerability appeared in the wild, so we have decided to immediately upgrade all users to the latest Elasticsearch versions. The exploit is still mitigated by our use of authentication with HTTPS or SSH credentials, but the change in risk profile means we need to move now to keep your data and our systems in the most secure state possible.

What's happening

We are currently contacting all users to tell them when we plan to automatically upgrade all their Elasticsearch instances from 1.3.x and 1.4.x to 1.4.4. If you expect to be unaffected by the disabling of Groovy Dynamic Scripting, you can either wait for the proposed update time or take control of your update by upgrading in the Compose Dashboard.

If you do use Groovy Dynamic Scripting, then please respond to our mail as soon as possible, or preempt the mail and drop a message to support@compose.io so that we can discuss your options for the future.

What broke

Dynamic scripting allowed Groovy scripts (and scripts in other languages) to be included in Elasticsearch queries. In Elasticsearch 1.3 and later, the embedded script would be run within a sandbox on the server as it executed the query. The vulnerability meant, though, that Groovy scripts could escape the sandbox and run shell commands on the server.
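
To make "included in Elasticsearch queries" concrete, here is a minimal sketch of the kind of search body involved. It only builds and prints the JSON and does not contact a server. The field name price, the output name doubled_price and the helper build_scripted_search are made up for illustration, and the script_fields layout with a "lang" key is our recollection of the 1.x search API, so check it against the documentation for your version.

```python
import json


def build_scripted_search(script, lang="groovy"):
    """Return a search body that asks the server to evaluate `script` for each hit.

    Hypothetical helper for illustration only; nothing is sent to a server.
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            # "doubled_price" is an invented output field name for this sketch.
            "doubled_price": {"lang": lang, "script": script},
        },
    }


# The script text travels inside the request and is evaluated on the server,
# which is why a sandbox escape turns a query into a way to run commands.
body = build_scripted_search("doc['price'].value * 2")
print(json.dumps(body, indent=2))
```

Once dynamic Groovy scripting is disabled, a request carrying a script like this would be expected to be rejected rather than executed.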

That's not good, especially as, out of the box, there's no control over who issues requests to an Elasticsearch server. That's why Compose gives HTTPS and SSH authenticated access options and why we run each Elasticsearch deployment on its own private VLAN: to ensure that no one gets in without permission and, even if they do, to limit their access to the database only.

After talking with the Groovy developers, Elasticsearch have decided that Groovy will never be sufficiently safe in a sandbox and have removed it from the list of sandboxed languages. That leaves only the still-labelled-experimental Lucene expression scripts as an option for sandboxed dynamic scripting.

Going forward

It seems unlikely that Elasticsearch and Groovy developers could work out a way to sandbox Groovy effectively in the future. Groovy isn't being taken out of Elasticsearch, though. There are some scripting options still available, but they work by manipulating files outside the server process, something we don't support at Compose for reasons of deployment consistency and security.

For Compose users, the much more limited Lucene Expressions are currently the only option, and given the record so far on dynamic scripting sandboxes, we are still reviewing whether that option will be safe over the long term. Elasticsearch is looking into extending the Expressions system (currently it doesn't support loops and can only work with numeric fields), but there's no roadmap for that development, which Elasticsearch describes as a "long term project".
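
If you are unsure whether your application relies on Groovy, one rough way to find out is to scan the request bodies it sends for scripts that are not written as Lucene expressions. The sketch below is only an assumption about how such an audit might look: find_non_expression_scripts is a hypothetical helper, it assumes scripts appear as a string under a "script" key with an optional sibling "lang" key, and it treats a missing "lang" as needing review because the server default would apply.

```python
def find_non_expression_scripts(node, path=""):
    """Collect (path, lang) for every inline script not marked as an expression.

    Walks nested dicts and lists. A lang of None means no language was given,
    so the server's default scripting language would be used.
    """
    found = []
    if isinstance(node, dict):
        if isinstance(node.get("script"), str):
            lang = node.get("lang")
            if lang != "expression":
                found.append((path or "/", lang))
        for key, value in node.items():
            found.extend(find_non_expression_scripts(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_non_expression_scripts(item, f"{path}/{index}"))
    return found


# Example body with one Groovy script and one numeric-only expression script.
# The field names here are invented for the sketch.
search_body = {
    "query": {"match_all": {}},
    "script_fields": {
        "doubled": {"lang": "groovy", "script": "doc['price'].value * 2"},
        "taxed": {"lang": "expression", "script": "doc['price'].value * 1.2"},
    },
}

flagged = find_non_expression_scripts(search_body)
for location, lang in flagged:
    print(f"{location}: lang={lang}")
```

Anything this flags is a candidate for discussion with support before the upgrade. A script that passes the check still has to fit within what Expressions can do, which means numeric fields only and no loops.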