We recently sat down with Avi Kivity, CTO of ScyllaDB, to talk about Scylla, its roots, how it approaches Cassandra compatibility, and where Scylla is going from here.
Dj Walker-Morgan: Hi Avi, as you know, at Compose we are just rolling out Scylla on Compose Hosted and Enterprise.
Avi Kivity: Yes, I'm really happy that Compose is offering this way for users to use Scylla easily without the hassle of setting it up and provisioning systems. I think it's great for Compose and it's great for us -- I'm looking forward to seeing many users run on top of Compose.
Dj: For people who don't know ScyllaDB, could you give us some of the history?
Avi: All right. So we were working on a different project for improving the performance of applications, and one of the targets was running Java workloads like Cassandra. We tried to measure the project's acceleration and see what we could improve, but we found that Cassandra was fighting with itself most of the time. It would be very hard to accelerate because most of the inefficiency was in Cassandra itself. Coming from a high-throughput and storage background, we thought that we could do much better. So we tried it and it worked very well. We were able to achieve a 10x throughput increase on wider machines, which was more than we expected.
Cassandra and beyond
Dj: One of the things that people find notable is that you're going for very exact compatibility with Cassandra. Is that a difficult thing to do or is it something that is just coming naturally from the processes that you use to develop it?
Avi: It is not very difficult. Of course, sometimes we have to reverse engineer some Cassandra code in order to determine the exact guarantees and the exact APIs. For example, the SSTable data file format is not documented, so we had to reverse engineer it, and it is now documented on our site. The client-visible APIs are better documented. The CQL language syntax and the client-server protocol are documented. It is quite easy to re-implement them and get the exact same behavior.
Dj: Is there anything you'd say to a Cassandra user about why they should run on Scylla, beyond just performance?
Avi: One of the things that we offer beyond performance is something that we call workload conditioning. That is a set of auto-tuning facilities that make managing a Scylla cluster much easier than managing a Cassandra cluster. If you look at the experience that people have with Cassandra, they often run into issues where they have a huge backlog of compactions that Cassandra is not able to clear, or their memory fills up causing timeouts. Instead, in Scylla we have a set of internal measurements that Scylla takes continuously.
These are available to the user but they're also used internally to close the feedback loop. So if Scylla sees, for example, that compaction is running behind and is generating a large backlog, it is able to dynamically increase compaction throughput at the expense of write throughput. This way it is able to clear the backlog or prevent it from accumulating in the first place. The administrator does not need to keep tuning and re-tuning the database to get reasonable performance. Instead the database knows exactly which processes need extra resources and it is able to redirect those resources wherever they are needed.
We were able to do this, first, because we have all of these internal measurements and, second, because we have resource schedulers that allow us to prioritize different internal processes. So, if we're talking about disk IO bandwidth for example, we can change the fraction of the disk bandwidth that goes to queries, writes, compaction, repair, and other internal processes, and this can be changed dynamically and automatically.
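The feedback loop Avi describes can be illustrated with a minimal sketch. This is not Scylla's actual controller: the `IoShares` type, the `adjust_shares` function, the proportional rule, and the minimum and maximum share values are all assumptions made for illustration. It only shows the general idea of turning a measured compaction backlog into a split of disk bandwidth between compaction and writes.

```python
from dataclasses import dataclass


@dataclass
class IoShares:
    """Hypothetical split of disk bandwidth between two internal processes."""
    compaction: float  # fraction of disk bandwidth, 0.0..1.0
    writes: float      # fraction of disk bandwidth, 0.0..1.0


def adjust_shares(backlog_bytes, target_backlog_bytes,
                  min_compaction=0.1, max_compaction=0.8):
    """Map a measured compaction backlog to a bandwidth split.

    Assumed proportional rule: the compaction share grows linearly with the
    backlog until it reaches max_compaction; writes get whatever is left.
    """
    # Pressure is 0.0 with no backlog and is capped at 1.0 at or above target.
    pressure = min(backlog_bytes / target_backlog_bytes, 1.0)
    compaction = min_compaction + pressure * (max_compaction - min_compaction)
    return IoShares(compaction=compaction, writes=1.0 - compaction)


if __name__ == "__main__":
    # Re-evaluate the split as the (made-up) backlog measurements change.
    for backlog in (0, 250, 500, 2000):
        print(backlog, adjust_shares(backlog, target_backlog_bytes=1000))
```

Calling a function like this on every measurement interval is what lets the split follow the workload without an administrator changing settings by hand.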
Dj: So from, say, Compose's point of view, that makes it ideal as a hosted database, because people don't have to spend all their time fiddling with the controls when it's tuning itself.
Avi: It's definitely good for a service provider but it's also great for a regular user that just manages their own database. Fiddling with tuning is time consuming and you never know when you get it right. Even if you do, your workload can change day-by-day or even minute-by-minute and you don't want to keep going back and retuning it. You want your database to be able to handle changing workloads.
Dj: So high performance, self-managing; what else has Scylla got?
Avi: We have a lot of plans for the future. We plan to utilize the high performance and the self-managing ability in order to allow multi-tenancy so you could have a large physical cluster and you could allow multiple different users to carve out slices of this cluster for themselves. This way, instead of provisioning a lot of small clusters that each need to be managed separately, they could have a large physical cluster with a single administration point and request a slice of that cluster in a very similar way to what people do with virtual machines.
Another reason to use Scylla is low latency. With Cassandra, a lot of people experience problems with latency and garbage collection: from time to time the JVM will simply stop everything it is doing. This is called "stop the world". The JVM will just do garbage collection and it will not respond to any client request. For some applications this can be very detrimental. In Scylla nothing like that happens. It is always responsive, and if you have a latency-sensitive application Scylla can make your life a lot easier.
Dj: Are you, in terms of actual database features, always going to be tracking Cassandra or is there anywhere that you're looking to take a lead?
Avi: Indeed we are going to do both. So we are going to track Cassandra in order to not fragment the ecosystem. We are going to make sure that database drivers and applications keep working with Cassandra or with Scylla so that users have an easy migration path but we are also going to implement our own features. Multi-tenancy is one such feature that we will implement independently from Cassandra and there will be others.
Dj: You are implementing Scylla in C++, so, just for my curiosity, what toolchain do you use?
Avi: We're using GCC but we're also looking at using Clang. It's really good now that compilers have become more competitive with Clang coming onto the scene. We're seeing a lot of movement in terms of C++ language support and optimizers. We are also interested in integrating LLVM just-in-time compilation so that we can optimize client queries on the fly. So when a query comes in, we can compile it into machine code instead of interpreting it, and get an even larger performance advantage.
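The difference between interpreting a query and compiling it can be sketched as follows. This is only an analogy under stated assumptions: it builds Python closures rather than LLVM machine code, and the tuple-based expression format and the `interpret` and `compile_expr` functions are hypothetical, not part of Scylla or CQL. What it shows is the "analyze the query once, then run the result on every row" idea.

```python
import operator

# Hypothetical comparison operators supported by this toy query format.
OPS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def interpret(expr, row):
    """Interpreted path: walk the expression tree again for every row."""
    kind = expr[0]
    if kind == "and":
        return all(interpret(sub, row) for sub in expr[1:])
    _, column, value = expr
    return OPS[kind](row[column], value)


def compile_expr(expr):
    """Compiled path: walk the tree once and return a function of the row."""
    kind = expr[0]
    if kind == "and":
        parts = [compile_expr(sub) for sub in expr[1:]]
        return lambda row: all(part(row) for part in parts)
    _, column, value = expr
    op = OPS[kind]  # the operator lookup happens once, not once per row
    return lambda row: op(row[column], value)


if __name__ == "__main__":
    query = ("and", ("gt", "age", 30), ("eq", "city", "Paris"))
    rows = [
        {"age": 42, "city": "Paris"},
        {"age": 25, "city": "Paris"},
        {"age": 51, "city": "Oslo"},
    ]
    matches = compile_expr(query)  # built once, reused for every row
    print([row for row in rows if matches(row)])
```

Both paths return the same answers; the compiled form simply moves the work of inspecting the query out of the per-row loop, which is where a real JIT compiler would go further by emitting machine code.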
Dj: I'd assume you'd be piling all of that into a super query optimizer. You'd have a chance to do an awful lot of optimization.
Avi: Yes, indeed. We really could utilize the CPU even better than we do now.
Dj: This is a more general question - from your point of view what's the most interesting thing happening in the database space at the moment?
Avi: Well, from my very biased point of view, I think Scylla is the most exciting thing. It's really bringing a different way of doing things. Instead of using a managed language, which makes life maybe a little easier for the database implementer, we're using a native implementation, which means that we can bring a much more robust implementation in terms of performance and latency variation.
We are also relying less and less on the operating system and doing things much more internally and again that brings better resource utilization and better performance because we can use special purpose algorithms in the database itself instead of using the general purpose algorithms that come with the operating system. This is mostly about managing IO, doing IO scheduling, and doing caching. Instead of relying on the operating system page cache we have a specialized cache in Scylla.
Dj: The reason I ask is because right behind you on the whiteboard is a bunch of other database names, which looks very much like somebody was doing some competitive analysis.
Avi: Yeah, this is mainly a map to show the location of Scylla in the panorama of NoSQL databases. There is a huge variation in the feature sets and the ease of use and the different guarantees that all of those databases provide. One of the reasons that we picked Cassandra as our base was because Cassandra had such great features in terms of high availability and cross datacenter replication. It is really unmatched in the NoSQL data space in this area but the implementation was really lacking because it used the JVM and it was not optimized for performance. So we thought that we could combine the good features of Cassandra with a great implementation that provides good performance and really good latency.
Dj: And that's why Compose selected ScyllaDB as our latest database. Thanks Avi.
Avi: Thank you.