At AgilData, we have many years of experience running production MySQL infrastructures at scale. For example, we run a 64-server sharded MySQL cluster for Pokémon GO. One component of our AgilData Scalable Cluster infrastructure is a MySQL proxy server, written in Java, that intercepts queries and executes them against a sharded database cluster. We go beyond the simple single-server routing provided by most sharding solutions and implement full distributed transactions and federated queries across the shards, which allows aggregate queries to be used and minimizes application changes.
While Java and Scala most definitely can work at scale, developers inevitably end up working around the JVM platform to some degree to reduce the cost of garbage collection. Apache Spark is a great example of this. Some brilliant engineering has gone into Spark to make it run on the JVM while not really using the JVM, from dynamic bytecode generation to custom off-heap memory management.
We’re currently working on a new product that requires a MySQL proxy, and this time we decided to build it in Rust and release it as open source. Our timing for this project turned out to be quite fortuitous, thanks to the recent release of two key crates, tokio-rs and futures-rs, which provide a great foundation for scalable asynchronous I/O.
Proxy Overview
The overall design of the proxy is pretty simple. We start up a server that binds to a socket and listens for incoming connections. For each incoming connection, we spin up a Future to service requests from that socket and pass it to the tokio-core reactor, which controls the event loop.
This code sample demonstrates the use of future combinators to chain together a number of futures. The call to TcpStream::connect() on line 5 (to connect to the MySQL server) does not actually connect right away; it returns a future of that connection. Rather than waiting here, we can chain this future together with further operations using the and_then() call. The input to each and_then() function is the output of the previous future.
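To make the chaining idea concrete without depending on any particular version of tokio-core or futures-rs, here is a minimal sketch that uses only the standard library. It is not the proxy's code and not the futures-rs API: `Lazy`, its `and_then()` and `run()` methods, and `demo()` are hypothetical names invented for illustration, and the "connect" and "handshake" steps only record log entries rather than touching the network. What it does show are the two properties described above: building the chain performs no work, and each `and_then()` step receives the output of the step before it.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for a future: a boxed closure that does nothing
// until `run` is called. This is NOT the futures-rs `Future` trait.
struct Lazy<T, E> {
    work: Box<dyn FnOnce() -> Result<T, E>>,
}

impl<T: 'static, E: 'static> Lazy<T, E> {
    fn new<F>(f: F) -> Self
    where
        F: FnOnce() -> Result<T, E> + 'static,
    {
        Lazy { work: Box::new(f) }
    }

    // Chain a second step that receives the first step's output.
    // An error from the first step short-circuits the rest of the chain.
    fn and_then<U, F>(self, next: F) -> Lazy<U, E>
    where
        U: 'static,
        F: FnOnce(T) -> Lazy<U, E> + 'static,
    {
        Lazy::new(move || {
            let value = (self.work)()?;
            next(value).run()
        })
    }

    // Drive the chain to completion (in the proxy, the reactor plays this role).
    fn run(self) -> Result<T, E> {
        (self.work)()
    }
}

// Builds a two-step chain and reports how many log entries existed before
// the chain was run, the final result, and the log afterwards.
fn demo() -> (usize, Result<String, String>, Vec<String>) {
    let log: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
    let (l1, l2) = (log.clone(), log.clone());

    // Step 1: pretend to connect to the MySQL server.
    let connect: Lazy<String, String> = Lazy::new(move || {
        l1.borrow_mut().push("connect".to_string());
        Ok("conn#1".to_string())
    });

    // Step 2: receives step 1's output (the pretend connection).
    let chain: Lazy<String, String> = connect.and_then(move |conn: String| {
        Lazy::new(move || {
            l2.borrow_mut().push("handshake".to_string());
            Ok(format!("{} ready", conn))
        })
    });

    let before = log.borrow().len(); // nothing has executed yet
    let result = chain.run();
    let events: Vec<String> = log.borrow().clone();
    (before, result, events)
}
```

Calling `demo()` from a `main` function returns a count of zero log entries before the chain runs, which mirrors how TcpStream::connect() hands back a future rather than an open connection. Real futures are polled by an event loop instead of being run synchronously, so treat this purely as a model of the combinator shape.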