Trends in Serverless Databases

Rani Kubersky

May 12, 2023

This post was originally published in Polyglot’s monthly newsletter on May 5th, 2023.

Despite the many databases available on the market, developers still confront a host of challenges when working with them, such as capacity planning and cost management. If a developer provisions too much capacity, they overspend; if they underestimate capacity needs, they risk failure. As a result, engineering teams are often forced to estimate demand for their applications and monitor their database infrastructure manually.

Serverless* databases solve this by autoscaling. Because serverless databases scale automatically and elastically, developers don’t need to monitor the database load or capacity, which saves them time. Serverless databases also abandon the model of paying for pre-provisioned clusters and instead leverage consumption-based pricing. With serverless, hobbyists and companies pay only for the resources they use.

Suddenly, it seems like every database offering has gone “serverless.” But these offerings are not all positioned in the same way (nor are they identical). AWS’ serverless version of Aurora was one of the first serverless Postgres offerings. Cockroach Labs’ (a Work-Bench Portco) serverless offering is positioned as a Postgres multi-region serverless platform with a write-scalable system that doesn't force users to manually shard.

Emerging players like Neon and Xata also offer serverless databases. Neon is open source and modeled on Aurora’s architecture, while Xata caters to a broader audience with its Airtable-like UI.

*Serverless = A misnomer. There are still servers. The server management is just abstracted away. The concept of serverless can apply to databases. Serverless databases are architected to autoscale and therefore leverage usage-based pricing. Learn more about serverless databases from MongoDB.

Trends in serverless databases:

Separation of Storage and Compute: Bringing Snowflake’s architecture to relational database management systems limits costs and speeds up scaling. CockroachDB, Neon, Aurora, SurrealDB, and AlloyDB all separate storage and compute to enable serverless. As a result, end users can scale storage and compute up or down independently of one another. Data that’s rarely accessed can be offloaded to S3, and compute clusters can be allocated as needed for loads or queries. Most importantly, data storage can shrink as the end user removes data. Similarly, compute nodes can be scaled to zero. Alternative architectures limit elasticity and require advance planning from end users, like predicting peak capacity and data demand. This can get expensive. According to Nikita Shamgunov of Neon,
“Separation of storage and compute is an architectural advantage. The reason it is one is that in the cloud, if you run any stateful technology, you are by design overprovisioning storage or compute, or network bandwidth or metadata services. Storage and compute are usually what you end up paying for. If they’re tied together, you will have all sorts of problems when you run your technology as a service. In a stateful system, you have to reconfigure a system that creates a management overhead every time you run out of storage.” [paraphrased]

The promise of separating storage and compute is that you only pay for what you use. When they’re tied together, the end user can’t take advantage of the lower price of storage (which tends to be cheaper than compute).

Why does it matter? Put simply, this architecture is novel because it differs from Postgres out of the box, which keeps storage and compute together on the same server.
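To make the pay-for-what-you-use argument concrete, here is a minimal sketch that compares billing storage and compute separately against paying for a fixed cluster sized for peak. Everything in it is an assumption for illustration: the rates, the `HourlyUsage` type, and the function names are hypothetical and do not reflect any vendor’s actual pricing or API.

```python
from dataclasses import dataclass

# Made-up placeholder rates for illustration only, not any vendor's pricing.
STORAGE_RATE_PER_GB_HOUR = 0.0002    # hypothetical
COMPUTE_RATE_PER_UNIT_HOUR = 0.10    # hypothetical


@dataclass
class HourlyUsage:
    stored_gb: float       # data actually stored during the hour
    compute_units: float   # compute actually running (0 means scaled to zero)


def decoupled_cost(usage):
    """Bill storage and compute separately, hour by hour, on actual usage."""
    return sum(
        h.stored_gb * STORAGE_RATE_PER_GB_HOUR
        + h.compute_units * COMPUTE_RATE_PER_UNIT_HOUR
        for h in usage
    )


def coupled_cost(usage):
    """Bill a fixed cluster sized for the peak of both dimensions, every hour."""
    peak_gb = max(h.stored_gb for h in usage)
    peak_units = max(h.compute_units for h in usage)
    hourly = (
        peak_gb * STORAGE_RATE_PER_GB_HOUR
        + peak_units * COMPUTE_RATE_PER_UNIT_HOUR
    )
    return hourly * len(usage)


# One day: 22 idle hours with compute scaled to zero, then a 2-hour burst.
day = [HourlyUsage(stored_gb=50, compute_units=0)] * 22 + [
    HourlyUsage(stored_gb=50, compute_units=4)
] * 2

print(f"decoupled: ${decoupled_cost(day):.2f}")
print(f"coupled:   ${coupled_cost(day):.2f}")
```

Under these assumed numbers, the gap comes almost entirely from the idle hours: the decoupled model keeps charging the cheap storage rate while compute sits at zero, whereas the coupled model pays for peak compute around the clock.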
‍
Postgres: Speaking of Postgres, developers love it right now. Serverless database companies that leverage Postgres include Neon, AWS Aurora, and Xata. For hobby projects or early-stage companies, an on-demand system like Neon is cheaper than paying for an always-on Postgres instance.

Why does it matter? Postgres is on the cusp of overtaking MySQL because most developer tooling interoperates with Postgres (Vercel, Supabase, etc.).

Developer Experience Extends to Serverless Databases: Serverless by definition improves developer experience by reducing the mental burden of server management. But there’s more to it. Most critically, serverless databases have partnered with companies like Vercel that are top of mind in the developer consciousness. On May 1, Vercel announced a strategic partnership with Neon that enables developers to integrate serverless Postgres databases into their Vercel applications. PlanetScale offers integrations for Vercel and Netlify.

Why does it matter? The JAMstack* is increasingly serverless. We will continue to see partnerships between JAMstack players like Vercel and serverless offerings that aim to promote a unified or “full-stack” development experience. According to the Jamstack Community Survey 2022, “We mentioned above that there was a big shift in the last year of people describing themselves as ‘full stack’ developers from ‘front end’ developers. We think the big jump in serverless adoption may be the explanation: serverless lets front-end developers build full-stack applications with a minimum of fuss, and the adoption has been so fast it’s changing how we describe ourselves.”

*JAMstack = a web development architecture that combines client-side JavaScript, reusable APIs, and prebuilt Markup.

Serverless DBs: Market Overview

Within Online Transaction Processing (OLTP) databases, select serverless relational databases include Neon, PlanetScale, Cockroach Labs, and Xata. Serverless non-relational databases include MongoDB, FaunaDB, and SurrealDB. The public cloud vendors have their own serverless database products, like AWS Aurora, Google Cloud’s AlloyDB, and Azure Cosmos DB.

Why Not Implement Serverless: The v2 Problem

The serverless pricing model lacks economies of scale. A project in its v2 stage may cost more on serverless than it would on a traditional architecture. Serverless is great for v1 players like hobby projects or startups in the pre-scaling, pre-product-market-fit phases. But when a hobby project or business starts getting consistent traction, costs may soar. Lee Robinson from Vercel talks about the tradeoff between serverless’ usage-based model and the risk of failure in its absence (his Tweet might have also alluded to Vercel’s markup on Neon and Upstash). In short, a serverless database is a great fit for a business whose traffic may spike unexpectedly but is not consistently high.
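The v1 versus v2 tradeoff can be sketched as a simple break-even calculation. The prices below are hypothetical round numbers chosen only to show the shape of the curve, and the function names are illustrative rather than taken from any provider.

```python
# Hypothetical prices, chosen only to illustrate the shape of the tradeoff.
SERVERLESS_RATE_PER_ACTIVE_HOUR = 0.25   # assumed usage-based compute price
PROVISIONED_FLAT_MONTHLY_FEE = 90.00     # assumed price of an always-on instance


def serverless_monthly_cost(active_hours: float) -> float:
    """Usage-based: pay only for hours the database is actually doing work."""
    return active_hours * SERVERLESS_RATE_PER_ACTIVE_HOUR


def break_even_active_hours() -> float:
    """Active hours per month above which the flat-fee instance is cheaper."""
    return PROVISIONED_FLAT_MONTHLY_FEE / SERVERLESS_RATE_PER_ACTIVE_HOUR


def cheaper_option(active_hours: float) -> str:
    """Return which pricing model costs less for a given monthly workload."""
    if serverless_monthly_cost(active_hours) <= PROVISIONED_FLAT_MONTHLY_FEE:
        return "serverless"
    return "provisioned"


workloads = [("hobby project", 40), ("spiky launch month", 200), ("steady traction", 600)]
for label, hours in workloads:
    cost = serverless_monthly_cost(hours)
    print(f"{label}: {hours} active hours, serverless ${cost:.2f}, "
          f"cheaper option: {cheaper_option(hours)}")
```

With these assumed prices, the crossover sits at 360 active hours a month, roughly half of a 30-day month. A workload that is busy most of the time lands on the wrong side of that line, which is the v2 problem in miniature.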

Resources in Serverless Databases

Here are several resources that informed these trends. I’d recommend checking them out.

2023 State of Databases for Serverless and Edge by Lee Robinson, Vercel
“More databases are embracing serverless, but what “serverless” means to them varies. There are different vectors of autoscaling: connections, storage, compute, and more.”

Open Source Startup Podcast: Building Scalable Postgres with Serverless Database Platform Neon with Nikita Shamgunov, Amanda Robson, and Timothy Chen
“If you never run out of storage, then you most likely overprovisioned storage…then you’re paying for what you’re not using. Clouds are very expensive now…The only way to mitigate the expense of the cloud is to never pay for what you’re not using.”

Building a database in the 2020s by Ed Huang
“Serverless, as many people think of it, is a technical term, but I think it's not. Serverless is more about defining what a better product on the cloud from a user experience perspective is. Or maybe that's the way it should be: why should user[s] care about how many nodes you have? Why do I need to care about your database's internal configuration? Why do I have to wait for another half hour after I click launch?”

If you’re building in serverless or generally thinking about this space, I’d love to chat with you. You can find me on Twitter @ranikubersky. Many thanks to Kelley Mak and Priyanka Somrah for their contributions to this piece.
