Techniques for Scaling Applications with a Database

Scaling is one of the hardest tasks to accomplish with a database. This article explains how scaling works, under the hood.

Feb 22nd, 2023 2:54am by Lee Atchison

Featued image for: Techniques for Scaling Applications with a Database

This is the second article exploring the capabilities of Redis. Part I shared the basic data types and models to get started with Redis.

Applications grow. As an application attracts more users, so do the databases that store the information created, whether that’s sales transactions or scientific data gathered. Larger datasets require more resources to store and process that data. Plus, with more simultaneous users using the system, the database needs more resources.

When your application becomes popular, it needs to scale to meet the demand. Nobody sticks around if an application is slow — not willingly, anyway.

If you need to scale, celebrate it as a good problem! But that doesn’t make the process simple. Scaling has multiple possible options, each requiring different levels of sophistication. Here, we cover scaling both as a generic challenge and specifically for Redis databases, with attention to advanced scaling using Redis Enterprise.

Scaling Concepts

Scaling is a multidimensional problem with several distinct solutions.

Vertical scaling involves increasing a database’s resources. Typically, this involves moving the database to a more powerful computer or to a larger instance type.

“More” is the key word. As with any hardware choice, you consider more powerful processors, more memory and/or more network bandwidth. You have to find a balance between them that optimally improves the database’s performance and the number of simultaneous users it can support, not to mention optimizing your hardware budget.

One element in vertical scaling is adjusting the amount of RAM available to the database. In the case of Redis, RAM limits the amount of data that the database can store, so it’s an important consideration.

Vertical scaling is colloquially called scaling up or scaling down, depending on whether you move up to a more powerful computer or (in rare circumstances) shift down to a less powerful computer.

Horizontal scaling involves adding additional computer nodes to a cluster of instances that operate the database, without changing the size or capacity of any individual node. Horizontal scaling is also called scaling out (when you add nodes) or scaling in (when you decrease the number of nodes).

Depending on how it’s implemented, horizontal scaling can also improve the database’s overall reliability. It eliminates a single point of failure because you are increasing the number of nodes that can be used in failover situations. However, horizontal scaling also increases time and effort (and thus costs), because you need more nodes (and hence more failure points) to keep the database functional.

In other words, vertical scaling increases the size and computing power of a single instance or node, while horizontal scaling changes the number of nodes or instances.

Vertical scaling is an easy way to improve database performance, assuming that you have or can acquire a larger computer or instance. It typically can be implemented easily in the cloud, with no impact on the application or database architecture.

Complexity isn’t a bad thing when it’s the right choice, as long as you know what you are doing.

When done correctly, horizontal scaling gives your database and your application significantly more room to grow. This scheme has plenty of history in responding to performance bottlenecks: Just throw more hardware at it!

However, horizontal scaling typically is harder to implement than vertical scaling. Adding additional nodes means more complexity. Are those nodes read-only nodes? Read/write master nodes? Active masters? Passive masters? The complexity of your database and your application architecture can increase dramatically.

Complexity isn’t a bad thing when it’s the right choice, though, as long as you know what you are doing.

There are several ways to implement horizontal scaling, each with a distinct set of advantages and disadvantages. Selecting the right model is important in building a data storage architecture. Redis supports many horizontal scaling options. Some are available in Redis open source (OSS), and some are available only in Redis Enterprise.

The Basics of Sharding

Sharding is a technique for improving a database’s overall performance, as well as increasing its storage and resource limits. It’s a relatively simple horizontal scaling technique.

With sharding, data is distributed across various partitions, or nodes. Each node holds only a portion of the data stored in the entire database. In Redis’ case, a key/value input is processed, and the data is stored in a shard.

When a request is made to the database, it is sent to a shard selector, which chooses the appropriate shard to send the request. In Redis, shard selection is often implemented by a proxy that looks at the key for the requested data, and based on the key, the proxy sends the request to the appropriate shard instance.

The shard selection algorithm is deterministic, which means every request for a given key always goes to the same shard. Only that shard has information for a given data key, as illustrated by Figure 1.

Figure 1. Horizontal scaling via sharding

Sharding is a relatively easy way to scale out a database’s capacity. By adding three shards to a Redis OSS implementation, for instance, you can nearly triple the database’s performance and triple the storage limits.

But sharding isn’t always simple. Choosing a shard selector that effectively balances traffic across all nodes may require tuning. Sharding can also lower application availability because it increases dependency on multiple instances. If you don’t manage it properly, failure of a single instance can bring down the entire database. That’ll cause a bad day at work.

Redis clustering addresses these issues, and also makes sharding easier to implement. If resharding is necessary to rebalance a database for reasons of storage capacity or performance, the data is physically moved to a new node.

An application’s awareness of the shard selector algorithm can allow the application to perform better overall balancing across the shards, though at the cost of increased complexity.

Sharding effectiveness is only as good as the shard selector algorithm that is used. An application’s awareness of the shard selector algorithm can allow the application to perform better overall balancing across the shards, though at the cost of increased complexity.

Clustering in Redis OSS is led by the cluster with the client library being cluster-aware. Essentially, the shard selector is implemented in the client library. This requires client-side support of the clustering protocol.

In Redis Enterprise, a server-side proxy is used to implement the shard selector and to provide support for clustering server side. The proxy acts as a load balancer of sorts between the horizontally scaled Redis instances.

Clustering is a common solution to horizontal scaling, but it has pros and cons. On the plus side, sharding is an effective way to quickly scale an application, and it is used in many large, highly scaled applications. Also, it is available out of the box.

On the other hand, clustering requires additional management. You need to know what you’re doing. Individual, large keys can create imbalances that are difficult or impossible to compensate for.

Redis clustering eliminates much of shardings’ complexities. It allows applications to focus on the data management aspects of scaling a large dataset more effectively. It improves both write and read performance.

Ultimately, how well it all works depends on the access patterns that the application uses.

Read Replicas

Another horizontal scaling option is read replicas. As the name suggests, the emphasis of read replicas is to improve the performance of reading data without regard to the time spent writing data to the database. The premise is that it is far more common to retrieve data than to change it or to add new data.

In a simple database, data is stored on a single server, and both read and write access to the data occurs on that server. With read replicas, a copy of the server’s data is stored on auxiliary servers, called read replicas. Whenever data is updated, the replicas receive updates from the primary server.

Each auxiliary server has a complete copy of the database. So when an application makes a read request, that request can go to any of the read replica servers. That means a significantly greater number of read requests can be handled simultaneously, which improves scalability and overall performance.

Read replicas cannot improve write performance, but they can increase read performance significantly.

But read replicas have limitations, depending on several factors, such as the consistency model that the database uses, or the network latency you need to contend with.

Read replicas cannot improve write performance, but they can increase read performance significantly. However, that does require you to consider how write-intensive your application is. It takes some time for a database write to the master database to propagate to the read replicas. This delay, called skew, can result in older data being returned to the application while the primary server updates the replica servers. The delay is only for a short period of time, but sometimes those delays are critical. This may or may not be an issue for your own situation, but take note of the issue as you design your system.

Think about the process of writing to a database.

When you update information or add new data, the write is performed to the master database instance only. That’s sacrosanct; all writes must go to the one master database instance.
This master instance then sends a message to all of the read replicas, indicating what data has changed in the database, and enabling the read replicas to update their copies of the data to match the master copy.

Since all database writes go through the master instance, there is no write performance improvement when additional read replicas are added. In fact, there can be a minor decrease in write performance when you add a new read replica. That’s because the master now has an additional node it must notify when a write occurs. Typically, this impact is not significant, but it’s certainly not a zero impact.

Consider the illustration in Figure 2, which shows a Redis implementation consisting of three servers. All writes to the Redis database are made to the single master database. This single master sends updates of the changed data to all the replicas. Each replica contains a complete copy of the stored Redis database.

Then, when the application wants to retrieve data, the read access to the Redis instance can occur on any server in the cluster. A load balancer takes care of routing the individual read requests, which directs traffic using one of a number of load balancing algorithms. (There are several load balancing algorithms, including round robin, least used, etc., but they are outside the scope of this discussion.)

Figure 2. Horizontal scalability with read replicas

Another benefit to using read replicas is in availability improvement. If a read replica crashes, the load balancer simply redirects traffic to another read replica. If the write master crashes, you can promote one of the read replicas to the role of master, so the system can stay operational.

Read replicas are an easy-to-implement model for horizontal scalability, and the method improves availability with little or no application impact.

Active-Active

Active-Active replication or Active-Active clustering is a way to improve performance for higher database loads.

As with read replicas, Active-Active (also called multimaster replication) relies on a database cluster with multiple nodes, with a copy of the database stored on all the nodes and a load balancer distributing the load.

With Active-Active replication, however, both read and write requests are distributed across multiple servers, and load balanced among all the nodes. The performance boost is meaningful, because a significantly larger number of requests can be handled, and they are handled faster.

Note that Active-Active replication is not supported directly by Redis OSS. If this turns out to be the appropriate scaling architecture for your needs, you’ll need Redis Enterprise. But the focus here is in explaining the computer science technique, no matter where you get it from (including building it yourself, if you have that sort of time).

With Active-Active replication, the read propagation happens exactly as described in the previous section.

When an application writes to one node, this database write is propagated to every master in the system. There are many ways this can occur, such as:

The application can force the write to all masters.
A write proxy can distribute the writes.
The master node that receives the write call can forward the request to other non-receiving master servers.

Figure 3 illustrates a database implementation with a cluster consisting of three servers. Each server contains a complete copy of all the data. Any server can handle any type of data request — read or write — for any data in the database.

Figure 3. Active-Active replication

When the load balancer directs a write request to the database master instance, such as in this example, it sends the update to all the other replicated instances. If a write is sent to any of the other nodes, that node sends the update to all the other replicated instances in a similar manner.

But what happens when two requests are sent to update the same data value?

In a single-node database, the requests are serialized and the changes take place in order, with the last change typically overriding previous changes.

In the Active-Active model, though, the two requests could come to different masters. The masters could then send conflicting update messages to the other master servers. This is called a write conflict.

In the case of a write conflict, the application needs to determine which database-write to keep and which to reject. That requires a resolution algorithm of some type, involving application logic or database rules.

Additionally, since updates are sent to each node asynchronously, it’s possible for data lag to cause one node to go slightly out of sync with another node. That’s an issue even when that mismatch is only for a short period of time. Developers have to take care that the application considers this potential lag so that it does not affect operations. This is similar to the issues with read replicas, but is potentially more complex.

Besides improving performance, this model of horizontal scalability also increases overall database availability. If a single node fails, the other nodes can take up the slack. However, since each node contains a complete copy of the data, this has no impact on the storage limit of a database by adding additional servers.

The cost of this model is increased application complexity in dealing with conflicting data.

Redis Enterprise’s Active-Active Geo-Distribution

Redis OSS does not natively support multimaster redundancy in any form.

However, Redis Enterprise does Active-Active Geo-Distribution, which provides Active-Active multimaster redundancy.

Then Redis Enterprise’s Active-Active Geo-Distribution goes one step further. It enables individual clusters to be located in geographically distributed locations yet replicate data between them. Take a look at Figure 4.

Figure 4. Active-Active replication across geography

This allows a Redis database to be geographically distributed to support software instances running in different geographic locations.

In this model, multiple master database instances are in different data centers. Those can be located across different regions and around the world. Individual consumers, via the application, connect to the Redis database instance that is nearest to their geographic location. The Active-Active Redis database instances are then synchronized in a multimaster model so that each Redis instance always has a complete and up-to-date copy of the data.

Redis Enterprise’s Active-Active Geo-Distribution has sophisticated algorithms for effectively dealing with write conflicts, including implementing conflict-free, replicated data types (CRDTs) that guarantee strong eventual consistency and make the process of replication synchronization significantly more reliable. The application still must be aware of and deal with data lag and write conflicts, so these issues don’t become a problem.

What’s Right for You?

You need to make your applications run faster and support additional burden on their databases. Fortunately, as this article demonstrates, you have plenty of options for scaling techniques. Each has a different impact on the amount of storage space available to your application and on the system resources.

The technique you ultimately choose depends on many factors, including your company’s goals, your software requirements, the skills of the people in your IT department, your application architecture and how much complexity you’re willing to take on.

To learn more about how Redis Enterprise scales databases — with more diagrams, which are always fun — consult “Linear Scaling with Redis Enterprise.”

Lee Atchison is a software architect, author, public speaker, and recognized thought leader on cloud computing and application modernization. His most recent book is “Architecting for Scale” (O’Reilly Media), and he also is author of “The Definitive Guide to Caching...