Sometimes we need to make multiple calls to Redis in order to manipulate multiple structures at the same time. Though there are a few commands to copy or move items between keys, there isn’t a single command to move items between types (though you can copy from a SET to a ZSET with ZUNIONSTORE). For operations involving multiple keys (of the same or different types), Redis has five commands that help us operate on multiple keys without interruption: WATCH, MULTI, EXEC, UNWATCH, and DISCARD.
For now, we’ll only talk about the simplest version of a Redis transaction, which uses MULTI and EXEC. If you want to see an example that uses WATCH, MULTI, EXEC, and UNWATCH, you can skip ahead to section 4.4, where I explain why you’d need to use WATCH and UNWATCH with MULTI and EXEC.
In Redis, a basic transaction involving MULTI and EXEC is meant to give one client the opportunity to execute multiple commands A, B, C, … without other clients being able to interrupt them. This isn’t the same as a relational database transaction, which can be executed partially and then rolled back or committed. In Redis, the commands passed as part of a basic MULTI/EXEC transaction are executed one after another until all of them have completed. Only then may other clients execute their commands.
To perform a transaction in Redis, we first call MULTI, followed by any sequence of commands we intend to execute, followed by EXEC. On seeing MULTI, Redis queues up commands from that same connection until it sees an EXEC, at which point it executes the queued commands sequentially without interruption. Our Python library handles this through what’s called a pipeline. Calling the pipeline() method on a connection object creates a transaction, which when used correctly automatically wraps a sequence of commands with MULTI and EXEC. Incidentally, the Python Redis client also holds the commands on the client side until we actually want to send them. This reduces the number of round trips between Redis and the client, which can improve the performance of a sequence of commands.
As was the case with PUBLISH and SUBSCRIBE, the simplest way to demonstrate the result of using a transaction is through the use of threads. In the next listing, you can see the result of parallel increment operations without a transaction.
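A demonstration along these lines can be sketched as follows. This is a minimal sketch rather than a definitive listing: it assumes `conn` is a redis-py client (for example, `redis.Redis()`) connected to a running server, and the helper names `notrans()` and `run_threads()`, the `results` list, and the default arguments are all illustrative.

```python
import threading
import time


def notrans(conn, results, key="notrans:", delay=0.1):
    # Increment the counter and record the value Redis reports back.
    results.append(conn.incr(key))
    # Pause so the other threads can issue their own increments.
    time.sleep(delay)
    # Undo the increment with a second, separate call.
    conn.incr(key, -1)


def run_threads(target, conn, count=3):
    # Start `count` threads running `target`, wait for them to finish,
    # and return the values they recorded.
    results = []
    threads = [
        threading.Thread(target=target, args=(conn, results))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_threads(notrans, conn)` returns the counter value each thread observed right after its own increment.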
Without transactions, each of the three threads is able to increment the notrans: counter before the decrement comes through. We exaggerate potential issues here by including a 100 ms sleep, but if we needed to perform these two calls without other commands getting in the way, we’d have issues. The following listing shows these same operations with a transaction.
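A transactional version of the demonstration can be sketched as follows. As before, this is an illustration under assumptions: `conn` is assumed to be a redis-py client, whose `pipeline()` method wraps the queued commands in MULTI and EXEC by default, and `trans()` and `run_threads()` are hypothetical helper names.

```python
import threading
import time


def trans(conn, results, key="trans:", delay=0.1):
    # Create a transactional pipeline (MULTI/EXEC is the redis-py default).
    pipeline = conn.pipeline()
    # Queue the increment; nothing is executed yet.
    pipeline.incr(key)
    # Pause so the other threads can queue their own commands.
    time.sleep(delay)
    # Queue the matching decrement.
    pipeline.incr(key, -1)
    # Send both commands; execute() returns one result per queued command,
    # so the first item is the value produced by the increment.
    results.append(pipeline.execute()[0])


def run_threads(target, conn, count=3):
    # Start `count` threads running `target`, wait for them to finish,
    # and return the values they recorded.
    results = []
    threads = [
        threading.Thread(target=target, args=(conn, results))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_threads(trans, conn)` returns the value each thread saw after its own increment.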
As you can see, by using a transaction, each thread is able to execute its entire sequence of commands without other threads interrupting it, despite the delay between the two calls. Again, this is because Redis doesn’t execute any of the commands sent between MULTI and EXEC until it has received all of them, followed by the EXEC itself.
There are both benefits and drawbacks to using transactions, which we’ll discuss further in section 4.4.
One of the primary purposes of MULTI/EXEC transactions is removing what are known as race conditions, which you saw exposed in listing 3.13. It turns out that the article_vote() function from chapter 1 has a race condition and a second related bug. The race condition can cause a memory leak, and the bug can cause a vote to not be counted correctly. The chance of either of them happening is very small, but can you spot and fix them? Hint: If you’re having difficulty finding the memory leak, check out section 6.2.5 while consulting the post_article() function.
A secondary purpose of using pipelines in Redis is to improve performance (we’ll talk more about this in sections 4.4–4.6). In particular, by reducing the number of round trips between Redis and our client that occur over a sequence of commands, we can significantly reduce the amount of time our client is waiting for a response. In the get_articles() function we defined in chapter 1, there will actually be 26 round trips between Redis and the client to fetch a full page of articles. This is a waste. Can you change get_articles() so that it only makes two round trips?
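As a hint at the general technique, the sketch below batches several HGETALL calls into a single round trip. It is a minimal illustration rather than a solution to the exercise: `fetch_hashes()` is a hypothetical helper, `conn` is assumed to be a redis-py client, and the key names in the usage note are only examples.

```python
def fetch_hashes(conn, keys):
    # Create a pipeline; redis-py buffers the queued commands locally.
    pipe = conn.pipeline()
    for key in keys:
        # Queue one HGETALL per key; nothing is sent to Redis yet.
        pipe.hgetall(key)
    # Send everything at once. The results come back as a list in the
    # same order the commands were queued.
    return pipe.execute()
```

For example, `fetch_hashes(conn, ["article:1", "article:2"])` would return a list of two dictionaries after one round trip, instead of making one round trip per key.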
When writing data to Redis, sometimes the data is only going to be useful for a short period of time. We can manually delete this data after that time has elapsed, or we can have Redis automatically delete the data itself by using key expiration.
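The second option can be sketched in a few lines. This is only an illustration: `set_temporarily()` is a hypothetical helper, and `conn` is assumed to be a redis-py client, whose `set()`, `expire()`, and `ttl()` methods correspond to the SET, EXPIRE, and TTL commands.

```python
def set_temporarily(conn, key, value, seconds):
    # Store the value as usual.
    conn.set(key, value)
    # Ask Redis to delete the key automatically after `seconds` seconds.
    conn.expire(key, seconds)
    # Report how many seconds the key has left to live.
    return conn.ttl(key)
```

Once the timeout has elapsed, a later `conn.get(key)` would return `None`, with no manual delete required.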