Building a mostly correct lock in Redis is easy. Building a completely correct lock in Redis isn’t much more difficult, but requires being extra careful about the operations we use to build it. In this first version, we’re not going to handle the case where a lock times out, or the case where the holder of the lock crashes and doesn’t release the lock. Don’t worry; we’ll get to those cases in the next section, but for now, let’s just get basic locking correct.
The first part of making sure that no other code can run is to acquire the lock. The natural building block to use for acquiring a lock is the SETNX command, which will only set a value if the key doesn’t already exist. We’ll set the value to be a unique identifier to ensure that no other process can get the lock, and the unique identifier we’ll use is a 128-bit randomly generated UUID.
If we fail to acquire the lock initially, we’ll retry until we acquire the lock, or until a specified timeout has passed, whichever comes first, as shown here.
def acquire_lock(conn, lockname, acquire_timeout=10):
identifier = str(uuid.uuid4())
A 128-bit random identifier.
end = time.time() + acquire_timeout while time.time() < end:
if conn.setnx('lock:' + lockname, identifier):
Get the lock.
return identifier time.sleep return False
As described, we’ll attempt to acquire the lock by using SETNX to set the value of the lock’s key only if it doesn’t already exist. On failure, we’ll continue to attempt this until we’ve run out of time (which defaults to 10 seconds).
Now that we have the lock, we can perform our buying or selling without WATCH errors getting in our way. We’ll acquire the lock and, just like before, check the price of the item, make sure that the buyer has enough money, and if so, transfer the money and item. When completed, we release the lock. The code for this can be seen next.
def purchase_item_with_lock(conn, buyerid, itemid, sellerid): buyer = "users:%s"%buyerid seller = "users:%s"%sellerid item = "%s.%s"%(itemid, sellerid) inventory = "inventory:%s"%buyerid end = time.time() + 30
locked = acquire_lock(conn, market)
Get the lock.
return False pipe = conn.pipeline(True) try: while time.time() < end: try: pipe.watch(buyer)
pipe.zscore("market:", item) pipe.hget(buyer, 'funds') price, funds = pipe.execute() if price is None or price > funds: pipe.unwatch() return None
Check for a sold item or insufficient funds.
pipe.hincrby(seller, int(price)) pipe.hincrby(buyerid, int(-price)) pipe.sadd(inventory, itemid) pipe.zrem("market:", item) pipe.execute()
Transfer funds from the buyer to the seller, and transfer the item to the buyer.
return True
except redis.exceptions.WatchError: pass finally:
release_lock(conn, market, locked)
Release the lock.
Looking through the code listing, it almost seems like we’re locking the operation. But don’t be fooled—we’re locking the market data, and the lock must exist while we’re operating on the data, which is why it surrounds the code performing the operation. To release the lock, we have to be at least as careful as when acquiring the lock.
Between the time when we acquired the lock and when we’re trying to release it, someone may have done bad things to the lock. To release the lock, we need to WATCH the lock key, and then check to make sure that the value is still the same as what we set it to before we delete it. This also prevents us from releasing a lock multiple times. The release_lock() function is shown next.
def release_lock(conn, lockname, identifier): pipe = conn.pipeline(True) lockname = 'lock:' + lockname while True: try:
pipe.watch(lockname) if pipe.get(lockname) == identifier:
Check and verify that we still have the lock.
pipe.multi() pipe.delete(lockname) pipe.execute() return True
Release the lock.
pipe.unwatch() break
*
except redis.exceptions.WatchError: pass
Someone else did something with the lock; retry.
return False
We lost the lock.
We take many of the same steps to ensure that our lock hasn’t changed as we did with our money transfer in the first place. But if you think about our release lock function for long enough, you’ll (reasonably) come to the conclusion that, except in very rare situations, we don’t need to repeatedly loop. But the next version of the acquire lock function that supports timeouts, if accidentally mixed with earlier versions (also unlikely, but anything is possible with code), could cause the release lock transaction to fail and could leave the lock in the acquired state for longer than necessary. So, just to be extra careful, and to guarantee correctness in as many situations as possible, we’ll err on the side of caution.
After we’ve wrapped our calls with locks, we can perform the same simulation of buying and selling as we did before. In table 6.2, we have new rows that use the lockbased buying and selling code, which are shown below our earlier rows.
Though we generally have lower total number of items that finished being listed, we never retry, and our number of listed items compared to our number of purchased
Listed items |
Bought items |
Purchase retries |
Average wait per purchase |
|
---|---|---|---|---|
1 lister, 1 buyer, no lock |
145,000 |
27,000 |
80,000 |
14ms |
1 lister, 1 buyer, with lock |
51,000 |
50,000 |
0 |
1ms |
5 listers, 1 buyer, no lock |
331,000 |
<200 |
50,000 |
150ms |
5 listers, 1 buyer, with lock |
68,000 |
13,000 |
<10 |
5ms |
5 listers, 5 buyers, no lock |
206,000 |
<600 |
161,000 |
498ms |
5 listers, 5 buyers, with lock |
21,000 |
20,500 |
0 |
14ms |
items is close to the ratio of number of listers to buyers. At this point, we’re running at the limit of contention between the different listing and buying processes.
1 Having tested a few available Redis lock implementations that include support for timeouts, I was able to induce lock duplication on at least half of the lock implementations with just five clients acquiring and releasing the same lock over 10 seconds.