We are now, simply, Redis

Learn More

e-Book - Redis in Action

This book covers the use of Redis, an in-memory database/data structure server

  • Redis in Action – Home
  • Foreword
  • Preface
  • Acknowledgments
  • About this Book
  • About the Cover Illustration
  • Part 1: Getting Started
  • Part 2: Core concepts
  • Part 3: Next steps
  • Appendix A
  • Appendix B
  • Buy the paperback
  • Redis in Action – Home
  • Foreword
  • Preface
  • Acknowledgments
  • About this Book
  • About the Cover Illustration
  • Part 1: Getting Started
  • Part 2: Core concepts
  • Part 3: Next steps
  • Appendix A
  • Appendix B
  • Buy the paperback

    6.3.4 Preventing race conditions

    As you saw when building locks in section 6.2, dealing with race conditions that cause
    retries or data corruption can be difficult. In this case, the semaphores that we created
    have race conditions that we alluded to earlier, which can cause incorrect operation.

    We can see the problem in the following example. If we have two processes A and
    B that are trying to get one remaining semaphore, and A increments the counter first
    but B adds its identifier to the ZSETs and checks its identifier’s rank first, then B will
    get the semaphore. When A then adds its identifier and checks its rank, it’ll “steal” the
    semaphore from B, but B won’t know until it tries to release or renew the semaphore.

    When we were using the system clock as a way of getting a lock, the likelihood of
    this kind of a race condition coming up and resulting in more than the desired number
    of semaphore owners was related to the difference in system times—the greater
    the difference, the greater the likelihood. After introducing the counter with the
    owner ZSET, this problem became less likely (just by virtue of removing the system
    clock as a variable), but because we have multiple round trips, it’s still possible.

    To fully handle all possible race conditions for semaphores in Redis, we need to
    reuse the earlier distributed lock with timeouts that we built in section 6.2.5. We need
    to use our earlier lock to help build a correct counting semaphore. Overall, to acquire
    the semaphore, we’ll first try to acquire the lock for the semaphore with a short timeout.
    If we got the lock, we then perform our normal semaphore acquire operations
    with the counter, owner ZSET, and the system time ZSET. If we failed to acquire the
    lock, then we say that we also failed to acquire the semaphore. The code for performing
    this operation is shown next.

    Listing 6.17The acquire_semaphore_with_lock() function
    def acquire_semaphore_with_lock(conn, semname, limit, timeout=10):
       identifier = acquire_lock(conn, semname, acquire_timeout=.01)
       if identifier:
             return acquire_fair_semaphore(conn, semname, limit, timeout)
          release_lock(conn, semname, identifier)

    I know, it can be disappointing to come so far only to end up needing to use a lock at
    the end. But that’s the thing with Redis: there are usually a few ways to solve the same
    or a similar problem, each with different trade-offs. Here are some of the trade-offs
    for the different counting semaphores that we’ve built:

    • If you’re happy with using the system clock, never need to refresh the semaphore,
      and are okay with occasionally going over the limit, then you can use the
      first semaphore we created.
    • If you can only really trust system clocks to be within 1 or 2 seconds, but are still
      okay with occasionally going over your semaphore limit, then you can use the
      second one.
    • If you need your semaphores to be correct every single time, then you can use a
      lock to guarantee correctness.

    Now that we’ve used our lock from section 6.2 to help us fix the race condition, we
    have varying options for how strict we want to be with our semaphore limits. Generally it’s a good idea to stick with the last, strictest version. Not only is the last semaphore
    actually correct, but whatever time we may save using a simpler semaphore, we could lose by using too many resources.

    In this section, we used semaphores to limit the number of API calls that can be running
    at any one time. Another common use case is to limit concurrent requests to databases
    to reduce individual query times and to prevent dogpiling like we talked about at
    the end of section 6.2.4. One other common situation is when we’re trying to download
    many web pages from a server, but their robots.txt says that we can only make (for example)
    three requests at a time. If we have many clients downloading web pages, we can
    use a semaphore to ensure that we aren’t pushing a given server too hard.

    As we finish with building locks and semaphores to help improve performance for
    concurrent execution, it’s now time to talk about using them in more situations. In
    the next section, we’ll build two different types of task queues for delayed and concurrent
    task execution.