10.1 Scaling reads



    In chapter 8 we built a social network that offered many of the same features and
    functionality as Twitter. One of these features was the ability for users to view their
    home timeline as well as their profile timeline. When viewing these timelines, we’ll
    be fetching 30 posts at a time. For a small social network, this wouldn’t be a serious
    issue, since we could still support anywhere from 3,000–10,000 users fetching timelines
    every second (if that was all that Redis was doing). But for a larger social network, it
    wouldn’t be unexpected to need to serve many times that number of timeline fetches
    every second.

    In this section, we’ll discuss the use of read slaves to support scaling read queries
    beyond what a single Redis server can handle.

    Before we get into scaling reads, let’s first review a few opportunities for improving
    performance that we should exhaust before resorting to additional servers with slaves
    to scale our queries:

    • If we’re using small structures (as we discussed in chapter 9), first make sure
      that our max ziplist size isn’t too large to cause performance penalties.
    • Remember to use structures that offer good performance for the types of queries
      we want to perform (don’t treat LISTs like SETs; don’t fetch an entire HASH
      just to sort on the client—use a ZSET; and so on).
    • If we’re sending large objects to Redis for caching, consider compressing the
      data to reduce network bandwidth for reads and writes (compare lz4, gzip, and
      bzip2 to determine which offers the best trade-offs for size/performance for
      our uses).
    • Remember to use pipelining (with or without transactions, depending on our
      requirements) and connection pooling, as we discussed in chapter 4.
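
    The compression comparison suggested in the list above can be sketched with the
    Python standard library alone. This is a minimal sketch, not a benchmark: zlib stands
    in for gzip (both use DEFLATE), lz4 is left out because it needs a third-party package,
    and the timeline payload and the compare_codecs name are made up for illustration.

```python
import bz2
import json
import time
import zlib


def compare_codecs(payload):
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        'zlib level 1': lambda data: zlib.compress(data, 1),
        'zlib level 6': lambda data: zlib.compress(data, 6),
        'bz2 level 9': lambda data: bz2.compress(data, 9),
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(payload)
        results[name] = (len(packed), time.perf_counter() - start)
    return results


# Hypothetical cached object: a timeline of 30 small status messages.
timeline = [{'id': i, 'uid': i % 7, 'message': 'status update %d' % i}
            for i in range(30)]
payload = json.dumps(timeline).encode('utf-8')

for name, (size, seconds) in compare_codecs(payload).items():
    print('%-13s %5d bytes (from %d) in %.6fs'
          % (name, size, len(payload), seconds))
```

    Run it against samples of your own cached objects; the best trade-off depends heavily
    on object size and content.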

    When we’re doing everything that we can to ensure that reads and writes are fast, it’s
    time to address the need to perform more read queries. The simplest method to
    increase total read throughput available to Redis is to add read-only slave servers. If
    you remember from chapter 4, we can run additional servers that connect to a master,
    receive a replica of the master’s data, and be kept up to date in real time (more or
    less, depending on network bandwidth). By running our read queries against one of
    several slaves, we can gain additional read query capacity with every new slave.
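
    One way to split traffic along these lines is a small router that hands out the master
    for writes and rotates through the slaves for reads. The sketch below is only an
    illustration: ReadWriteRouter is a hypothetical helper, and the strings stand in for
    real connection objects (for example, one client per server from a Redis client library).

```python
import itertools


class ReadWriteRouter(object):
    """Send writes to the master and spread reads across slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slaves are configured.
        self._readers = itertools.cycle(list(slaves) or [master])

    def for_write(self):
        # All writes must go to the master.
        return self.master

    def for_read(self):
        # Round-robin: each call returns the next slave in turn.
        return next(self._readers)


# Stand-in names instead of real connections, to show the rotation.
router = ReadWriteRouter('master', ['slave-1', 'slave-2'])
print([router.for_read() for _ in range(4)])
# -> ['slave-1', 'slave-2', 'slave-1', 'slave-2']
print(router.for_write())
# -> master
```

    Because slaves are only kept up to date "more or less" in real time, a read sent to a
    slave immediately after a write may not see that write yet.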

    REMEMBER: WRITE TO THE MASTER When using read slaves, and generally
    when using slaves at all, you must remember to write to the master Redis
    server only. By default, attempting to write to a Redis server configured as a
    slave (even if it’s also a master) will cause that server to reply with an error.
    We’ll talk about using a configuration option to allow for writes to slave servers
    in section 10.3.1, but generally you should run with slave writes disabled;
    writing to slaves is usually an error.

    Chapter 4 has all the details on configuring Redis for replication to slave servers, how
    it works, and some ideas for scaling to many read slaves. Briefly, we can update the
    Redis configuration file with a line that reads slaveof host port, replacing host and
    port with the host and port of the master server. We can also configure a slave by running
    the SLAVEOF host port command against an existing server. Remember: When a
    slave connects to a master, any data that previously existed on the slave will be
    discarded. To disconnect a slave from a master and stop it from replicating, we can
    run SLAVEOF no one.
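
    The two SLAVEOF forms can be wrapped in small helpers. This sketch assumes a
    connection object that exposes execute_command(*args), as the redis-py client does;
    the function names and the RecordingConnection stand-in are hypothetical and exist
    only so the example runs without a server.

```python
def start_replicating(conn, master_host, master_port):
    """Make the server behind `conn` a slave of the given master.

    Any data already on that server is discarded when it syncs.
    """
    return conn.execute_command('SLAVEOF', master_host, master_port)


def stop_replicating(conn):
    """Detach the server behind `conn` from its master."""
    return conn.execute_command('SLAVEOF', 'NO', 'ONE')


class RecordingConnection(object):
    """Stand-in for a real client: records commands instead of sending them."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        self.sent.append(args)
        return 'OK'


conn = RecordingConnection()
start_replicating(conn, '10.0.0.1', 6379)   # hypothetical master address
stop_replicating(conn)
print(conn.sent)
```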

    One of the biggest issues that arises when using multiple Redis slaves to serve data
    is what happens when a master temporarily or permanently goes down. When the master
    comes back (or is replaced), its slaves all reconnect and resync at about the same
    time. Remember that when a slave connects, the Redis master initiates a snapshot. If
    multiple slaves connect before the snapshot completes, they’ll all receive the same
    snapshot. Though this is
    great from an efficiency perspective (no need to create multiple snapshots), sending
    multiple copies of a snapshot at the same time to multiple slaves can use up the majority
    of the network bandwidth available to the server. This could cause high latency to/from
    the master, and could cause previously connected slaves to become disconnected.

    One method of addressing the slave resync issue is to reduce the total data volume
    that’ll be sent between the master and its slaves. This can be done by setting up intermediate
    replication servers to form a type of tree, as can be seen in figure 10.1, which
    we borrow from chapter 4.

    These slave trees will work, and can be necessary if we’re looking to replicate to a
    different data center (resyncing across a slower WAN link will take resources, which
    should be pushed off to an intermediate slave instead of running against the root master).
    But slave trees suffer from having a complex network topology that makes manually
    or automatically handling failover situations difficult.

    An alternative to building slave trees is to use compression across our network links
    to reduce the volume of data that needs to be transferred. Some users have found that
    using SSH to tunnel a connection with compression dropped bandwidth use significantly.
    One company went from using 21 megabits of network bandwidth for replicating
    to a single slave to about 1.8 megabits (http://mng.bz/2ivv). If you use this
    method, you’ll want to use a mechanism that automatically reconnects a disconnected
    SSH connection, of which there are several options to choose from.
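
    As a rough illustration of the tunneling approach, the sketch below builds an OpenSSH
    command line that forwards a local port to the master with compression enabled, and
    restarts ssh whenever it exits. The host name, user, port numbers, and function names
    are assumptions for the example; a dedicated reconnecting tool may be more robust
    than this loop.

```python
import subprocess
import time


def tunnel_command(master_host, local_port=6380, master_port=6379,
                   user='redis'):
    """Build an ssh command forwarding local_port to the master's Redis."""
    return [
        'ssh',
        '-N',                              # forwarding only, no remote command
        '-C',                              # compress data sent over the link
        '-o', 'ExitOnForwardFailure=yes',  # exit if the forward can't be set up
        '-o', 'ServerAliveInterval=30',    # notice dead connections
        '-L', '%d:localhost:%d' % (local_port, master_port),
        '%s@%s' % (user, master_host),
    ]


def keep_tunnel_open(command, retry_delay=5, max_runs=None,
                     run=subprocess.call, sleep=time.sleep):
    """Run ssh, and start it again each time it exits.

    max_runs=None loops forever; a number limits how often ssh is started.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        run(command)        # blocks until ssh exits
        runs += 1
        sleep(retry_delay)  # brief pause before reconnecting
    return runs


command = tunnel_command('master.example.com')  # hypothetical host
print(' '.join(command))
# keep_tunnel_open(command) would start the tunnel and keep it running.
```

    With the tunnel up, the slave on the local machine would be configured with
    slaveof 127.0.0.1 6380 rather than the master's real address.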

    Figure 10.1 An example Redis master/slave replica tree with nine lowest-level slaves and three intermediate replication helper servers

    ENCRYPTION AND COMPRESSION OVERHEAD Generally, encryption overhead for
    SSH tunnels shouldn’t be a huge burden on your server, since AES-128 can
    encrypt around 180 megabytes per second on a single core of a 2.6 GHz Intel
    Core 2 processor, and RC4 can encrypt about 350 megabytes per second on
    the same machine. Assuming that you have a gigabit network link, roughly
    one moderately powerful core can max out the connection with encryption.
    Compression is where you may run into issues, because SSH compression
    defaults to gzip. At compression level 1 (you can configure SSH to use a specific
    compression level; check the man pages for SSH), our trusty 2.6 GHz processor
    can compress roughly 24–52 megabytes per second of a few different
    types of Redis dumps (the initial sync), and 60–80 megabytes per second of a
    few different types of append-only files (streaming replication). Remember
    that, though higher compression levels may compress more, they’ll also use
    more processor time, which may be an issue for high-throughput, low-processor
    machines. Generally, I’d recommend using compression levels below 5 if possible,
    since 5 still provides a 10–20% reduction in total data size over level 1,
    for roughly 2–3 times as much processing time. Compression level 9 typically
    takes 5–10 times the time for level 1, for compression only 1–5% better than
    level 5 (I stick to level 1 for any reasonably fast network connection).
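
    The arithmetic behind these figures can be checked in a few lines. The sketch below
    divides the capacity of a gigabit link (125 megabytes per second) by each per-core
    throughput quoted above. It assumes the compression figures describe uncompressed
    input, which the note doesn't state explicitly, and cores_needed is a name made up
    for this example.

```python
GIGABIT_MB_PER_SEC = 1000 / 8.0  # 1 gigabit per second = 125 megabytes/second


def cores_needed(link_mb_per_sec, per_core_mb_per_sec):
    """Cores required to process data arriving at the full link rate."""
    return link_mb_per_sec / per_core_mb_per_sec


# Per-core throughput figures quoted in the note (megabytes per second).
throughput = [
    ('AES-128 encryption', 180),
    ('RC4 encryption', 350),
    ('gzip level 1, dumps (slow end)', 24),
    ('gzip level 1, dumps (fast end)', 52),
    ('gzip level 1, AOF (slow end)', 60),
    ('gzip level 1, AOF (fast end)', 80),
]

for label, rate in throughput:
    print('%-32s %.2f cores'
          % (label, cores_needed(GIGABIT_MB_PER_SEC, rate)))
```

    Encryption needs less than one core at link speed, while gzip at level 1 would need
    between roughly 1.5 and 5 cores, which is why compression is the more likely
    bottleneck.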

    USING COMPRESSION WITH OPENVPN At first glance, OpenVPN’s support for
    AES encryption and compression using lzo may seem like a great turnkey solution
    to offering transparent reconnections with compression and encryption
    (as opposed to using one of the third-party SSH reconnecting scripts). Unfortunately,
    most of the information that I’ve been able to find has suggested
    that performance improvements when enabling lzo compression in OpenVPN
    are limited to roughly 25–30% on 10 megabit connections, and effectively
    zero improvement on faster connections.

    One recent addition to the list of Redis tools that can be used to help with replication
    and failover is known as Redis Sentinel. Redis Sentinel is a mode of the Redis server
    binary where it doesn’t act as a typical Redis server. Instead, Sentinel watches the
    behavior and health of a number of masters and their slaves. By using PUBLISH/SUBSCRIBE
    against the masters combined with PING calls to slaves and masters, a collection
    of Sentinel processes independently discover information about available slaves
    and other Sentinels. Upon master failure, a single Sentinel will be chosen based on
    information that all Sentinels have and will choose a new master server from the existing
    slaves. After that slave is turned into a master, the Sentinel will move the slaves
    over to the new master (by default one at a time, but this can be configured to a
    higher number).

    Generally, the Redis Sentinel service is intended to offer automated failover from
    a master to one of its slaves. It offers options for notification of failover, calling
    user-provided scripts to update configurations, and more.
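
    To make the Sentinel settings more concrete, the sketch below generates a minimal
    Sentinel configuration for one master. The directive names are the ones I understand
    the Sentinel documentation to use, and they should be checked against your Redis
    version. The master name, address, quorum, and timeouts are illustrative values, and
    sentinel_config is a hypothetical helper. The parallel-syncs line is the setting that
    controls how many slaves are moved to the new master at a time.

```python
def sentinel_config(name, master_host, master_port, quorum,
                    down_after_ms=30000, failover_timeout_ms=180000,
                    parallel_syncs=1):
    """Return the text of a minimal Sentinel configuration for one master."""
    lines = [
        # Watch this master; `quorum` Sentinels must agree that it is down.
        'sentinel monitor %s %s %d %d'
        % (name, master_host, master_port, quorum),
        # How long the master must be unreachable before it is suspected.
        'sentinel down-after-milliseconds %s %d' % (name, down_after_ms),
        # Upper bound on how long a failover attempt may take.
        'sentinel failover-timeout %s %d' % (name, failover_timeout_ms),
        # How many slaves are repointed at the new master at once.
        'sentinel parallel-syncs %s %d' % (name, parallel_syncs),
    ]
    return '\n'.join(lines) + '\n'


# Hypothetical master name and address.
print(sentinel_config('mymaster', '10.0.0.1', 6379, quorum=2))
```

    Raising parallel_syncs shortens the failover, but more slaves are then resyncing,
    and unavailable for reads, at the same moment.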

    Now that we’ve made an attempt to scale our read capacity, it’s time to look at how
    we may be able to scale our write capacity as well.