10.2 Scaling writes and memory capacity

    Back in chapter 2, we built a system that could automatically cache rendered web
    pages inside Redis. Fortunately for us, it helped reduce page load time and web page
    processing overhead. Unfortunately, we’ve come to a point where we’ve scaled our
    cache up to the largest single machine we can afford, and must now split our data
    among a group of smaller machines.

    SCALING WRITE VOLUME Though we discuss sharding in the context of
    increasing our total available memory, these methods also work to increase
    write throughput if we’ve reached the limit of performance that a single
    machine can sustain.

    In this section, we’ll discuss a method to scale memory and write throughput with
    sharding, using techniques similar to those we used in chapter 9.

    Before concluding that we really need to scale our write capacity, we
    should first make sure we’re doing what we can to reduce our memory use and
    the volume of data we’re writing:

    • Make sure that we’ve checked all of our methods to reduce read data volume
      first.
    • Make sure that we’ve moved larger pieces of unrelated functionality to
      different servers (if we’re already using our connection decorators from
      chapter 5, this should be easy).
    • If possible, try to aggregate writes in local memory before writing to
      Redis, as we discussed in chapter 6 (this applies to almost all analytics
      and statistics calculation methods); a short sketch follows this list.
    • If we’re running into limitations with WATCH/MULTI/EXEC, consider using locks
      as we discussed in chapter 6 (or consider using Lua, as we’ll talk about in chapter
      11).
    • If we’re using AOF persistence, remember that our disk needs to keep up with
      the volume of data we’re writing (400,000 small commands may only be a few
      megabytes per second, but 100,000 x 1 KB writes is 100 megabytes per second).
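
    As a concrete illustration of the third item, here’s a minimal sketch of
    aggregating counter updates in local memory and flushing them to Redis in a
    single pipelined batch. The incr_counter() and flush_counters() helpers and
    the one-second flush interval are illustrative choices, not code from this
    book:

    import time
    from collections import defaultdict

    local_counts = defaultdict(int)   # in-memory write buffer
    last_flush = time.time()

    def incr_counter(name, amount=1):
        # Accumulate locally instead of sending one INCRBY per event.
        local_counts[name] += amount

    def flush_counters(conn, interval=1):
        # conn is a redis.Redis connection. Send all buffered counts in one
        # pipelined round trip, turning many small writes into a single
        # batch per interval.
        global last_flush
        if time.time() - last_flush < interval or not local_counts:
            return
        pipe = conn.pipeline(False)
        for name, amount in local_counts.items():
            pipe.incrby('count:' + name, amount)
        pipe.execute()
        local_counts.clear()
        last_flush = time.time()

    Calling incr_counter() on every event and flush_counters() from a periodic
    loop cuts the number of Redis writes to one batch per interval, at the cost
    of counts being up to a second stale.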

    Now that we’ve done everything we can to reduce memory use, maximize
    performance, and understand the limitations of what a single machine can
    do, it’s time to actually shard our data across multiple machines. The
    methods that we use to shard our data rely on the number of Redis servers
    being more or less fixed. If we estimate that our write volume will, for
    example, increase 4 times every 6 months, we can preshard our data into 256
    shards: 4x growth compounded over four 6-month periods is 4^4 = 256, so
    that plan should be sufficient for the next 2 years of expected growth
    (how far ahead to plan is up to you).
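
    Because the shard count is fixed, mapping a key to its shard can be as
    simple as hashing the key and taking a modulo, in the spirit of the
    shard_key() technique from chapter 9. This is a sketch with illustrative
    names, not the book’s implementation:

    import binascii

    TOTAL_SHARDS = 256  # chosen at preshard time; must never change afterward

    def shard_id(key):
        # The same key always maps to the same shard, because both the
        # hash function and TOTAL_SHARDS are fixed.
        return binascii.crc32(key.encode('utf-8')) % TOTAL_SHARDS

    Growing later means moving whole shards to larger machines (via replication
    and a configuration update), never rehashing keys across a new shard count.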

    PRESHARDING FOR GROWTH When presharding your system in order to prepare
    for growth, you may be in a situation where you have too little data to
    make it worth running as many machines as you could need later. To still
    be able to separate your data, you can run multiple Redis servers on a
    single machine (one for each of your shards), or you can use multiple
    Redis databases inside a single Redis server. From this starting point,
    you can move to multiple machines through the use of replication and
    configuration management (see section 10.2.1). If you’re running multiple
    Redis servers on a single machine, remember to have them listen on
    different ports, and make sure that all servers write to different
    snapshot files and/or append-only files.
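
    For example, two Redis servers sharing one machine might be started from
    configuration files along these lines (the paths and port numbers are
    arbitrary; port, dir, dbfilename, appendonly, and appendfilename are real
    Redis configuration directives):

    # shard-0.conf: first server on this machine
    port 6380
    dir /var/lib/redis/shard-0
    dbfilename dump.rdb
    appendonly yes
    appendfilename "appendonly.aof"

    # shard-1.conf: second server, different port and directory
    port 6381
    dir /var/lib/redis/shard-1
    dbfilename dump.rdb
    appendonly yes
    appendfilename "appendonly.aof"

    Giving each server its own dir keeps their snapshot and append-only files
    from colliding even though the file names themselves match.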

    The first thing we need to address is how we’ll define our shard
    configuration.