Join us at RedisDays Atlanta

9.2 Sharded structures

  • Redis in Action – Home
  • Foreword
  • Preface
  • Part 1: Getting Started
  • Part 2: Core concepts
  • 1.3.1 Voting on articles
  • 1.3.2 Posting and fetching articles
  • 1.3.3 Grouping articles
  • 4.2.1 Configuring Redis for replication
  • 4.2.2 Redis replication startup process
  • 4.2.3 Master/slave chains
  • 4.2.4 Verifying disk writes
  • 5.1 Logging to Redis
  • 5.2 Counters and statistics
  • 5.3 IP-to-city and -country lookup
  • 5.4 Service discovery and configuration
  • 5.1.1 Recent logs
  • 5.1.2 Common logs
  • 5.2.2 Storing statistics in Redis
  • 5.3.1 Loading the location tables
  • 5.3.2 Looking up cities
  • 5.4.1 Using Redis to store configuration information
  • 5.4.2 One Redis server per application component
  • 5.4.3 Automatic Redis connection management
  • 8.1.1 User information
  • 8.1.2 Status messages
  • 9.1.1 The ziplist representation
  • 9.1.2 The intset encoding for SETs
  • Chapter 11: Scripting Redis with Lua
  • 11.1.1 Loading Lua scripts into Redis
  • 11.1.2 Creating a new status message
  • 11.2 Rewriting locks and semaphores with Lua
  • 11.3 Doing away with WATCH/MULTI/EXEC
  • 11.4 Sharding LISTs with Lua
  • 11.5 Summary
  • 11.2.1 Why locks in Lua?
  • 11.2.2 Rewriting our lock
  • 11.2.3 Counting semaphores in Lua
  • 11.4.1 Structuring a sharded LIST
  • 11.4.2 Pushing items onto the sharded LIST
  • 11.4.4 Performing blocking pops from the sharded LIST
  • A.1 Installation on Debian or Ubuntu Linux
  • A.2 Installing on OS X
  • B.1 Forums for help
  • B.4 Data visualization and recording
  • Buy the paperback
  • Redis in Action – Home
  • Foreword
  • Preface
  • Part 1: Getting Started
  • Part 2: Core concepts
  • 1.3.1 Voting on articles
  • 1.3.2 Posting and fetching articles
  • 1.3.3 Grouping articles
  • 4.2.1 Configuring Redis for replication
  • 4.2.2 Redis replication startup process
  • 4.2.3 Master/slave chains
  • 4.2.4 Verifying disk writes
  • 5.1 Logging to Redis
  • 5.2 Counters and statistics
  • 5.3 IP-to-city and -country lookup
  • 5.4 Service discovery and configuration
  • 5.1.1 Recent logs
  • 5.1.2 Common logs
  • 5.2.2 Storing statistics in Redis
  • 5.3.1 Loading the location tables
  • 5.3.2 Looking up cities
  • 5.4.1 Using Redis to store configuration information
  • 5.4.2 One Redis server per application component
  • 5.4.3 Automatic Redis connection management
  • 8.1.1 User information
  • 8.1.2 Status messages
  • 9.1.1 The ziplist representation
  • 9.1.2 The intset encoding for SETs
  • Chapter 11: Scripting Redis with Lua
  • 11.1.1 Loading Lua scripts into Redis
  • 11.1.2 Creating a new status message
  • 11.2 Rewriting locks and semaphores with Lua
  • 11.3 Doing away with WATCH/MULTI/EXEC
  • 11.4 Sharding LISTs with Lua
  • 11.5 Summary
  • 11.2.1 Why locks in Lua?
  • 11.2.2 Rewriting our lock
  • 11.2.3 Counting semaphores in Lua
  • 11.4.1 Structuring a sharded LIST
  • 11.4.2 Pushing items onto the sharded LIST
  • 11.4.4 Performing blocking pops from the sharded LIST
  • A.1 Installation on Debian or Ubuntu Linux
  • A.2 Installing on OS X
  • B.1 Forums for help
  • B.4 Data visualization and recording
  • Buy the paperback

    9.2 Sharded structures

    Sharding is a well-known technique that has been used to help many different databases scale to larger data storage and processing loads. Basically, sharding takes your data, partitions it into smaller pieces based on some simple rules, and then sends the data to different locations depending on which partition the data had been assigned to.

    In this section, we’ll talk about applying the concept of sharding to HASHes, SETs, and ZSETs to support a subset of their standard functionality, while still letting us use the small structures from section 9.1 to reduce memory use. Generally, instead of storing value X in key Y, we’ll store X in key Y:<shardid>.

    SHARDING LISTSSharding LISTs without the use of Lua scripting is difficult, which is why we omit it here. When we introduce scripting with Lua in chapter 11, we’ll build a sharded LIST implementation that supports blocking and nonblocking pushes and pops from both ends.

    SHARDING ZSETSUnlike sharded HASHes and SETs, where essentially all operations can be supported with a moderate amount of work (or even LISTs with Lua scripting), commands like ZRANGE, ZRANGEBYSCORE, ZRANK, ZCOUNT, ZREMRANGE, ZREMRANGEBYSCORE, and more require operating on all of the shards of a ZSET to calculate their final result. Because these operations on sharded ZSETs violate almost all of the expectations about how quickly a ZSET should perform with those operations, sharding a ZSET isn’t necessarily that useful, which is why we essentially omit it here.

    If you need to keep full information for a large ZSET, but you only really perform queries against the top- or bottom-scoring X, you can shard your ZSET in the same way we shard HASHes in section 9.2.1: keeping auxiliary top/bottom scoring ZSETs, which you can update with ZADD/ZREMRANGEBYRANK to keep limited (as we’ve done previously in chapters 2 and 4–8).

    You could also use sharded ZSETs as a way of reducing single-command latencies if you have large search indexes, though discovering the final highestand lowest-scoring items would take a potentially long series of ZUNIONSTORE/ZREMRANGEBYRANK pairs.

    When sharding structures, we can make a decision to either support all of the functionality of a single structure or only a subset of the standard functionality. For the sake of simplicity, when we shard structures in this book, we’ll only implement a subset of the functionality offered by the standard structures, because to implement the full functionality can be overwhelming (from both computational and code-volume perspectives). Even though we only implement a subset of the functionality, we’ll use these sharded structures to offer memory reductions to existing problems, or to solve new problems more efficiently than would otherwise be possible.

    The first structure we’ll talk about sharding is the HASH.