10.2.2 Creating a server-sharded connection decorator


    Now that we have a method to easily fetch a sharded connection, let’s use it to build a
    decorator to automatically pass a sharded connection to underlying functions.

    We’ll perform the same three-level function decoration we used in chapter 5,
    which will let us use the same kind of “component” passing we used there. In addition
    to component information, we’ll also pass the number of Redis servers we’re going to
    shard to. The following listing shows the details of our shard-aware connection decorator.

    Listing 10.3 A shard-aware connection decorator

    def sharded_connection(component, shard_count, wait=1):
        # Our decorator takes a component name, as well as the number of shards desired.
        def wrapper(function):
            # Create the wrapper that will actually decorate the function.
            @functools.wraps(function)
            # Copy some useful metadata from the original function to the wrapper.
            def call(key, *args, **kwargs):
                # The first argument is the key that will be used to calculate a shard ID.
                conn = get_sharded_connection(
                    component, key, shard_count, wait)
                # Fetch the sharded connection.
                return function(conn, key, *args, **kwargs)
                # Actually call the function, passing the connection and existing arguments.
            # Return the fully wrapped function.
            return call
        # Return a function that can wrap functions that need a sharded connection.
        return wrapper
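    To see the decorator run end to end in isolation, here is a self-contained sketch. The get_sharded_connection() stand-in below is a fake that merely reports which shard a key maps to; the real version (built earlier in this chapter) returns an actual Redis connection, and the crc32-based shard choice here is an illustrative assumption, not necessarily the book's exact hash.

```python
import binascii
import functools

def get_sharded_connection(component, key, shard_count, wait=1):
    # Stand-in for the real connection fetcher: instead of returning a
    # Redis connection, return the name of the shard this key maps to.
    shard = binascii.crc32(key.encode('utf-8')) % shard_count
    return 'config:redis:%s:%s' % (component, shard)

def sharded_connection(component, shard_count, wait=1):
    # Same three-level decorator as Listing 10.3.
    def wrapper(function):
        @functools.wraps(function)
        def call(key, *args, **kwargs):
            conn = get_sharded_connection(component, key, shard_count, wait)
            return function(conn, key, *args, **kwargs)
        return call
    return wrapper

@sharded_connection('unique', 16)
def show_shard(conn, key):
    # The decorated function receives the per-shard connection first.
    return conn

# The same key always routes to the same shard, and functools.wraps
# preserved the decorated function's metadata.
assert show_shard('unique:2023-01-01') == show_shard('unique:2023-01-01')
assert show_shard.__name__ == 'show_shard'
```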

    Because of the way we constructed our connection decorator, we can decorate our
    count_visit() function from chapter 9 almost completely unchanged. We need to be
    careful because we’re keeping aggregate count information, which is fetched and/or
    updated by our get_expected() function. Because the information stored will be
    used and reused on different days for different users, we need to use a nonsharded
    connection for it. The updated and decorated count_visit() function as well as the
    decorated and slightly updated get_expected() function are shown next.

    Listing 10.4 A machine- and key-sharded count_visit() function

    # Shard this across 16 different machines; shard_sadd() will
    # automatically shard to multiple keys on each machine.
    @sharded_connection('unique', 16)
    def count_visit(conn, session_id):
        today = date.today()
        key = 'unique:%s'%today.isoformat()
        # Our changed call to get_expected().
        conn2, expected = get_expected(key, today)
        id = int(session_id.replace('-', '')[:15], 16)
        if shard_sadd(conn, key, id, expected, SHARD_SIZE):
            # Use the returned nonsharded connection to increment our unique counts.
            conn2.incr(key)

    # Use a nonsharded connection for get_expected().
    @redis_connection('unique')
    def get_expected(conn, key, today):
        'all of the same function body as before, except the last line'
        # Also return the nonsharded connection so that count_visit()
        # can increment our unique count as necessary.
        return conn, EXPECTED[key]

    In our example, we're sharding the unique visit SETs out to 16 different machines,
    whose configurations are stored as JSON-encoded strings at the keys
    config:redis:unique:0 through config:redis:unique:15. Our daily count information
    is stored on a single nonsharded Redis server, whose configuration information
    is stored at the key config:redis:unique.
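    Laying that configuration out concretely may help; the hosts and ports below are purely hypothetical placeholders, and the JSON shape is an illustrative assumption rather than the book's exact schema:

```python
import json

shard_count = 16
configs = {}
# One configuration string per shard of the unique visit SETs.
for i in range(shard_count):
    configs['config:redis:unique:%s' % i] = json.dumps(
        {'host': '10.0.0.%s' % (i + 1), 'port': 6379, 'db': 0})
# A single nonsharded server holds the daily counts.
configs['config:redis:unique'] = json.dumps(
    {'host': '10.0.1.1', 'port': 6379, 'db': 0})

# 16 sharded configurations plus 1 nonsharded configuration.
assert len(configs) == 17
assert 'config:redis:unique:15' in configs
```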

    MULTIPLE REDIS SERVERS ON A SINGLE MACHINE This section discusses sharding
    writes to multiple machines in order to increase total memory available
    and total write capacity. But if you're feeling limited by Redis's single-threaded
    processing limit (maybe because you're performing expensive
    searches, sorts, or other queries), and you have more cores available for processing,
    more network available for communication, and more available disk
    I/O for snapshots/AOF, you can run multiple Redis servers on a single
    machine. You only need to configure them to listen on different ports and
    ensure that they have different snapshot/AOF configurations.
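    As a sketch of what that looks like in practice, two instances on one box only need distinct ports and distinct persistence filenames; the file names and paths below are illustrative, not prescribed:

```conf
# redis-6380.conf -- first instance
port 6380
dir /var/lib/redis
dbfilename dump-6380.rdb
appendfilename "appendonly-6380.aof"

# redis-6381.conf -- second instance
port 6381
dir /var/lib/redis
dbfilename dump-6381.rdb
appendfilename "appendonly-6381.aof"
```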

    ALTERNATE METHODS OF HANDLING UNIQUE VISIT COUNTS OVER TIME With the
    use of SETBIT, BITCOUNT, and BITOP, you can actually scale unique visitor
    counts without sharding by using an indexed lookup of bits, similar to what
    we did with locations in chapter 9. A library that implements this in Python
    can be found at https://github.com/Doist/bitmapist.
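    The idea is that each user is assigned a small sequential index, a day's visitors become a bitmap with those users' bits set, and BITCOUNT yields the day's unique count (BITOP can then combine days). A pure-Python sketch of that bookkeeping, simulating SETBIT/BITCOUNT semantics on a bytearray rather than issuing real Redis commands:

```python
user_index = {}   # maps a user id to a stable, small bit offset

def bit_offset(user_id):
    # Indexed lookup: assign each user the next sequential index.
    if user_id not in user_index:
        user_index[user_id] = len(user_index)
    return user_index[user_id]

def setbit(bitmap, offset):
    # Rough equivalent of SETBIT key offset 1, on a local bytearray.
    byte, bit = divmod(offset, 8)
    while len(bitmap) <= byte:
        bitmap.append(0)
    bitmap[byte] |= 1 << (7 - bit)   # Redis numbers bits high-to-low

def bitcount(bitmap):
    # Rough equivalent of BITCOUNT: total number of set bits.
    return sum(bin(b).count('1') for b in bitmap)

today = bytearray()
for visitor in ['u1', 'u2', 'u1', 'u3']:
    setbit(today, bit_offset(visitor))

assert bitcount(today) == 3   # three unique visitors despite u1 repeating
```

    Because each visitor costs a single bit per day, a year of daily bitmaps for a million users stays small, which is why this approach can avoid sharding entirely.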

    Now that we have functions to get regular and sharded connections, as well as decorators
    to automatically pass regular and sharded connections, using Redis connections
    of multiple types is significantly easier. Unfortunately, not all operations that we need
    to perform on sharded datasets are as easy as a unique visitor count. In the next section,
    we’ll talk about scaling search in two different ways, as well as how to scale our
    social network example.