2.3 Web page caching

  • Redis in Action – Home
  • Foreword
  • Preface
  • Part 1: Getting Started
  • Part 2: Core concepts
  • 1.3.1 Voting on articles
  • 1.3.2 Posting and fetching articles
  • 1.3.3 Grouping articles
  • 4.2.1 Configuring Redis for replication
  • 4.2.2 Redis replication startup process
  • 4.2.3 Master/slave chains
  • 4.2.4 Verifying disk writes
  • 5.1 Logging to Redis
  • 5.2 Counters and statistics
  • 5.3 IP-to-city and -country lookup
  • 5.4 Service discovery and configuration
  • 5.1.1 Recent logs
  • 5.1.2 Common logs
  • 5.2.2 Storing statistics in Redis
  • 5.3.1 Loading the location tables
  • 5.3.2 Looking up cities
  • 5.4.1 Using Redis to store configuration information
  • 5.4.2 One Redis server per application component
  • 5.4.3 Automatic Redis connection management
  • 8.1.1 User information
  • 8.1.2 Status messages
  • 9.1.1 The ziplist representation
  • 9.1.2 The intset encoding for SETs
  • Chapter 10: Scaling Redis
  • Chapter 11: Scripting Redis with Lua
  • 10.1 Scaling reads
  • 10.2 Scaling writes and memory capacity
  • 10.3 Scaling complex queries
  • 10.2.2 Creating a server-sharded connection decorator
  • 10.3.1 Scaling search query volume
  • 10.3.2 Scaling search index size
  • 10.3.3 Scaling a social network
  • 11.1.1 Loading Lua scripts into Redis
  • 11.1.2 Creating a new status message
  • 11.2 Rewriting locks and semaphores with Lua
  • 11.3 Doing away with WATCH/MULTI/EXEC
  • 11.4 Sharding LISTs with Lua
  • 11.5 Summary
  • 11.2.1 Why locks in Lua?
  • 11.2.2 Rewriting our lock
  • 11.2.3 Counting semaphores in Lua
  • 11.4.1 Structuring a sharded LIST
  • 11.4.2 Pushing items onto the sharded LIST
  • 11.4.4 Performing blocking pops from the sharded LIST
  • A.1 Installation on Debian or Ubuntu Linux
  • A.2 Installing on OS X
  • B.1 Forums for help
  • B.4 Data visualization and recording
  • Buy the paperback
  • Redis in Action – Home
  • Foreword
  • Preface
  • Part 1: Getting Started
  • Part 2: Core concepts
  • 1.3.1 Voting on articles
  • 1.3.2 Posting and fetching articles
  • 1.3.3 Grouping articles
  • 4.2.1 Configuring Redis for replication
  • 4.2.2 Redis replication startup process
  • 4.2.3 Master/slave chains
  • 4.2.4 Verifying disk writes
  • 5.1 Logging to Redis
  • 5.2 Counters and statistics
  • 5.3 IP-to-city and -country lookup
  • 5.4 Service discovery and configuration
  • 5.1.1 Recent logs
  • 5.1.2 Common logs
  • 5.2.2 Storing statistics in Redis
  • 5.3.1 Loading the location tables
  • 5.3.2 Looking up cities
  • 5.4.1 Using Redis to store configuration information
  • 5.4.2 One Redis server per application component
  • 5.4.3 Automatic Redis connection management
  • 8.1.1 User information
  • 8.1.2 Status messages
  • 9.1.1 The ziplist representation
  • 9.1.2 The intset encoding for SETs
  • Chapter 10: Scaling Redis
  • Chapter 11: Scripting Redis with Lua
  • 10.1 Scaling reads
  • 10.2 Scaling writes and memory capacity
  • 10.3 Scaling complex queries
  • 10.2.2 Creating a server-sharded connection decorator
  • 10.3.1 Scaling search query volume
  • 10.3.2 Scaling search index size
  • 10.3.3 Scaling a social network
  • 11.1.1 Loading Lua scripts into Redis
  • 11.1.2 Creating a new status message
  • 11.2 Rewriting locks and semaphores with Lua
  • 11.3 Doing away with WATCH/MULTI/EXEC
  • 11.4 Sharding LISTs with Lua
  • 11.5 Summary
  • 11.2.1 Why locks in Lua?
  • 11.2.2 Rewriting our lock
  • 11.2.3 Counting semaphores in Lua
  • 11.4.1 Structuring a sharded LIST
  • 11.4.2 Pushing items onto the sharded LIST
  • 11.4.4 Performing blocking pops from the sharded LIST
  • A.1 Installation on Debian or Ubuntu Linux
  • A.2 Installing on OS X
  • B.1 Forums for help
  • B.4 Data visualization and recording
  • Buy the paperback

    2.3 Web page caching

    When producing web pages dynamically, it’s common to use a templating language to simplify page generation. Gone are the days when each page would be written by hand. Modern web pages are generated from page templates with headers, footers, side menus, toolbars, content areas, and maybe even generated JavaScript.

    Despite being capable of dynamically generating content, the majority of pages that are served on Fake Web Retailer’s website don’t change much on a regular basis. Sure, some new items are added to the catalog, old items are removed, sometimes there are specials, and sometimes there are even “hot items” pages. But really, only a handful of account settings, past orders, shopping cart/checkout, and similar pages have content that needs to be generated on every page load.

    By looking at their view numbers, Fake Web Retailer has determined that 95% of the web pages that they serve change at most once per day, and don’t actually require content to be dynamically generated. It’s our job to stop generating 95% of pages for every load. By reducing the amount of time we spend generating static content, we can reduce the number of servers necessary to handle the same load, and we can serve our site faster. (Research has shown that reducing the time users spend waiting for pages to load increases their desire to use a site and improves how they rate the site.)

    All of the standard Python application frameworks offer the ability to add layers that can pre- or post-process requests as they’re handled. These layers are typically called middleware or plugins. Let’s create one of these layers that calls out to our Redis caching function. If a web request can’t be cached, we’ll generate the page and return the content. If a request can be cached, we’ll try to fetch and return the page from the cache; otherwise we’ll generate the page, cache the result in Redis for up to 5 minutes, and return the content. Our simple caching method can be seen in the next listing.

    Listing 2.6 The cache_request() function
    def cache_request(conn, request, callback):
    
    	if not can_cache(conn, request):
    		return callback(request)
    

    If we can’t cache the request, immediately call the callback.

    
    
    	page_key = 'cache:' + hash_request(request)
    

    Convert the request into a simple string key for later lookups.

    	content = conn.get(page_key)
    

    Fetch the cached content if we can, and it’s available.

    	if not content:
    
    		content = callback(request)
    

    Generate the content if we can’t cache the page, or if it wasn’t cached.

    		conn.setex(page_key, content, 300)
    

    Cache the newly generated content if we can cache it.

    
    
    	return content
    

    Return the content.

    For that 95% of content that could be cached and is loaded often, this bit of code removes the need to dynamically generate viewed pages for 5 minutes. Depending on the complexity of content, this one change could reduce the latency for a data-heavy page from maybe 20–50ms, down to one round trip to Redis (under 1ms for a local connection, under 5ms for computers close to each other in the same data center). For pages that used to hit the database for data, this can further reduce page load time and database load.

    Now that we’ve cut loading time for pages that don’t change often, can we keep using Redis to cut loading time for pages that do change often? Of course we can! Keep reading to find out how.