RedisDays Available Now On-Demand.
After we have our packed location codes, we only need to store them in STRINGs with SETRANGE. But before we do so, we have to think for a moment about how many users we’re going to be storing information about. For example, suppose that Twitter has 750 million users today (based on the observation that recently created users have IDs greater than 750 million); we’d need over 1.5 gigabytes of space to store location information for all Twitter users. Though most operating systems are able to reasonably allocate large regions of memory, Redis limits us to 512 megabyte STRINGs, and due to Redis’s clearing out of data when setting a value beyond the end of an existing STRING, setting the first value at the end of a long STRING will take more time than would be expected for a simple SETBIT call. Instead, we can use a technique similar to what we used in section 9.2.1, and shard our data across a collection of STRINGs.
Unlike when we were sharding HASHes and SETs, we don’t have to worry about being efficient by keeping our shards smaller than a few thousand elements, since we can access an element directly without decoding any others. Similarly, we can write to a given offset efficiently. Instead, our concern is more along the lines of being efficient at a larger scale—specifically what will balance potential memory fragmentation, as well as minimize the number of keys that are necessary. For this example, we’ll store location information for 220 users (just over 1 million entries) per STRING, which will use about 2 megabytes per STRING. In the next listing, we see the code for updating location information for a user.
For the most part, there shouldn’t be anything surprising there. We calculate the location code to store for a user, calculate the shard and the individual shard offset for the user, and then store the location code in the proper location for the user. The only thing that’s strange and may not seem necessary is that we also update a ZSET that stores the highest-numbered user ID that has been seen. This is primarily important when calculating aggregates over everyone we have information about (so we know when to stop).
By continuing to use this site, you consent to our updated privacy agreement. You can change your cookie settings at any time but parts of our site will not function correctly without them.