Join us at RedisDays Atlanta

Redis Best Practices

Lexicographical Encoding

Sorted Sets (ZSETs), beyond ranking by score, have an interesting property that can be leveraged to create a roughly alphabetic sorting mechanism. This property is that members of the same score can be returned in lexicographic order with bounds. Let’s take the following data:

> ZADD animal-list 0 bison 0 boa 0 dog 0 emu 0 falcon 0 alligator 0 chipmunk

The command above adds some animals to a key animal-list. Each member always the same score (0). Running the ZRANGE command with arguments 0 (representing the first item)  and -1 (which represents the “last” member by score), we’ll see a curious order:

> ZRANGE animal-list 0 -1
1) "alligator"
2) "bison"
3) "boa"
4) "chipmunk"
5) "dog"
6) "emu"
7) "falcon"

Despite the items being entered in a non-alphabetical order, the items are returned properly sorted alphabetically. This sort order is actually a result of string comparisons being done on a binary-safe, byte-by-byte basis. This means that ASCII characters will be returned in proper English alphabetical sorting order. This has a few of implications:

  • Lower and upper cases will not be understood as the same letter
  • Multi-byte characters will not be sorted in expected ways

Redis also provides some advanced features to further narrow lexicographic searches. As an example, let’s say we want to return the animals starting with ‘b’ and ending with ‘e’. We can use the this command:

> ZRANGEBYLEX animal-list [b (f
1) "bison"
2) "boa"
3) "chipmunk"
4) "dog"
5) "emu"

The argument (f might be slightly surprising. This is an important point—because Redis has no notion of actually understanding alphabetic letters, this means that we have to consider that anything starting with an e will always be sorted before anything starting with an f, regardless of subsequent letters. The other items of note are that a square bracket represents inclusive search while the paren represents exclusive search. In our case, if we had a member of b, it would be included in the list, while f would not given this the above command. If you wanted to get all items past a certain point you would use the hex encoding of the last character (255 or 0xFF):

> ZRANGEBYLEX animal-list [c "[xff"
1) "chipmunk"
2) "dog"
3) "emu"
4) "falcon"

This same command can also be limited to provide a paging effect. Take for example the following commands and responses:

> ZRANGEBYLEX animal-list [b (f LIMIT 0 2
1) "bison"
2) "boa"
> ZRANGEBYLEX animal-list [b (f LIMIT 2 2
1) "chipmunk"
2) "dog"

The only caveat is that time complexity will continue to grow based on the offset (the first argument after LIMIT). So, if you have 1 million members and you’re trying to get the last two, as example, this requires traversal of of the all 1 million items.