5.3.1 Loading the location tables


    For development data, I’ve downloaded a free IP-to-city database available from http://dev.maxmind.com/geoip/geolite. This database contains two important files: GeoLiteCity-Blocks.csv, which contains information about ranges of IP addresses and city IDs for those ranges, and GeoLiteCity-Location.csv, which contains a mapping of city IDs to the city name, the name of the region/state/province, the name of the country, and some other information that we won’t use.

    We’ll first construct the lookup table that allows us to take an IP address and convert it to a city ID. We’ll then construct a second lookup table that allows us to take the city ID and convert it to actual city information (city information will also include region and country information).

    The table that allows us to find an IP address and turn it into a city ID will be constructed from a single ZSET, which has a special city ID as the member, and an integer value of the IP address as the score. To allow us to map from IP address to city ID, we convert dotted-quad format IP addresses to an integer score by taking each octet as a byte in an unsigned 32-bit integer, with the first octet being the highest bits. The code that performs this conversion is shown in the next listing.

    Listing 5.9 The ip_to_score() function
    def ip_to_score(ip_address):
       score = 0
       for v in ip_address.split('.'):
          score = score * 256 + int(v, 10)
       return score
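
As a quick check of the conversion described above, the following sketch repeats the logic of Listing 5.9 and works one address through by hand. The score_to_ip() helper is a hypothetical inverse added only for illustration, and the comparison against the standard library's ipaddress module is an independent cross-check rather than part of the original code.

```python
import ipaddress

def ip_to_score(ip_address):
    # Same logic as Listing 5.9: fold each octet into one integer,
    # with the first octet ending up in the highest bits.
    score = 0
    for v in ip_address.split('.'):
        score = score * 256 + int(v, 10)
    return score

def score_to_ip(score):
    # Hypothetical inverse (not in the original text): peel the four
    # bytes back off, most significant first.
    return '.'.join(str((score >> shift) & 0xFF) for shift in (24, 16, 8, 0))

example = ip_to_score('1.2.3.4')  # 1*256**3 + 2*256**2 + 3*256 + 4
print(example)                    # 16909060
print(score_to_ip(example))       # 1.2.3.4
# The standard library agrees with the hand-rolled conversion.
print(example == int(ipaddress.IPv4Address('1.2.3.4')))  # True
```

Because the first octet occupies the highest bits, numeric order of the scores matches the order of the addresses, which is what makes a ZSET sorted by score usable as a range table.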
    

     

     

     

    After we have the score, we’ll add the IP address mapping to city IDs first. To construct a unique city ID from each normal city ID (because multiple IP address ranges can map to the same city ID), we’ll append a _ character followed by the number of entries we’ve added to the ZSET already, as can be seen in the next listing.

    Listing 5.10 The import_ips_to_redis() function
    import csv

    def import_ips_to_redis(conn, filename):
       # Should be run with the location of the GeoLiteCity-Blocks.csv file.
       csv_file = csv.reader(open(filename, 'rb'))
       for count, row in enumerate(csv_file):
          # Convert the IP address to a score as necessary.
          start_ip = row[0] if row else ''
          # Skip rows whose first field contains the letter "i".
          if 'i' in start_ip.lower():
             continue
          if '.' in start_ip:
             start_ip = ip_to_score(start_ip)
          elif start_ip.isdigit():
             start_ip = int(start_ip, 10)
          else:
             # Header row or malformed entry.
             continue

          # Construct the unique city ID.
          city_id = row[2] + '_' + str(count)
          # Add the IP address score and city ID.
          conn.zadd('ip2cityid:', city_id, start_ip)
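
The listing above is written in a Python 2 style (the CSV file is opened in binary mode) and passes the member and score to zadd() as separate arguments. As a rough sketch for readers on Python 3 with redis-py 3.0 or later, where zadd() takes a mapping of member to score, the same loop might look like the following. The function name import_ips_to_redis_py3 and the latin-1 file encoding are assumptions made for this example, not part of the original text.

```python
import csv

def ip_to_score(ip_address):
    # Repeated from Listing 5.9 so this sketch is self-contained.
    score = 0
    for v in ip_address.split('.'):
        score = score * 256 + int(v, 10)
    return score

def import_ips_to_redis_py3(conn, filename):
    # Hypothetical Python 3 variant of Listing 5.10.
    # Text mode with newline='' is what the csv module expects on Python 3;
    # the latin-1 encoding is an assumption about the data file.
    with open(filename, newline='', encoding='latin-1') as f:
        for count, row in enumerate(csv.reader(f)):
            start_ip = row[0] if row else ''
            if 'i' in start_ip.lower():
                continue
            if '.' in start_ip:
                start_ip = ip_to_score(start_ip)
            elif start_ip.isdigit():
                start_ip = int(start_ip, 10)
            else:
                # Header row or malformed entry.
                continue
            # The row counter keeps members unique even when several
            # ranges share one city ID.
            city_id = row[2] + '_' + str(count)
            # redis-py 3.0+ form: a mapping of member -> score.
            conn.zadd('ip2cityid:', {city_id: start_ip})
```

The suffix matters because ZSET members are unique: adding the same bare city ID a second time would only update that member's score, so every range but the last one loaded for a city would be lost.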

     

     

     

    When our IP addresses have all been loaded by calling import_ips_to_redis(), we’ll create a HASH that maps city IDs to city information, as shown in the next listing. We’ll store the city information as a list encoded with JSON, because all of our entries are of a fixed format that won’t be changing over time.

    Listing 5.11 The import_cities_to_redis() function
    import csv
    import json

    def import_cities_to_redis(conn, filename):
       # Should be run with the location of the GeoLiteCity-Location.csv file.
       for row in csv.reader(open(filename, 'rb')):
          # Skip short rows and rows that don't start with a numeric city ID.
          if len(row) < 4 or not row[0].isdigit():
             continue
          row = [i.decode('latin-1') for i in row]
          # Prepare the information for adding to the HASH.
          city_id = row[0]
          country = row[1]
          region = row[2]
          city = row[3]
          # Actually add the city information to Redis.
          conn.hset('cityid2city:', city_id,
             json.dumps([city, region, country]))
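
To make the stored format concrete, here is a small sketch of what one HASH value looks like and how it can be decoded again. It runs without a Redis connection. The helper names encode_city() and decode_city() and the sample row are made up for illustration; only the latin-1 decoding and the [city, region, country] ordering come from the listing above.

```python
import json

def encode_city(raw_row):
    # Hypothetical helper: mirror the listing by decoding latin-1 bytes,
    # then storing [city, region, country] as a JSON list.
    row = [field.decode('latin-1') for field in raw_row]
    city_id, country, region, city = row[:4]
    return city_id, json.dumps([city, region, country])

def decode_city(value):
    # Hypothetical helper: turn a stored JSON string back into its parts.
    city, region, country = json.loads(value)
    return city, region, country

# Made-up sample row in the column order the listing reads:
# city ID, country, region, city.
sample = [b'1234', b'CA', b'QC', b'Montr\xe9al']
city_id, value = encode_city(sample)
print(city_id)  # 1234
print(value)    # ["Montr\u00e9al", "QC", "CA"]
print(decode_city(value) == ('Montr\u00e9al', 'QC', 'CA'))  # True
```

A fixed-position list keeps each value compact. The trade-off is that readers of the HASH have to know the field order, which is acceptable here because the format is not expected to change.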

     

     

     

    Now that we have all of our information in Redis, we can start looking up IP addresses.