
7.4.2 Approaching the problem like search



    In section 7.3.3, we used SETs and ZSETs as holders for additive bonuses for optional targeting parameters. If we’re careful, we can do the same thing for groups of required targeting parameters.

    Rather than talk about jobs with skills, we need to flip the problem around like we did with the other search problems described in this chapter. We start with one SET per skill, which stores all of the jobs that require that skill. In a required skills ZSET, we store the total number of skills that a job requires. The code that sets up our index looks like the next listing.

    Listing 7.18 A function for indexing jobs based on the required skills
    def index_job(conn, job_id, skills):
        pipeline = conn.pipeline(True)
        for skill in skills:
            # Add the job ID to all appropriate skill SETs.
            pipeline.sadd('idx:skill:' + skill, job_id)
        # Add the total required skill count to the required skills ZSET.
        # redis-py 3.0 and later take a {member: score} mapping here.
        pipeline.zadd('idx:jobs:req', {job_id: len(set(skills))})
        pipeline.execute()

    This indexing function should remind you of the text indexing function we used in section 7.1. The only major difference is that we’re providing index_job() with pretokenized skills, and we’re adding a member to a ZSET that keeps a record of the number of skills that each job requires.
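To make the shape of the index concrete, the following is a minimal sketch that models it with plain Python dictionaries instead of Redis. The names `skill_sets`, `required_counts`, and `index_job_in_memory` are hypothetical stand-ins introduced only for illustration; they are not part of the book's code.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the Redis keys used by index_job().
skill_sets = defaultdict(set)   # models the 'idx:skill:<skill>' SETs
required_counts = {}            # models the 'idx:jobs:req' ZSET


def index_job_in_memory(job_id, skills):
    """Mirror index_job(): one set per skill plus a required-skill count."""
    unique_skills = set(skills)
    for skill in unique_skills:
        skill_sets[skill].add(job_id)
    required_counts[job_id] = len(unique_skills)


index_job_in_memory('job:1', ['python', 'redis'])
index_job_in_memory('job:2', ['python', 'redis', 'sql'])
index_job_in_memory('job:3', ['java', 'java'])  # duplicates count once

print(dict(skill_sets))
print(required_counts)
```

Note that a repeated skill is counted once, which matches the `len(set(skills))` call in the listing: adding the same job ID to a SET twice has no further effect, so the count has to be taken over unique skills to stay consistent.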

    To search for jobs for which a candidate has all of the required skills, we need to approach the search like we did with the bonuses to ad targeting in section 7.3.3. More specifically, we’ll perform a ZUNIONSTORE operation over the skill SETs to calculate a total score for each job. This score represents how many of each job’s required skills the candidate has.

    Because we have a ZSET with the total number of skills required, we can then perform a ZINTERSTORE operation between the candidate’s job-score ZSET and the required skills ZSET with weights -1 and 1, respectively. Each resulting score is the number of skills the job requires minus the number the candidate has, so any job ID with a score equal to 0 in that final result ZSET is a job that the candidate has all of the required skills for. The code for implementing the search operation is shown in the following listing.

    Listing 7.19 Find all jobs that a candidate is qualified for
    def find_jobs(conn, candidate_skills):
        # Set up the dictionary for scoring the jobs.
        skills = {}
        for skill in set(candidate_skills):
            skills['skill:' + skill] = 1

        # Calculate the scores for each of the jobs.
        job_scores = zunion(conn, skills)
        # Calculate how many more skills the job requires than the candidate has.
        final_result = zintersect(
            conn, {job_scores:-1, 'jobs:req':1})

        # Return the jobs that the candidate has the skills for.
        return conn.zrangebyscore('idx:' + final_result, 0, 0)

    Again, we first find the scores for each job. After we have the scores for each job, we subtract each job score from the total score necessary to match. In that final result, any job with a ZSET score of 0 is a job that the candidate has all of the skills for.
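The score arithmetic can be traced by hand on a small example. The sketch below reproduces the three steps (union, weighted intersection, score filter) on plain Python dictionaries rather than in Redis. The function name `find_jobs_in_memory` and the sample data are assumptions made for illustration only.

```python
def find_jobs_in_memory(skill_sets, required_counts, candidate_skills):
    """Model of find_jobs() on plain dicts (these are not Redis calls)."""
    # ZUNIONSTORE step: each skill SET contributes 1 per job it contains.
    matched = {}
    for skill in set(candidate_skills):
        for job_id in skill_sets.get(skill, ()):
            matched[job_id] = matched.get(job_id, 0) + 1

    # ZINTERSTORE step with weights -1 and 1: only jobs present in both
    # inputs survive, scored as required - matched.
    missing = {
        job_id: required_counts[job_id] - count
        for job_id, count in matched.items()
        if job_id in required_counts
    }

    # ZRANGEBYSCORE 0 0 step: keep the jobs with nothing missing.
    return sorted(job_id for job_id, gap in missing.items() if gap == 0)


skill_sets = {
    'python': {'job:1', 'job:2'},
    'redis': {'job:1', 'job:2'},
    'sql': {'job:2'},
}
required_counts = {'job:1': 2, 'job:2': 3}

print(find_jobs_in_memory(skill_sets, required_counts, ['python', 'redis']))
# ['job:1']
```

Here job:1 scores 2 - 2 = 0 and is returned, while job:2 scores 3 - 2 = 1 and is filtered out. A job that shares no skills with the candidate never enters the union, so the intersection drops it without any extra check.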

    With large numbers of jobs or searches, our job search system may not perform as fast as we need it to. But if we apply sharding techniques that we’ll discuss in chapter 9, we can break the large calculations into smaller pieces and calculate partial results bit by bit. Alternatively, if we first find the SET of jobs in a given location and search only within that SET, we could perform the same kind of optimization that we performed with ad targeting in section 7.3.3, which could greatly improve job-search performance.
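As a rough illustration of the location idea, the sketch below restricts every skill set to the jobs in one location before counting matches, so jobs elsewhere are never scored. It runs on plain Python sets, not Redis, and the function name `find_jobs_in_location`, the `location_jobs` set, and the sample data are all hypothetical; how locations would actually be indexed is not specified in this section.

```python
def find_jobs_in_location(skill_sets, required_counts, location_jobs,
                          candidate_skills):
    """Score only jobs in location_jobs (a hypothetical per-location set)."""
    matched = {}
    for skill in set(candidate_skills):
        # Intersect with the location first so only local jobs are counted.
        for job_id in skill_sets.get(skill, set()) & location_jobs:
            matched[job_id] = matched.get(job_id, 0) + 1
    # A job qualifies when every one of its required skills was matched.
    return sorted(
        job_id for job_id, count in matched.items()
        if required_counts.get(job_id) == count
    )


skill_sets = {
    'python': {'job:1', 'job:2', 'job:3'},
    'redis': {'job:1', 'job:2'},
    'sql': {'job:2'},
}
required_counts = {'job:1': 2, 'job:2': 3, 'job:3': 1}
location_jobs = {'job:2', 'job:3'}  # assumed: jobs in the searched location

print(find_jobs_in_location(
    skill_sets, required_counts, location_jobs, ['python', 'redis', 'sql']))
# ['job:2', 'job:3']
```

The candidate also qualifies for job:1, but it lies outside the location and is skipped before any counting happens. The saving comes from the smaller sets that each later step has to process.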

    Exercise: Levels of experience

    A natural extension to the simple required skills listing is an understanding that skill levels vary from beginner to intermediate, to expert, and beyond. Can you come up with a method using additional SETs to offer the ability, for example, for someone who has an intermediate level in a skill to find jobs that require either beginner or intermediate-level candidates?

     

    Exercise: Years of experience

    Levels of expertise can be useful, but another way to look at the amount of experience someone has is the number of years they’ve used a skill. Can you build an alternate version that supports handling arbitrary numbers of years of experience?