Questions about the "Netflix watch list" personalization demo

Hi everyone -

I am reading Algolia’s personalization doc here. I’m excited by the mention of the “Netflix” use cases where you can “prioritize movies in my watch list”.

I’ve also looked at the demo data and code here.

The demo approach includes a “watch_list” field for each movie; this field includes a list of all user ids who have that movie on their watch list. Then, the hits widget filters the results on the client side via the transformData property (here).

Questions:

  1. Won’t this approach quickly run into the 10 KB record size limit as the number of users grows, e.g. if 50k users have the same movie on their watch list?
  2. Why is the watch_list filtering done post-search on the client via the widgets rather than on the server?
  3. Given the filtering is done on the client, how does this work with pagination? E.g. if the search returns the first 100 results, only 20 of which are on the user’s watch list, then the first “page” holds only 20 results, not 100.
  4. Whenever a user adds a movie to their watch list, that means you have to add their user id to that movie’s watch_list field – is that slow?

–Ien

Hey there,

  1. We recently improved the engine on that front, and the 10 KB limit no longer exists.
    The only remaining limit is the maximum record size of 20 KB on Essential.

  2. You specify the watch_list search parameter on the client side, but the ranking is applied directly in the engine.

  3. The boost for items that match the watch_list is performed on the engine side, so it applies to the full list of matching results, not only to the hitsPerPage results of the current page.

  4. Yes, you need to send an indexing job. Indexing time varies depending on the index, but a job usually takes a few seconds.
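
To make points 2 to 4 more concrete, here is a minimal sketch in Python. It is not the demo's actual code. It assumes the boost is expressed with Algolia's optionalFilters search parameter and that the watch list is updated with the AddUnique partial-update operation; the thread itself only says "search parameter" and "indexing job". The helper names (build_search_params, build_watch_list_update, boost_then_paginate, filter_after_pagination) and the sample data are hypothetical, and the two ranking functions are a local stand-in for the engine, not a call to it.

```python
def build_search_params(user_id, page=0, hits_per_page=20):
    """Search parameters asking the engine to boost (not filter) watch-listed movies."""
    return {
        # Assumption: optionalFilters carries the boost; it raises matching
        # records in the ranking without excluding the others.
        "optionalFilters": ["watch_list:" + user_id],
        "page": page,
        "hitsPerPage": hits_per_page,
    }


def build_watch_list_update(object_id, user_id):
    """Partial-update payload that adds one user id to a movie's watch_list."""
    return {
        "objectID": object_id,
        # Assumption: AddUnique appends the value only if it is not already present.
        "watch_list": {"_operation": "AddUnique", "value": user_id},
    }


def boost_then_paginate(hits, user_id, page, hits_per_page):
    """Stand-in for engine-side behaviour: boost over all hits, then paginate."""
    # Stable sort: watch-listed movies first, original order kept otherwise.
    ranked = sorted(hits, key=lambda hit: user_id not in hit["watch_list"])
    start = page * hits_per_page
    return ranked[start:start + hits_per_page]


def filter_after_pagination(hits, user_id, page, hits_per_page):
    """The concern from question 3: filtering a page on the client shrinks it."""
    start = page * hits_per_page
    page_hits = hits[start:start + hits_per_page]
    return [hit for hit in page_hits if user_id in hit["watch_list"]]


# Hypothetical sample data: four movies, two on user_42's watch list.
movies = [
    {"objectID": "m1", "watch_list": []},
    {"objectID": "m2", "watch_list": ["user_42"]},
    {"objectID": "m3", "watch_list": []},
    {"objectID": "m4", "watch_list": ["user_42"]},
]

boosted_page = boost_then_paginate(movies, "user_42", page=0, hits_per_page=2)
filtered_page = filter_after_pagination(movies, "user_42", page=0, hits_per_page=2)
```

With this sample data the boosted first page is full and holds the two watch-listed movies, while the client-filtered first page holds a single movie. That difference is why applying the boost in the engine keeps pagination consistent.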

I hope this answers your questions.

Please let me know if I can help you with anything else!
