Filter Results By Friends List

Hi there,
I have a quick question about what I would think is a pretty common use case; however, I can’t find any resources on it…

How do we filter results based on a ‘friends list’? E.g. I have a system of posts, and I want to list the posts from a specific user’s friends. How is that best done using Algolia?

More specifically, we want to create a similar algorithm to Facebook’s feed, filtering by distance/network/time.

Hello,

What you’re looking for here are “disjunctive facets” (here is how to do it with our JavaScript library). Facets are simply another word for “filters”, and “disjunctive” more or less means “or” (all posts posted by user X OR user Y OR user Z).

Let’s say you have a friend list as an array of user_ids: [4, 8, 15, 16, 23, 42]. You want to get all the posts that are posted by any of these friends. You first need to declare the user_id attribute of the post records in attributesForFaceting to be able to filter on it. Then you need to configure the Helper to tell it “get me all the posts that are posted by either user:4, user:8, etc”.

In JavaScript Helper code, this would be something like:

helper.addDisjunctiveFacetRefinement('user_id', 4);
helper.addDisjunctiveFacetRefinement('user_id', 8);
helper.addDisjunctiveFacetRefinement('user_id', 15);
...
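
For a longer friends list you probably don't want to write one line per friend. Here is a minimal sketch that builds the same OR condition as raw search parameters, using the `facetFilters` syntax where values inside a nested array are OR-ed together. The function name and the sample list are made up for illustration, and it assumes `user_id` is already declared in `attributesForFaceting`.

```javascript
// Hypothetical friends list; in a real app this comes from your own database.
const friendIds = [4, 8, 15, 16, 23, 42];

// Build a facetFilters value equivalent to one disjunctive refinement per friend.
// Values inside the nested array are combined with OR.
// Note: callers should handle an empty friends list separately.
function buildFriendsFacetFilters(ids) {
  return [ids.map((id) => `user_id:${id}`)];
}

const searchParams = { facetFilters: buildFriendsFacetFilters(friendIds) };
console.log(JSON.stringify(searchParams));

// With the Helper, the equivalent would be a loop over the same list, e.g.:
// friendIds.forEach((id) => helper.addDisjunctiveFacetRefinement('user_id', id));
```
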

Hope that helps :slight_smile:

Beautiful. I’m assuming that this will get slower the more ‘refinements’ we add? We allow up to 500 friends on the service.

Luckily this is for serving up a feed, so it doesn’t need to be as “real time” as a search.

It might get slower, but that is more related to how many records you have in your index than to the number of filters you pass. In any case, we do have an internal timeout in our engine, and if a request takes longer than this timeout (I think it’s 50ms), we return a truncated list of results.

Results are always returned in relevance order, so even if we truncate the list, you’ll still get the most relevant ones first (and you’ll have a flag in the response JSON telling you the results are not exhaustive).

I just want to add a bit more information to my latest answer. You are right: the more refinements you add, the slower the query will get. It’s hard to estimate what the threshold will be, as it really depends on the number of records you have and some other factors, but it’s safe to say that with ~10 refinements you will see no performance decrease. If you go to 100 you might notice some delay, and 500 might get you truncated results.

There is another way to do what you’re trying to achieve, though. It is much more scalable, but will require more code and more operations on your plan. Algolia indices are schemaless, and there is no built-in way to express relationships between items like you would in a relational database. It means that if we want to do a “get all the posts that belong to my friends” kind of query, we need to be clever.

What I would suggest is to add a viewable_by attribute to each of your posts. This attribute would contain an array of all the user_ids that can see the post, that is, everyone who is friends with the poster.

Now your front-end code is much simpler: you just have to add a facet filter on viewable_by with a value equal to the current user_id.
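
To make that concrete, here is a rough sketch of what such a record and the matching filter could look like. The record fields other than `viewable_by` and `user_id`, the function names, and the IDs are hypothetical, and `viewable_by` would need to be declared in `attributesForFaceting`. The `isVisibleTo` function is only a local stand-in to show the intended matching rule, not something the engine exposes.

```javascript
// Hypothetical post record: viewable_by lists every user allowed to see the post.
const post = {
  objectID: 'post_1001',
  user_id: 8,
  text: 'Hello from user 8',
  viewable_by: [4, 15, 16],
};

// Search parameters restricting results to posts the given user may see.
function buildViewableByParams(currentUserId) {
  return { facetFilters: [`viewable_by:${currentUserId}`] };
}

// Local illustration of the rule the filter expresses:
// a post matches when the user's id appears in its viewable_by array.
function isVisibleTo(record, userId) {
  return record.viewable_by.includes(userId);
}

console.log(JSON.stringify(buildViewableByParams(4)));
console.log(isVisibleTo(post, 4), isVisibleTo(post, 42));
```
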

On the other hand, it also means that whenever someone adds/removes a friend, you’ll have to update all of their posts’ records to add/remove the friend’s user_id from viewable_by. This might create many operations and might not fit in your current plan.
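
As a sketch of what those updates could look like: partial updates accept built-in operations such as `AddUnique` and `Remove` on array attributes, so you don't have to resend the whole record. The function below only builds the payloads; the function name and object IDs are hypothetical, and I'm assuming you would send the result with something like `index.partialUpdateObjects(...)` from your back end.

```javascript
// Build one partial-update payload per post of the poster whose friend list changed.
// action is 'add' when a friendship is created and 'remove' when it ends.
function buildFriendshipUpdates(postIds, friendId, action) {
  if (action !== 'add' && action !== 'remove') {
    throw new Error(`Unknown action: ${action}`);
  }
  const operation = action === 'add' ? 'AddUnique' : 'Remove';
  return postIds.map((objectID) => ({
    objectID,
    viewable_by: { _operation: operation, value: friendId },
  }));
}

// Hypothetical example: user 42 becomes friends with the author of these two posts.
const updates = buildFriendshipUpdates(['post_1001', 'post_1002'], 42, 'add');
console.log(JSON.stringify(updates, null, 2));
// Assumption: these payloads would then be sent server-side, e.g.
// index.partialUpdateObjects(updates)
```

Each payload is one record update, so a poster with many posts generates many operations per friendship change.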

It also means that anyone could tinker with the JS code, replace their user_id with someone else’s, and see other people’s posts. This can be fixed by using our secured API keys, which are API keys with a set of filters already baked in. You would then create a new secured API key for each user, which only allows them to search their own viewable_by content.
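
A minimal sketch of the restrictions you could bake into such a key, to be built on your server only. The function name is made up, and I'm assuming the result is passed to the API client's `generateSecuredApiKey` together with a search-only parent key; check the client version you use for the exact call.

```javascript
// Build the restrictions for one user's secured API key.
// Rejecting non-integer ids keeps arbitrary text out of the filter string.
function buildSecuredKeyRestrictions(userId) {
  if (!Number.isInteger(userId) || userId < 0) {
    throw new Error('userId must be a non-negative integer');
  }
  return { filters: `viewable_by:${userId}` };
}

const restrictions = buildSecuredKeyRestrictions(42);
console.log(JSON.stringify(restrictions));
// Assumption (server side only, never in the browser):
// const userKey = client.generateSecuredApiKey(searchOnlyKey, restrictions);
```
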

Those are the two possible approaches: a quick one that works for a small dataset, and a more powerful one that will scale for bigger datasets, but will require more code and a bigger plan. Your choice :slight_smile:

Really valuable thread! Would be nice to see a tutorial made for this use case.

Just wanted to jump in and let others know that the second approach detailed here (viewable_by attribute) will still have its own scaling limitations.

You will find that the more followers a user has, the less data their post can contain since each row in Algolia has a size limit. So as a user’s followers grow, the space left for the rest of their post data decreases.

If each post in your index is already hovering around the row limit then you will not have much space left for an array of followers anyway.
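
To get a feel for this, here is a quick sketch that measures how many bytes the follower array alone adds to a serialized record. The record shape and the 6-digit user IDs are hypothetical; the actual size limit depends on your plan, so compare the result against whatever limit applies to you.

```javascript
// Size in bytes of a record once serialized as UTF-8 JSON.
function recordSizeBytes(record) {
  return new TextEncoder().encode(JSON.stringify(record)).length;
}

// Hypothetical post whose poster has followerCount followers with 6-digit ids.
function makePost(followerCount) {
  return {
    objectID: 'post_1',
    text: 'Some post body',
    viewable_by: Array.from({ length: followerCount }, (_, i) => 100000 + i),
  };
}

const baseSize = recordSizeBytes(makePost(0));
const sizeWith500 = recordSizeBytes(makePost(500));
console.log(`viewable_by overhead for 500 followers: ${sizeWith500 - baseSize} bytes`);
```
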

I’m struggling with a similar case, as I’m working on a social network with optional private accounts. Posts from those private accounts should only be accessible to their accepted followers.
While your second solution would work, I’m worried about its scalability and wonder if the following would be a better approach.
For each user, I create a secured API key on my server that includes a filter with all the private accounts they follow (user1, user2, user3) and the accounts they have blocked (user4):

{ "filters": "(isPublic:true OR username:user1 OR username:user2 OR username:user3) AND NOT username:user4" }
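
Building that string on the server could look roughly like this. It is only a sketch: the function name is made up, the OR group is put in parentheses so the blocked accounts are excluded from everything (including public posts), and it assumes usernames contain no spaces or special characters (otherwise they would need quoting).

```javascript
// Build the filter for one user's secured API key:
// public posts OR posts from followed private accounts, minus blocked accounts.
function buildFeedFilter(followedPrivateUsernames, blockedUsernames) {
  const allowed = ['isPublic:true']
    .concat(followedPrivateUsernames.map((name) => `username:${name}`))
    .join(' OR ');
  const blocked = blockedUsernames.map((name) => ` AND NOT username:${name}`).join('');
  return `(${allowed})${blocked}`;
}

const filters = buildFeedFilter(['user1', 'user2', 'user3'], ['user4']);
console.log(JSON.stringify({ filters }));
```
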

Appreciate any further thoughts on this :v: