Custom ranking "scores" for attributes

I know you can assign custom rankings for a record, the example in the docs is:

{
    "name": "iPhone 6",
    "units_sold": 200
}

where units_sold can be used to rank records.

I am wondering if there is a way to assign “scores” to attributes and use them as a similar custom ranking, such as:

{
    "name": "Ryan",
    "interests": [
        { "name": "Photography", "score": 5 },
        { "name": "Hiking", "score": 2 },
        { "name": "Music", "score": 17 }
    ]
},
{
    "name": "Brandon",
    "interests": [
        { "name": "Computers", "score": 4 },
        { "name": "Comic Books", "score": 8 },
        { "name": "Music", "score": 9 }
    ]
}

In this case, the interests attribute is searchable and searching for Music would return both results, but since Ryan likes Music more I would like them to show up before Brandon.

My first thought was making interests.name searchable and interests.score a custom ranking, but I’m not sure that would work. I also noticed there’s an “Add a sort-by attribute” option (I think this is new). Would that allow me to sort everyone who matches Music by their Music score?

Hopefully this makes sense, I would appreciate any insight into how to make this work.

Thanks!

Hi there,

Thanks for posting here!
If I were to sum up your use case, you are trying to change the ranking depending on the search performed.

TL;DR

This is not available out of the box, but Algolia is flexible enough to implement something close to it. Unfortunately, it will take some work.

Why it’s not out of the box

Let’s be blunt: this is not something Algolia can do directly. The reason is that ranking objects is an expensive operation that scales badly as the number of objects in your index grows. Because of this, most of the ranking is performed at indexing time, and it is not possible to change the criterion while querying your index.

It is possible to achieve something close to this, but it will multiply your number of records because you will have to use replicas, and it is not practical with a high number of interests or if you don’t know them in advance.

The idea would be to have a replica for each of the possible interests: music_desc, photography_desc, etc. with "customRanking": ["desc(interests.music)"] or "customRanking": ["desc(interests.photography)"] respectively.
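For "desc(interests.music)" to work, each record needs one numeric attribute per interest instead of a list of name/score pairs. Here is a minimal Python sketch of that reshaping, assuming the record layout from the question. The helpers slugify, reshape_record and replica_settings are hypothetical names, and the extra interest_names attribute is an assumption on my side so that the interest names stay available as searchable values.

```python
import re


def slugify(name):
    """Turn an interest name such as 'Comic Books' into 'comic_books'."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def reshape_record(record):
    """Replace the list of {name, score} pairs with one numeric attribute per interest."""
    reshaped = dict(record)
    # Assumption: keep the plain names in a separate attribute for searching.
    reshaped["interest_names"] = [i["name"] for i in record["interests"]]
    # One numeric attribute per interest, e.g. interests.music = 17.
    reshaped["interests"] = {
        slugify(i["name"]): i["score"] for i in record["interests"]
    }
    return reshaped


def replica_settings(interest_slug):
    """Settings for the replica dedicated to one interest (hypothetical helper)."""
    return {"customRanking": [f"desc(interests.{interest_slug})"]}


ryan = {
    "name": "Ryan",
    "interests": [
        {"name": "Photography", "score": 5},
        {"name": "Hiking", "score": 2},
        {"name": "Music", "score": 17},
    ],
}

print(reshape_record(ryan))
print(replica_settings("music"))
```

Each interest then needs its own replica carrying the settings returned by replica_settings, which is where the record multiplication mentioned above comes from.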

You then have at least two ways to go: an easy one and a less easy one.

Easy way

When your user is making a search, you could try to spot relevant keywords such as "music", "photography", etc. in your front-end, and route the Algolia query to the corresponding index.
This is rather easy to implement, but you will not be able to handle typos such as musci.
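As a rough illustration of that routing, here is a small Python sketch; the same logic would live in your front-end code. The index name users and the REPLICA_BY_KEYWORD mapping are hypothetical, only the replica names come from the example above.

```python
# Hypothetical names: the primary index and the keyword-to-replica mapping.
DEFAULT_INDEX = "users"
REPLICA_BY_KEYWORD = {
    "music": "music_desc",
    "photography": "photography_desc",
}


def pick_index(query):
    """Return the replica for the first known keyword in the query, else the default index."""
    for word in query.lower().split():
        if word in REPLICA_BY_KEYWORD:
            return REPLICA_BY_KEYWORD[word]
    return DEFAULT_INDEX


print(pick_index("Max loves music Portland"))  # music_desc
print(pick_index("Max loves musci Portland"))  # users (the typo is not recognized)
```

The second call shows the limitation: exact keyword matching falls back to the default index as soon as the user mistypes the interest.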

Less easy way

This method leverages Algolia to also find the right index to target.

You would need an additional index, responsible for mapping search terms to those replica indexes. The structure of this index could be:

[{
    "term": "music",
    "index": "music_desc"
}, {
    "term": "photography",
    "index": "photography_desc"
}]

with "searchableAttributes": ["term"].

When a user is searching, you will perform two queries back to back.

  1. A first query to this additional index, with [all the words of the query marked as optional](https://www.algolia.com/doc/api-client/python/parameters/optionalWords/#doing-an-or-between-all-words-of-a-query). This will give you a set of indexes against which to perform the actual query, to get the right ranking.

  2. Perform the search in a regular way against the index found thanks to the first query.

If we take an example:

  1. The user types "Max loves muisc Portland" (note the typo)

  2. Perform a query with the following parameters:

    {
        "query": "Max loves muisc Portland",
        "optionalWords": "Max loves muisc Portland"
    }
    
  3. Thanks to typo tolerance, the record {"term": "music", "index": "music_desc"} is found

  4. You perform a query to the index music_desc with "Max loves muisc Portland"

  5. You display the results that will be ranked by best interest in music!
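The five steps above could be wired together roughly like this. This is only a sketch: search is a hypothetical callable standing in for whatever your API client exposes, the index names interest_replicas and users are made up, and fake_search returns canned hits, so no real typo tolerance happens here.

```python
def search_with_interest_ranking(search, query,
                                 mapping_index="interest_replicas",
                                 default_index="users"):
    """Run the two back-to-back queries.

    search(index_name, params) -> list of hits is a hypothetical stand-in
    for the real client call.
    """
    # First query: look up the mapping index with every word marked optional.
    mapping_hits = search(mapping_index, {"query": query, "optionalWords": query})
    # Fall back to the default index when no interest term matched.
    target = mapping_hits[0]["index"] if mapping_hits else default_index
    # Second query: the regular search, against the replica that was found.
    return target, search(target, {"query": query})


def fake_search(index_name, params):
    """Canned responses standing in for Algolia."""
    if index_name == "interest_replicas":
        return [{"term": "music", "index": "music_desc"}]
    return [{"name": "Ryan"}, {"name": "Brandon"}]


index_name, hits = search_with_interest_ranking(fake_search, "Max loves muisc Portland")
print(index_name)  # music_desc
```

Taking only the first mapping hit is a simplification; if several interests match, you would have to decide which replica wins.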

Thank you for your reply,

Unfortunately we have over 6K interests, more than 500K users and are already using several slave indexes to sort by other metrics, so that solution is not feasible for our use-case.

We will look into some other options to handle this.

Thanks again for your help.