Ranking based on Score / array size

Hi, I’ve looked through the docs and couldn’t find an answer. I am new to Algolia, so sorry if the solution is obvious.

I have an index like this:
[
  {
    "name": "My program 1",
    "exerciseIds": ["e05l", "e01a", "e01m", "e01o", "e087"]
  },
  {
    "name": "My program 2",
    "exerciseIds": ["e01m", "e0c9", "e0cp"]
  }
]

I do a filter of this kind:
(exerciseIds:e01a OR exerciseIds:e087 OR exerciseIds:e0cp) and sumOrFiltersScores: true

Which gives me all the records that contain the queried terms, ranked by the number of matches in the exerciseIds array. But this solution promotes records with larger arrays, as they will tend to get a higher score.

What I would like to get is a ranking based on the percentage of matches, or in other terms, based on the individual score divided by the record’s array length (“exerciseIds” here).
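Written out as a formula, this is roughly what I mean. The symbols are my own notation, not anything from Algolia: Q is the set of ids in the filter and E_r is the exerciseIds array of record r.

```latex
\text{relativeScore}(r) = \frac{|Q \cap E_r|}{|E_r|}
% With Q = \{e01a, e087, e0cp\} and the two sample records above:
%   My program 1: 2 matches out of 5 ids  ->  2/5 = 0.4
%   My program 2: 1 match  out of 3 ids  ->  1/3 \approx 0.33
```

The numerator is the score I already get from sumOrFiltersScores; only the division by the array length is missing.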

Is that possible?

Hi Louis,

To my knowledge, this is not something you can achieve with Algolia through search parameters. If I understand correctly, you have multiple occurrences of the same value in your array. One possible solution is to deduplicate those values. The resulting order might then be closer to what you’re looking for.

Hi Octave,
Each filter value is present at most once in the record’s array, there is no duplicate.
So it is not possible to have a ranking that is a function of the score and a record’s property?
(the array’s length could be a separate value in the record if that helps)

I see! In that case, one possible workaround would be to add a new attribute to your records that stores the array length. Doing so would allow you to sort by that attribute in ascending order, meaning the records with the shortest arrays would appear higher.

I am not certain how that would combine with the sumOrFiltersScores option, so the best approach would be to try it out and see whether it matches what you expect in terms of relevancy.
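As a rough illustration of the array-length attribute mentioned above, here is a minimal sketch of enriching records before they are sent to the index. The attribute name exerciseCount and the helper with_array_length are hypothetical, chosen only for this example, and the sketch does not cover how the sort itself is configured in Algolia.

```python
def with_array_length(record: dict,
                      array_attr: str = "exerciseIds",
                      length_attr: str = "exerciseCount") -> dict:
    """Return a copy of the record with the array length stored alongside it.

    `exerciseCount` is a hypothetical attribute name used for illustration.
    """
    enriched = dict(record)  # shallow copy, the input record is left untouched
    enriched[length_attr] = len(record.get(array_attr, []))
    return enriched


# The two sample records from the question.
records = [
    {"name": "My program 1",
     "exerciseIds": ["e05l", "e01a", "e01m", "e01o", "e087"]},
    {"name": "My program 2",
     "exerciseIds": ["e01m", "e0c9", "e0cp"]},
]

# These enriched copies are what would be pushed to the index.
enriched_records = [with_array_length(r) for r in records]
```

The stored length would have to be refreshed whenever a record's array changes, since it is computed once at indexing time.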

Well, actually I would like the array’s length to have zero influence on the ranking.
But in the current situation, since the score is simply the sum of matches, records with longer arrays naturally get higher scores. That is why I would like to divide the score by the array’s length, so as to have a relative score value.

Hi @louis.deveseleer, we do not have dynamic attributes or a way to rank based on a dynamic value at search time. I’m not sure I completely understand your use case, but would it be possible to calculate this value at indexing time and add an attribute with the percentage value?

Hi @cindy.cullen
It is not possible to calculate the value at indexing time, only at search time.
Use case:
The records are Baskets of fruits. Each Basket contains a varying amount of fruits.
I have a list of fruits with me, and I want to retrieve all the baskets that contain fruits that are on my list. I want to see first the baskets that contain only fruits from the list, whether it’s a big or a small basket. Then baskets that contain mostly fruits that are on my list, and one or two that aren’t, etc…
Since Algolia can give me all Baskets that contain any fruit from the list, and even tell me which ones contain the most fruits from the list (score), there is only one extra step needed: divide that score by the array’s length and sort the results according to this value.
Is that clear? Sorry for my bad explanations.
But from what you’re saying it seems like that’s not possible, and I will have to calculate that after receiving the results.

Hi @louis.deveseleer, thanks for the clarification. Since it’s not possible at indexing time, then, yes, you will need to retrieve the results and sort them in your code at search time.
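For anyone landing on this thread later, here is a minimal sketch of that client-side sort. It assumes the hits have already been retrieved as a list of dicts that include the exerciseIds attribute; the function name rerank_by_match_ratio and the third sample record are hypothetical, added only for illustration.

```python
from typing import Iterable


def rerank_by_match_ratio(hits: list[dict],
                          wanted_ids: Iterable[str],
                          array_attr: str = "exerciseIds") -> list[dict]:
    """Sort hits by (number of matching ids) / (array length), highest first."""
    wanted = set(wanted_ids)

    def ratio(hit: dict) -> float:
        ids = hit.get(array_attr, [])
        if not ids:
            return 0.0  # avoid dividing by zero for empty or missing arrays
        return len(wanted.intersection(ids)) / len(ids)

    # sorted() is stable, so hits with equal ratios keep the order
    # in which they were returned by the search.
    return sorted(hits, key=ratio, reverse=True)


# The two sample records from the question, plus one hypothetical record.
hits = [
    {"name": "My program 1",
     "exerciseIds": ["e05l", "e01a", "e01m", "e01o", "e087"]},
    {"name": "My program 2",
     "exerciseIds": ["e01m", "e0c9", "e0cp"]},
    {"name": "My program 3", "exerciseIds": ["e0cp"]},  # hypothetical
]

ranked = rerank_by_match_ratio(hits, ["e01a", "e087", "e0cp"])
# Ratios: program 3 = 1/1, program 1 = 2/5, program 2 = 1/3
# Order: My program 3, My program 1, My program 2
```

Keep in mind that this only reorders the hits you actually retrieved, so you need to fetch enough results for the re-sorted list to be meaningful.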

Ok, thank you @cindy.cullen and @octave.raimbault for your help answering my question!
