Alphabetic sorting + textual rank seems broken

First off - sorting on a string value seems to be “unsupported”, at least based on the message you get when you add one to your ranking formula:

Attribute with String Values
title contains string values. The ranking formula expects numerical or boolean values.

This seems nuts, especially for a searching/indexing tool like Algolia. You would be forcing everyone to index an “alpha sort numerical value”.

Luckily, if you ignore the notice, it seems to handle sorting string attributes just fine (at first).

At the very least, I think there needs to be some clarification as to whether this is supported. Your support staff seems to recommend its use, despite the warning message: Sort is not working - Open Q&A - Algolia Community, Sorting Alphabetically on the sample index


My goal

What I’m after is an index that is sorted alphabetically by an attribute.
There will be a text search as well, so I want the results sorted by this attribute first (“sort-by attribute”) followed by the default textual matching.

The problems

The initial sort on a string attribute works. The problems occur when it is combined with a text search:

I’ve found that the default textual ranking criteria, despite being below my sort-by attribute in the ranking formula, seem to throw the rank off.

[Screenshot: ranking settings]

Here’s an example, with the above ranking settings. You’ll notice it starts sorted, but search for “hello”, and you’ll see the rank distort: JSFiddle

Furthermore, I’ve found that if I remove the defaults altogether, the sorting works as expected, even with a text search.

Example: (search for “hello” again) JSFiddle

In Summary…

It seems like the default textual rankings weigh more heavily than a sort-by attribute, even if the sort-by is higher in the ranking formula.

Furthermore, I feel like you should document and support alpha sorting, since it “works”, is a pretty basic need, and is already being recommended by support. Fixing the above issues with ranking and removing the warning when you add a string sort-by would be a good start.

Thanks!

Hello @timkelty,

Algolia does not recommend, and for that reason does not easily support, alphabetical sorting as the first ranking criterion.

If you put a “sort by” criterion on a textual attribute as the first step of your ranking formula, it will not work. This is why the dashboard displays a warning when doing so. It will appear to work for the empty query, but as you start typing (just like you noticed), it will break. The reason is that the steps in the ranking formula (other than custom) are dynamic, and sorting on strings at query time would be prohibitively expensive as soon as the index is big enough.

There is a workaround to have alphabetical sorting, which is to have asc(textual_attribute) inside your custom ranking, and custom ranking as the first ranking criterion. In other words:

{
    "customRanking": [ "asc(textual_attribute)" ],
    "ranking": [ "custom" ]
}

(Note that this may not be achievable through the dashboard; you may have to use the API to do so.)
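For reference, here is a minimal sketch of sending those settings through the REST API with nothing but the Python standard library. It assumes the settings endpoint is `PUT /1/indexes/{indexName}/settings` on `{APP_ID}.algolia.net`; the application ID, API key, and index name are placeholders, and the helper name `build_settings_request` is made up for this example. The request is only built here, not sent.

```python
import json
import urllib.request

# Placeholders, not real values: substitute your own application ID,
# admin API key, and index name.
APP_ID = "YOUR_APP_ID"
ADMIN_API_KEY = "YOUR_ADMIN_API_KEY"
INDEX_NAME = "your_index"


def build_settings_request(app_id, api_key, index_name, attribute):
    """Build (without sending) a PUT request carrying the workaround settings."""
    settings = {
        "customRanking": [f"asc({attribute})"],
        "ranking": ["custom"],
    }
    return urllib.request.Request(
        url=f"https://{app_id}.algolia.net/1/indexes/{index_name}/settings",
        data=json.dumps(settings).encode("utf-8"),
        method="PUT",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
            "Content-Type": "application/json",
        },
    )


request = build_settings_request(
    APP_ID, ADMIN_API_KEY, INDEX_NAME, "textual_attribute"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually apply the settings: urllib.request.urlopen(request)
```

Any official API client should work just as well; the point is only that both `customRanking` and `ranking` are set explicitly in one call.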

This workaround comes with a caveat: There will be no tie breaking between identical string values, since the remaining criteria of the ranking formula will be disregarded. (That’s why I omitted them in my above example; specifying them is allowed but will have no effect in practice.) That’s because custom ranking is performed at indexing time, so the tie breaking is already performed statically.

Also please note that alphabetical sorting in Algolia is not locale-aware, therefore strings will just be ordered by plain lexicographical order of their Unicode characters. For English text, that may be acceptable; for other languages (in particular with diacritics), less so.
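To give an idea of what plain code-point ordering looks like, here is a small Python illustration (Python's default string sort also compares code points). The second half sketches one possible mitigation: indexing a precomputed, normalized sort key and sorting on that instead. The `title_sort` attribute and the `sort_key` helper are hypothetical, and stripping diacritics is only an approximation of real locale-aware collation.

```python
import unicodedata

titles = ["zebra", "Zoo", "apple", "\u00e9toile", "Eagle"]  # "\u00e9" is "é"

# Plain code-point order: uppercase letters sort before lowercase ones,
# and accented letters sort after the whole ASCII range.
print(sorted(titles))


def sort_key(text):
    """Hypothetical helper: drop diacritics and case to build a sort attribute."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


# Each record carries both the display title and the normalized sort key.
records = [{"title": t, "title_sort": sort_key(t)} for t in titles]
by_sort_key = [
    r["title"] for r in sorted(records, key=lambda r: r["title_sort"])
]
print(by_sort_key)
```

With such a key in place, the custom ranking would use `asc(title_sort)` rather than the raw attribute.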

I re-read my answer to the post you mentioned and I agree that it’s misleading. Sorry for that… :grimacing: I will update it right away with a cross-reference to this answer. Let’s hope that it will avoid further misunderstanding.

You may now ask yourself: “Why does Algolia make it so difficult to sort alphabetically? That’s such a basic need!” The answer is: in most use cases, sorting results alphabetically doesn’t make sense. Don’t get me wrong: I am not saying that your particular use case is invalid! I am just saying that, in the vast majority of cases, sorting alphabetically is useless.

Why? Algolia is a search engine, not a database. When searching text, textual relevance is of utmost importance. That’s why we handle typo tolerance, prefix search, synonyms, plurals, proximity, varying importance of matching attributes… In most use cases, these are the criteria that make the most sense to discriminate between results. Custom ranking is just here to break ties between results with identical textual relevance, but different business relevance.

When you put alphabetical sort at the top of your ranking formula, you are disregarding all those criteria: records matching all words without typo and maximum proximity will be treated the same as records matching just a few words with many typos and weak proximity; the ordering will just be alphabetical. An irrelevant result can therefore rank higher than a relevant one.
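A toy model may make this more concrete. The sketch below is not how the engine works internally; it only treats ranking as a tuple sort where earlier criteria win and later ones break ties. The records and their typo counts are invented for a hypothetical query “hello”.

```python
# Invented matches for the query "hello"; fewer typos means a better match.
matches = [
    {"title": "Hello world", "typos": 0},
    {"title": "Cello lessons", "typos": 1},
    {"title": "Hello again", "typos": 0},
    {"title": "Jello recipes", "typos": 1},
]

# Alphabetical criterion first: titles are all distinct, so the typo count
# never gets a chance to influence the order.
alphabetical_first = sorted(matches, key=lambda r: (r["title"], r["typos"]))

# Textual relevance first: the title only breaks ties between equal matches.
relevance_first = sorted(matches, key=lambda r: (r["typos"], r["title"]))

print([r["title"] for r in alphabetical_first])
print([r["title"] for r in relevance_first])
```

In the first ordering, “Cello lessons” (a one-typo match) lands above both exact matches, which is exactly the kind of result described above.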

Such use cases are basically treating Algolia as a database, which it is not. It was not designed to handle them, and that’s why they seem so awkward to implement. If you just need alphabetical sorting of anything that remotely matches a given string, probably a SQL query on your own database with LIKE and ORDER BY would just as well do the trick.
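As a rough sketch of that database alternative, here is what it could look like with SQLite from Python. The `articles` table, its `title` column, and the `search_alphabetical` helper are all made up for the example; your own schema and database will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [("Hello world",), ("Say hello",), ("Goodbye",), ("hello again",)],
)


def search_alphabetical(connection, term):
    """Return titles containing `term`, sorted alphabetically ignoring case.

    Note: `%` and `_` inside `term` are not escaped in this sketch.
    """
    rows = connection.execute(
        "SELECT title FROM articles "
        "WHERE title LIKE ? "
        "ORDER BY title COLLATE NOCASE",
        (f"%{term}%",),
    )
    return [title for (title,) in rows]


results = search_alphabetical(conn, "hello")
print(results)
```

There is no typo tolerance or relevance here, only substring filtering followed by a sort, which is all this kind of use case asks for.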

I hope this lengthy explanation makes it clearer why alphabetical sorting is discouraged, and how you can still achieve it (with caveats) if you really, really need it. :slight_smile:


@clement.leprovost thanks for the comprehensive reply!

For more context on my use-case:

I’m building an InstantSearch widget for a data table with sortable headers. So in this case, alpha sorting does make sense, and the textual search is more of a filter. I agree this is one of the few cases where you would want to sort alphabetically, but it does still seem like a pretty good use-case, and still a good use of Algolia. I’ve seen a few posts of people asking for a widget like this as well, so I’ll post it when it is ready.

A few things I’m still confused about:

It seems like your api workaround is the same as what I’ve done (via dashboard) in my second screenshot (adding my custom ranking and deleting all the default textual ones). Is that correct?

You said:

There will be no tie breaking between identical string values, since the remaining criteria of the ranking formula will be disregarded. (That’s why I omitted them in my above example; specifying them is allowed but will have no effect in practice.)

If that is true, and the remainder of the formula is disregarded, why do I have to explicitly remove the default ranking rules, even if they are below my custom attribute, in order to maintain alpha sort when there is a search query?

Basically - I’ve gotten the results I need by removing all of the default textual ranking rules. However, what I was expecting was to be able to keep them there, below my attribute, and use them as the tie-break, after the attribute sort. If that’s an impossibility, that’s fine - since what I’m after is really an alpha-sort with the query just being a filter. I just think it is confusing that it will allow you to sort your rules like that, given the results.

It seems like your api workaround is the same as what I’ve done (via dashboard) in my second screenshot (adding my custom ranking and deleting all the default textual ones). Is that correct?

No, it’s not. If the dashboard is displaying “Sort by”, it’s very likely that the criterion is directly in the ranking formula (ranking setting), whereas it should go under the custom ranking (customRanking setting). As far as I know, only manipulating the settings via an API client can lead you to the desired result, and it’s also the only way to confirm with 100% reliability that the settings are correct. (This is basically such an edge case that I don’t think the dashboard can handle it.)
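To illustrate the difference, here is a small sketch of a check you could run on the settings once you have fetched them with an API client. It works on a plain dictionary, so it makes no assumption about which client you use. The function name and the two sample settings objects are hypothetical; the first one mimics what I suspect the dashboard's “Sort by” produces, the second one is the workaround.

```python
import re

# Matches criteria such as "asc(title)" or "desc(price)".
SORT_BY = re.compile(r"^(asc|desc)\(.+\)$")


def check_alphabetical_settings(settings, attribute):
    """Return a list of problems; an empty list means the workaround is in place."""
    problems = []
    ranking = settings.get("ranking", [])
    custom_ranking = settings.get("customRanking", [])

    for criterion in ranking:
        if SORT_BY.match(criterion):
            problems.append(f"'{criterion}' is a sort-by placed directly in ranking")
    if not ranking or ranking[0] != "custom":
        problems.append("'custom' is not the first ranking criterion")
    if f"asc({attribute})" not in custom_ranking:
        problems.append(f"'asc({attribute})' is missing from customRanking")
    return problems


# Illustrative settings only, not copied from a real index.
sort_by_style = {
    "ranking": ["asc(title)", "typo", "geo", "words", "filters",
                "proximity", "attribute", "exact", "custom"],
    "customRanking": [],
}
workaround_style = {"ranking": ["custom"], "customRanking": ["asc(title)"]}

print(check_alphabetical_settings(sort_by_style, "title"))
print(check_alphabetical_settings(workaround_style, "title"))
```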

If […] the remainder of the formula is disregarded, why do I have to explicitly remove the default ranking rules?

You shouldn’t need to remove them if you followed the above procedure. It just makes things more explicit. :slight_smile:

Thanks for this thorough explanation, @clement.leprovost. I’ve also been frustrated with trying to sort alphabetically and assuming it was possible, just weirdly hard, but now I see that it’s not a valuable use case.

This in particular finally made sense to me:

The answer is: in most use cases, sorting results alphabetically doesn’t make sense. Don’t get me wrong: I am not saying that your particular use case is invalid! I am just saying that, in the vast majority of cases, sorting alphabetically is useless.

As a new Algolia user, I’ve wasted 100s of hours trying to force Algolia into a “sort” box. It is not for sorting. It is for “finding”, and when I really looked at my use case - sort by title - it’s only there to allow people to browse my content alphabetically because I assume they’ve failed to “find” the content they’re looking for… that’s a bad assumption based on decades of disappointing SQL-based solutions.

Thanks for the enlightenment. :slight_smile:

The solution and details are now in our own documentation: https://www.algolia.com/doc/guides/managing-results/refine-results/sorting/how-to/sort-an-index-alphabetically/