Why does adding a trailing space affect results?

I’m debugging an issue with unexpected search results and I have traced it down to this. The search term is “Spain by”. It returns a result when there is a trailing space, but no result when the trailing space is removed.

I think it has to do with the fact that the last word “by” is a stop word (the index has remove stop words set to “true”).

Can someone explain why this is happening and how to work around it?

This is indeed completely expected.

To prevent this behavior, you might want to play with removeWordsIfNoResults.

Also, instead of removeStopWords, you could pass your language's stop words as a list of optionalWords, which is a bit less aggressive with stop words.
You can find some pretty good lists for a lot of languages here: https://github.com/6/stopwords-json

Sorry, I don’t understand.

Can you spell out to me how adding a space relates to stop-words or removeWordsIfNoResults?

Hi @jacob2,

The trailing space makes a difference because Algolia does prefix searching: before you add the trailing space, it looks for any word beginning with the letters “by”. Do any of the 5 attributes hidden in your gif hold such a word?

When you add a trailing space, the engine searches for the exact word “by” instead. If this word is not found in your record, the record is not returned.
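
To make the prefix-versus-exact distinction concrete, here is a toy model in Python. It is only a sketch of the rule described above, not the engine's implementation: the function name `matches` is made up for illustration, and it ignores stop words, typo tolerance and ranking entirely.

```python
def matches(query: str, record_text: str) -> bool:
    """Toy model: every query word must match a word of the record.

    The last query word is matched as a prefix, unless the query ends
    with a space, in which case it must match a whole word.
    """
    record_words = record_text.lower().split()
    query_words = query.lower().split()
    # Assumption of this sketch: a trailing space marks the last word as complete.
    last_is_prefix = not query.endswith(" ")

    for i, word in enumerate(query_words):
        is_last = i == len(query_words) - 1
        if is_last and last_is_prefix:
            found = any(w.startswith(word) for w in record_words)
        else:
            found = word in record_words
        if not found:
            return False
    return True


print(matches("Spain by", "Spain bypass road"))   # "by" used as a prefix
print(matches("Spain by ", "Spain bypass road"))  # "by" must be a whole word
```

In this model the first call matches because “bypass” starts with “by”, while the second does not because the record has no standalone word “by”.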

This behaviour is expected, and there is not much you can do to change the effect of adding a trailing space.

However, this issue arises because small words like this create noise in the query and prevent you from getting the results you expect. This can be fixed, and we usually recommend removing those noisy words from the query using one of these settings, depending on your use case:

  • removeStopWords to remove common English words likely to create noise, like ‘by’, ‘but’, ‘an’, ‘the’, etc.
  • removeWordsIfNoResults if you’d like to widen the search only in case the original query did not return any results
  • optionalWords if you prefer to build your own custom list of optional words
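
As a rough illustration, the three options could look like this when written as search parameters. These are plain Python dicts, not calls to any client library; the variable names are made up, the word list is just the examples from the first bullet, and “lastWords” is one of several accepted values for removeWordsIfNoResults. How you pass them to the engine depends on the API client you use.

```python
# Three alternative parameter sets; you would normally pick one.

# 1. Remove common stop words from the query.
with_stop_words_removed = {"removeStopWords": True}

# 2. Only widen the search when the original query returns no results,
#    here by dropping words starting from the end of the query.
with_fallback = {"removeWordsIfNoResults": "lastWords"}

# 3. Supply your own list of words that a record does not have to contain.
with_optional_words = {"optionalWords": ["by", "but", "an", "the"]}
```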

I hope this is clear!

Ok, that makes sense. Thanks for the explanation.

I actually do have removeStopWords set to true, so given that “by” is a stop word, why is this still happening?

Hi @jacob2,

Stopwords are never removed when they’re a prefix too.
That’s because, as long as your query ends with “by”, your user may be searching for, e.g. “bycycle” (although there’s a typo, that’s just for the example).
So, in your query without a trailing space, “by” is a prefix, and as such it is not removed.
With a trailing space, it is not a prefix anymore, “by” becomes a stopword, and thus normally it should be removed.
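
The rule above can be sketched as a small Python toy. This is only an illustration of the behaviour described here, not the engine's actual code: `STOP_WORDS` is a tiny made-up list and `effective_terms` is a hypothetical helper.

```python
from typing import List, Tuple

# Tiny illustrative list; the real stop word lists are much longer.
STOP_WORDS = {"by", "but", "an", "the"}


def effective_terms(query: str) -> List[Tuple[str, bool]]:
    """Return the (word, is_prefix) pairs that remain after stop word removal.

    Assumption of this sketch: the last word counts as a prefix unless
    the query ends with a space, and a prefix is never removed.
    """
    words = query.lower().split()
    last_is_prefix = bool(words) and not query.endswith(" ")

    terms = []
    for i, word in enumerate(words):
        is_prefix = last_is_prefix and i == len(words) - 1
        if word in STOP_WORDS and not is_prefix:
            continue  # a complete stop word is dropped from the query
        terms.append((word, is_prefix))
    return terms


print(effective_terms("Spain by"))   # [('spain', False), ('by', True)]
print(effective_terms("Spain by "))  # [('spain', False)]
```

Without the trailing space, “by” survives as a prefix and the record must contain a word starting with it; with the trailing space, only “spain” is left to match.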

This is the normal behaviour of our search engine, and will usually get you relevant results.
I would not recommend it, but if you wish to disable prefix-matching, you can do so by setting queryType to the value prefixNone.

As an unrelated side-note, you may want to use removeStopWords=["en"] instead of removeStopWords=true if you only need English stopwords, because setting it to true removes all the stopwords from all the languages, which can be extremely greedy and break your relevance.
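
Written as search parameters, the two settings mentioned above would look roughly like this. These are plain Python dicts with made-up variable names, shown only to make the values explicit; how they are sent depends on your API client.

```python
# Restrict stop word removal to English instead of every supported language.
english_stop_words_only = {"removeStopWords": ["en"]}

# Disable prefix matching altogether (not recommended, as noted above).
no_prefix_matching = {"queryType": "prefixNone"}
```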

Best regards,
Joris Valette

Ok, thanks for the detailed explanation. I think I understand now.