Hyphenated attribute search: which approach is better?

I have read the article "Searching in Hyphenated Attributes" in the Algolia documentation and can think of these ways to index the data:

  • Remove all hyphens and add the result as a searchable attribute
  • Extract all hyphenated words and add them as synonyms in the settings

Which way has better performance?

Hi there,

Synonyms are added and optimized at indexing time, meaning you won’t perceive any performance difference. However, depending on what you do, this could affect relevance and lead to surprising or undesirable results.

As the documentation you linked suggests, the engine uses separators to tokenize the string, but the separators are then discarded. If you’re willing to let users search terms without the separators, or perform substring matching, the easiest way is to index alternatives and suffixes.
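To make "index alternatives and suffixes" a bit more concrete, here is a minimal Python sketch of how you might generate suffixes for a value before indexing it. The function name `collapsed_suffixes` and the `min_len` cutoff are made up for illustration and are not part of any Algolia API; the idea is only that the resulting strings would be stored in an extra searchable attribute.

```python
import re


def collapsed_suffixes(value, min_len=2):
    """Return the separator-free form of `value` plus its shorter suffixes.

    Hypothetical helper: hyphens and whitespace are stripped first, then
    every suffix of at least `min_len` characters is collected.
    """
    collapsed = re.sub(r"[-\s]+", "", value)
    return [collapsed[i:] for i in range(len(collapsed) - min_len + 1)]


# Suffixes for a hyphenated model code, longest first.
suffixes = collapsed_suffixes("CX-8")
```

Keeping `min_len` above 1 avoids indexing single-character suffixes, which would probably match far too many queries.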

Could you maybe share your use case and why you’re considering synonyms?

Hi,

We have a searchable attribute that holds car names such as Honda CR-V or Mazda CX-8. When users search for CRV, CR-V, CR V, or CR, we want the records for Honda CR-V to match. If we use synonyms, there will be thousands of them, so I think that could slow down the search.

Hi @thungthudh, you could add another attribute to your records such as ‘keywords’ (you can name it whatever you like), and add each variation to the keywords attribute. Then make that new attribute a searchable attribute. This will allow the record to be found for either of the variations:

"keywords": ["CRV", "CR-V", "CR V", "CR"]

You may also want to add the hyphen to separatorsToIndex so that the search engine indexes it instead of discarding it.
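With thousands of car names you probably don't want to write those variations by hand. Here is a rough Python sketch that derives them from the name at indexing time. The helper `hyphen_variants` and the record layout are assumptions for illustration; `searchableAttributes` and `separatorsToIndex` are the index setting names, but the values shown are only an example and the step that sends them to Algolia is left out.

```python
def hyphen_variants(name):
    """Build search variations for every hyphenated token in `name`.

    Hypothetical helper: for "CR-V" it yields the joined form, the
    original form, the space-separated form, and the first part.
    """
    variants = []
    for token in name.split():
        if "-" not in token:
            continue
        parts = [p for p in token.split("-") if p]
        if not parts:
            continue  # token was only hyphens
        for candidate in ("".join(parts), token, " ".join(parts), parts[0]):
            if candidate not in variants:
                variants.append(candidate)
    return variants


# Example record with the extra attribute filled in.
record = {"name": "Honda CR-V", "keywords": hyphen_variants("Honda CR-V")}

# Example settings (assumed values): make the new attribute searchable
# and keep the hyphen as an indexed separator.
settings = {
    "searchableAttributes": ["name", "keywords"],
    "separatorsToIndex": "-",
}
```

Since the variations live on the record itself, nothing has to be maintained in the synonyms list as new models are added.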