Hello! I apologize if I’m asking something incredibly obvious, but even after scouring the Internet thoroughly, I was unable to find an answer to my question.
I discovered Algolia a few days ago and absolutely love how user-friendly it has been so far! I’d like to propose using Algolia instead of Elasticsearch for future projects at my company, but before I can do that, I need to know how to perform one extremely common task that we handle in Elasticsearch for nearly all of our clients.
That task is finding search results that contain kanji by searching only for the kanji’s reading, written in either hiragana or katakana.
For example, a simple sentence such as 「今日はいい天気ですね。」 should be findable by searching for 「きょう」 or 「てんき」 (in hiragana) and 「キョウ」 or 「テンキ」 (in katakana), even though these characters are not present in the original sentence at all. In Elasticsearch, I believe we were leveraging the Kuromoji plugin to do this sort of substitution.
Is this sort of text processing possible in Algolia as well? I can think of two possibilities right now, both with their own drawbacks:
- Convert the entire string from kanji to kana before storing it in Algolia, and store each string in three separate variants: kanji, hiragana, and katakana. I wonder whether that would affect the accuracy of search results when a user searches for 「天気」 and 「きょう」 together (one term in kanji, one term in hiragana), since each of the three variants would have a different priority configured in Algolia, even though the priority should be the same.
- “Teach” Algolia the various kanji readings by uploading a massive list of synonyms. But would that really work?
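To make the first option more concrete, here is a rough Python sketch of the kind of record I have in mind. It assumes the hiragana reading of each sentence has already been produced by some morphological analyzer (that step is not shown), and the attribute names `text`, `text_hiragana`, and `text_katakana` as well as the helper `build_record` are hypothetical names I made up for illustration. Only the hiragana-to-katakana step is actually implemented, using the fixed offset between the two Unicode blocks.

```python
# Distance between the hiragana and katakana blocks in Unicode (0x60).
KANA_OFFSET = ord("ァ") - ord("ぁ")


def hiragana_to_katakana(text: str) -> str:
    """Shift hiragana characters (ぁ..ゖ) to katakana; leave all others as-is."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "ぁ" <= ch <= "ゖ" else ch
        for ch in text
    )


def build_record(object_id: str, text: str, reading_hiragana: str) -> dict:
    """Build one record holding the original text plus both kana variants.

    The attribute names are hypothetical. `reading_hiragana` is assumed to
    come from an external kanji-to-kana conversion step that is not shown.
    """
    return {
        "objectID": object_id,
        "text": text,
        "text_hiragana": reading_hiragana,
        "text_katakana": hiragana_to_katakana(reading_hiragana),
    }


record = build_record(
    "1",
    "今日はいい天気ですね。",
    "きょうはいいてんきですね。",  # assumed output of the analyzer
)
```

Each record would then carry all three variants, which is exactly where my worry about the variants being ranked differently comes from.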
Are there better options than these two? Thank you so much for answering!