🍞 Get records that may / may not have GLUTEN FREE but not just GLUTEN


#1

I’m developing a product that helps restaurants serve customers with food allergies. The mission is to hide foods that contain certain allergens from search results so that customers cannot see or order those items. To achieve this, I originally thought I should label each food item with boolean fields for the allergens it contains. For example, the 8 ingredients that cause 99% of allergic reactions are Egg, Fish, Milk, Peanut, Shellfish, Soy, Tree nuts, and Wheat, so each record would have the following fields:

image

But there are a couple of problems with this approach:

  • Food vendors don’t label foods this way
  • It’s time-consuming and costly to label foods this way, since I have over 240,000 food products
  • The approach breaks down if a new allergen needs to be flagged later (since all products would have to be reviewed for that allergen)

So I think the smarter and more flexible way to complete my mission is to use a search engine since:

  • My records already contain searchable ingredients, and
  • New allergens / unanticipated filtering constraints can be flexibly handled

I’ve already indexed the ~240k records in Algolia, where each record has a list of ingredients, and I’m able to exclude specific records from a search by using the advanced search syntax.

image

Now say a user wants to avoid gluten and dairy products:

image

  1. How can I structure a search to exclude records that contain gluten? That is, how do I show the records that may contain the phrase “gluten free” but that do not contain gluten, bread, wheat, barley, or rye?
  2. How can I chain together multiple queries like the one above to filter out multiple allergens, like say gluten AND dairy AND something else?

With advanced querying it is possible to search

vegetarian -gluten (give me vegetarian food without gluten… INCLUSIVE+EXCLUSIVE) or
-gluten (give me any food without gluten… SINGLE EXCLUSIVE)

But it is not possible to search

-gluten -soy -peanut (give me any food without gluten, soy or peanut… MULTIPLE EXCLUSIVE).

As documented here, the way to implement a MULTIPLE EXCLUSIVE search is to use faceted search for queries. So, my last question:

  3. How can I satisfy my first two questions in the event that the user wants a MULTIPLE EXCLUSIVE search? Would faceting still be able to handle this, and if so, how do I implement that?

If I can figure this out I will become a very happy Algolia customer.

EDIT
I came across this help page that offers a solution to my problem. Sadly, this solution requires that each allergen is known in advance and labeled within a dedicated allergen attribute, and as I said above, I have over 240k products, so labeling each would be costly. Can anyone suggest an alternative solution?


#2

Hi @aagostini,

Thanks for reaching out! From your description it sounds like your current records, simplified, have a string list of ingredients like:

{
    "name": "my meal",
    "ingredients":  "vegetables, walnuts, cheese, croutons"
}

I understand that at times there can be hesitation with regard to reshaping records and reindexing, due to the number of operations. But in this case it will be needed, because “advancedSyntax” will only get you as far as where you are now.

To achieve your INCLUSIVE+EXCLUSIVE or MULTIPLE EXCLUSIVE searches, you will have to enrich your records with more data, using either boolean fields (as you described) or _tags. Either would achieve the same goal.

For example, I’ll illustrate both in the sample records below:

[
    {
        "objectID": 1,
        "name": "my vegetarian, gluten free recipe",
        "_tags": ["vegetarian", "gluten_free"],
        "isVegetarian": true,
        "isGluten": false,
        "isSoy": false,
        "isPeanut": false
    },
    {
        "objectID": 2,
        "name": "my vegetarian, gluten recipe",
        "_tags": ["vegetarian", "gluten"],
        "isVegetarian": true,
        "isGluten": true,
        "isSoy": false,
        "isPeanut": false
    },
    {
        "objectID": 3,
        "name": "my vegetarian, gluten, soy, peanut recipe",
        "_tags": ["vegetarian", "gluten", "soy", "peanut"],
        "isVegetarian": true,
        "isGluten": true,
        "isSoy": true,
        "isPeanut": true
    },
    {
        "objectID": 4,
        "name": "my gluten, soy, peanut recipe",
        "_tags": ["gluten", "soy", "peanut"],
        "isVegetarian": false,
        "isGluten": true,
        "isSoy": true,
        "isPeanut": true
    }
]

You can apply the following Boolean Filters to achieve your goals. With the sample records above, each of them would return only objectID 1, the “vegetarian, gluten-free” recipe:

  • vegetarian -gluten
    => { "filters": "vegetarian AND NOT gluten" } – or –
    => { "filters": "isVegetarian=1 AND NOT isGluten=1" }

  • -gluten
    => { "filters": "NOT gluten" } – or –
    => { "filters": "NOT isGluten=1" }

  • -gluten -soy -peanut
    => { "filters": "NOT gluten AND NOT soy AND NOT peanut" } – or –
    => { "filters": "NOT isGluten=1 AND NOT isSoy=1 AND NOT isPeanut=1" }
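If the list of allergens to avoid comes from user input, the filter string can be assembled in code rather than written by hand. The following is only a minimal sketch of the _tags variant: `build_filter` and `matches` are hypothetical helper names, tags are assumed to be single tokens such as `gluten_free` (so no quoting is handled), and `matches` is just a local stand-in for the filter logic used to sanity-check it against the four sample records. It does not call Algolia.

```python
def build_filter(required_tags=(), excluded_tags=()):
    """Build a filters string such as 'vegetarian AND NOT gluten'.

    Assumption: every tag is a single token (e.g. 'gluten_free'),
    so no quoting or escaping is attempted here.
    """
    clauses = list(required_tags)
    clauses += ["NOT " + tag for tag in excluded_tags]
    return " AND ".join(clauses)


def matches(record_tags, required_tags=(), excluded_tags=()):
    """Local stand-in for the tag filter logic (not an Algolia call)."""
    tags = set(record_tags)
    has_required = all(tag in tags for tag in required_tags)
    has_excluded = any(tag in tags for tag in excluded_tags)
    return has_required and not has_excluded


# The four sample records, reduced to objectID -> _tags.
SAMPLE_TAGS = {
    1: ["vegetarian", "gluten_free"],
    2: ["vegetarian", "gluten"],
    3: ["vegetarian", "gluten", "soy", "peanut"],
    4: ["gluten", "soy", "peanut"],
}

avoid = ["gluten", "soy", "peanut"]

# Search parameters for the MULTIPLE EXCLUSIVE case.
search_params = {"filters": build_filter(excluded_tags=avoid)}

# Which sample records survive the same exclusion locally.
kept = [oid for oid, tags in SAMPLE_TAGS.items()
        if matches(tags, excluded_tags=avoid)]
```

Note that the tag `gluten_free` is a different token from `gluten`, so a record tagged `gluten_free` is not excluded by `NOT gluten`.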

I have a personal preference for “_tags” because of the cleaner syntax, but either way works.
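Since labeling 240k records by hand is costly, one option might be to derive the _tags from the ingredients string you already have, in a script that runs before reindexing. This is a rough sketch under stated assumptions, not a vetted allergen classifier: `ALLERGEN_TERMS`, `SAFE_PHRASES`, and `derive_tags` are hypothetical names, the gluten terms come from your question, and the dairy and tree nut terms are placeholders you would need to replace with a reviewed list.

```python
import re

# Hypothetical keyword map: allergen tag -> ingredient words that imply it.
# The gluten terms come from the question; the rest are placeholders.
ALLERGEN_TERMS = {
    "gluten": {"gluten", "wheat", "barley", "rye", "bread", "croutons"},
    "dairy": {"milk", "cheese", "butter", "cream"},
    "tree_nut": {"walnuts", "almonds"},
}

# Phrases that mention an allergen but signal its absence.
SAFE_PHRASES = ["gluten free", "gluten-free"]


def derive_tags(ingredients):
    """Return a sorted list of allergen tags found in an ingredients string.

    Naive exact-word matching: plurals, misspellings, and compound
    ingredients are not handled and would need a richer term list.
    """
    text = ingredients.lower()
    # Drop "free" phrases first so "gluten free" does not count as gluten.
    for phrase in SAFE_PHRASES:
        text = text.replace(phrase, " ")
    words = set(re.findall(r"[a-z]+", text))
    return sorted(tag for tag, terms in ALLERGEN_TERMS.items() if words & terms)


record = {"name": "my meal", "ingredients": "vegetables, walnuts, cheese, croutons"}
record["_tags"] = derive_tags(record["ingredients"])
```

A new allergen would then mean adding one entry to the map and re-running the script, rather than reviewing every product. For something as safety-critical as allergies, the derived tags should still be spot-checked by a person.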

I hope this helps. Let us know how it goes.


#3

Thanks so much for getting back to me! I’m going to think through this and see what I can do with my mountain of records. Will report back with anything good I find for the community.