🍞 Get records that may / may not have GLUTEN FREE but not just GLUTEN

I’m developing a product that helps restaurants serve customers with food allergies. The mission is to hide foods that contain certain allergens from search results so that customers cannot see or order those items. To achieve this I originally thought I should label each food item with boolean fields for the types of allergens it contains. So, for example, the 8 ingredients that cause 99% of food allergies are Egg, Fish, Milk, Peanut, Shellfish, Soy, Tree nuts, and Wheat, so each record would have the following fields:

image

But there are a couple of problems with this approach:

  • Food vendors don’t label foods this way
  • It’s time consuming / costly to label it this way since I have over 240,000 food products
  • The approach breaks down if a new allergen needs to be flagged later (since all products would have to be reviewed again for that allergen)

So I think the smarter and more flexible way to complete my mission is to use a search engine since:

  • My records already contain searchable ingredients, and
  • New allergens / unanticipated filtering constraints can be flexibly handled

I’ve already indexed the ~240k records in Algolia, where each record has a list of ingredients, and I’m able to exclude specific records from a search by using the advanced search syntax:

image

Now say a user wants to avoid gluten and dairy products:

image

  1. How can I structure a search to exclude records that contain gluten… that is, how do I show the records that may contain gluten free but that do not contain gluten, bread, wheat, barley, and rye?
  2. How can I chain together multiple queries like the one above to filter out multiple allergens, like say gluten AND dairy AND something else?

With advanced querying it is possible to search

vegetarian -gluten (give me vegetarian food without gluten… INCLUSIVE+EXCLUSIVE) or
-gluten (give me any food without gluten… SINGLE EXCLUSIVE)

But it is not possible to search

-gluten -soy -peanut (give me any food without gluten, soy or peanut… MULTIPLE EXCLUSIVE).

As documented here, the way to implement a MULTIPLE EXCLUSIVE search is to use faceted search for queries. So, my last question:

  3. How can I satisfy my first two questions in the event that the user wants a MULTIPLE EXCLUSIVE search? Would faceting still be able to handle this, and if so, how do I implement that?

If I can figure this out I will become a very happy Algolia customer.

EDIT
I came across this help page that offers a solution to my problem. Sadly this solution requires that each allergen is already pre-known and labeled within a dedicated allergen attribute and as I said above, I have over 240k products so labeling each would be costly. Can anyone suggest an alternative solution?

Hi @aagostini,

Thanks for reaching out! From your description it sounds like your current records, simplified, have a string list of ingredients like:

{
    "name": "my meal",
    "ingredients":  "vegetables, walnuts, cheese, croutons"
}

I understand that at times there can be hesitation with regards to reshaping records and reindexing, due to the number of operations. But in this case it will be needed because “advancedSyntax” will only get you to your current spot.

To achieve your INCLUSIVE+EXCLUSIVE or MULTIPLE EXCLUSIVE searches you will have to enrich your records with more data, using either boolean fields (as you described) or _tags. Both achieve the same goal.

For example, I’ll illustrate both in the sample records below:

[
    {
        "objectID": 1,
        "name": "my vegetarian, gluten free recipe",
        "_tags": ["vegetarian", "gluten_free"],
        "isVegetarian": true,
        "isGluten": false,
        "isSoy": false,
        "isPeanut": false
    },
    {
        "objectID": 2,
        "name": "my vegetarian, gluten recipe",
        "_tags": ["vegetarian", "gluten"],
        "isVegetarian": true,
        "isGluten": true,
        "isSoy": false,
        "isPeanut": false
    },
    {
        "objectID": 3,
        "name": "my vegetarian, gluten, soy, peanut recipe",
        "_tags": ["vegetarian", "gluten", "soy", "peanut"],
        "isVegetarian": true,
        "isGluten": true,
        "isSoy": true,
        "isPeanut": true
    },
    {
        "objectID": 4,
        "name": "my gluten, soy, peanut recipe",
        "_tags": ["gluten", "soy", "peanut"],
        "isVegetarian": false,
        "isGluten": true,
        "isSoy": true,
        "isPeanut": true
    }
]
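
Since labeling 240k products by hand is the sticking point, tags like the ones above could in principle be derived from the existing ingredient string before reindexing. The following is only a rough sketch of that idea: the keyword lists, the `SAFE_PHRASES` list, and the helper names `allergen_tags` and `enrich` are all hypothetical and would need review by someone who knows the catalog.

```python
import re

# Hypothetical keyword lists (assumption): a real vocabulary would need review.
ALLERGEN_KEYWORDS = {
    "gluten": ["gluten", "wheat", "barley", "rye", "bread", "croutons"],
    "dairy": ["milk", "cheese", "butter", "cream"],
}

# Phrases that mention an allergen but signal its absence (assumption).
SAFE_PHRASES = ["gluten free", "gluten-free", "dairy free", "dairy-free"]


def allergen_tags(ingredients: str) -> list[str]:
    """Return the sorted allergen tags whose keywords appear as whole words."""
    text = ingredients.lower()
    # Drop "X free" phrases first so "gluten free" does not count as gluten.
    for phrase in SAFE_PHRASES:
        text = text.replace(phrase, " ")
    words = set(re.findall(r"[a-z]+", text))
    return sorted(
        tag for tag, keywords in ALLERGEN_KEYWORDS.items() if words & set(keywords)
    )


def enrich(record: dict) -> dict:
    """Return a copy of the record with a derived `_tags` list added."""
    return {**record, "_tags": allergen_tags(record["ingredients"])}


meal = {"name": "my meal", "ingredients": "vegetables, walnuts, cheese, croutons"}
enriched = enrich(meal)
```

Whole-word matching like this misses plurals and compound words, and a product described as "gluten free bread" would still be tagged because of the word "bread", so treat the output as a first pass rather than a safety guarantee.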

You can apply the following Boolean filters to achieve your goals; each of them returns only objectID 1, the “vegetarian, gluten free” recipe:

  • vegetarian -gluten
    => { "filters": "vegetarian AND NOT gluten" } – or –
    => { "filters": "isVegetarian=1 AND NOT isGluten=1" }

  • -gluten
    => { "filters": "NOT gluten" } – or –
    => { "filters": "NOT isGluten=1" }

  • -gluten -soy -peanut
    => { "filters": "NOT gluten AND NOT soy AND NOT peanut" } – or –
    => { "filters": "NOT isGluten=1 AND NOT isSoy=1 AND NOT isPeanut=1" }

I have a personal preference for “_tags” because of the cleaner syntax, but either way works.
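
If the allergens to avoid come from user input, the `filters` string in the `_tags` form shown above can be assembled programmatically. This is a minimal sketch; `build_filters` is a hypothetical helper, not part of any Algolia client, and it assumes tag values contain no spaces (values with spaces would need quoting, which is not handled here).

```python
from typing import Sequence


def build_filters(exclude: Sequence[str], require: Sequence[str] = ()) -> str:
    """Join required tags and negated tags with AND, as in the examples above."""
    clauses = list(require) + [f"NOT {tag}" for tag in exclude]
    return " AND ".join(clauses)


# MULTIPLE EXCLUSIVE: any food without gluten, soy or peanut.
multiple_exclusive = {"filters": build_filters(["gluten", "soy", "peanut"])}

# INCLUSIVE+EXCLUSIVE: vegetarian food without gluten.
inclusive_exclusive = {"filters": build_filters(["gluten"], require=["vegetarian"])}
```

The resulting dictionaries have the same shape as the `{ "filters": ... }` parameters listed above, so they can be passed along with whatever query text the user typed.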

I hope this helps. Let us know how it goes.

Thanks so much for getting back to me! I’m going to think through this and see what I can do with my mountain of records. Will report back with anything good I find for the community.

Hey Ajay,

Is there a way to do set algebra on Algolia search results? For instance, can Algolia subtract a set of search results A from a set of search results B, so that I’d get a set C containing all of the records that were in B but not in A? If so, that would completely solve my enrichment problem. For example, I would get all of my “Gluten Free” records with the following set algebra:

I’d imagine that calculating the records in that manner would involve executing 7 searches simultaneously, doing the set algebra once each of the 7 resulting sets became available, then finally subtracting those from the universe set.

And for milk free foods:

This method of getting specific records in Algolia would be much better than having to do offline enrichment since it’s really flexible to new rules and such. After all the records already contain everything one would need to know to spot a food allergy. If Algolia could return records for me this way it would be simple to add rules to filter out records for Celery, Crustaceans, Eggs, Fish, Lupin, Mollusks, Mustard, Peanuts, Sesame Seeds, Soy, Sulfur Dioxide/Sulfites, or Tree Nut allergies.

Is something like that possible? If not, do you guys have any roadmap for adding that functionality? If not, what amount of money would I have to give you to develop it?

Hey @aagostini ,

Thanks for coming back with the detail and images - the goal is clear and makes sense.

As you already explained, you can achieve your goal with Algolia by performing multiple searches. In the case of your examples, you perform X searches with Algolia => track records locally => then perform diffs to get your final results set. However, there is no plan to add a layer of functionality like set theory (e.g., “set algebra”) because it can already be done, albeit manually!
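
The manual approach described above (run the searches, track the records locally, then diff) might look roughly like the sketch below. The search call is deliberately injected as a plain function: `fake_search_ids` and `FAKE_INDEX` are stand-ins invented for illustration, and a real version would call the search API and collect objectIDs from every page of results. The per-term exemption for "gluten free" is also an assumption about the intended rule.

```python
from typing import Callable

# Stand-in data (assumption): objectID -> ingredient text.
FAKE_INDEX = {
    "1": "rice, vegetables",
    "2": "wheat flour, egg",
    "3": "barley, water",
    "4": "gluten free oats",
}


def fake_search_ids(term: str) -> set[str]:
    """Stand-in for a search call: IDs of records whose text contains the term."""
    return {object_id for object_id, text in FAKE_INDEX.items() if term in text}


def allowed_ids(
    universe: set[str],
    blocked_terms: dict[str, list[str]],
    search_ids: Callable[[str], set[str]],
) -> set[str]:
    """Subtract every blocked term's hits from the universe set.

    Each blocked term maps to phrases that exempt a hit, so a record matching
    "gluten" only through "gluten free" is not counted as containing gluten.
    """
    blocked: set[str] = set()
    for term, exemptions in blocked_terms.items():
        hits = search_ids(term)
        for phrase in exemptions:
            hits = hits - search_ids(phrase)
        blocked |= hits
    return universe - blocked


gluten_rules = {"gluten": ["gluten free"], "wheat": [], "barley": [], "rye": []}
gluten_free = allowed_ids(set(FAKE_INDEX), gluten_rules, fake_search_ids)
```

One caveat with this rule: a record that says "gluten free" and also lists gluten elsewhere would be exempted, so the exemption logic needs care before it is used for anything safety-related.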

As for development, we don’t do in-house development; however, you can consider two approaches if you think your implementation is reaching an advanced state where you would like more guidance:

  • Enterprise plan - You still build it all yourself, with hands-on coding by you and your team, but an Enterprise plan gets you onboarding and technical guidance specific to your implementation from a Solutions Engineer. If interested, feel free to write in to support@algolia.com and we’ll be happy to connect you with the right people.
  • Partners - Look for an Algolia partner agency to build it for you here. Depending on the features you need, it still may get you into the Enterprise plan, but that’s TBD.

As always, hope the above helps!