Excluding Negated Results

Hello there, Algolia Community!

I have an index made up of NAICS codes and their descriptions. NAICS is an extensive business classification system (~20k codes/descriptions) developed by the US Census Bureau.

The descriptions follow a common convention: certain operations are excluded with the keyword “except”.

This means that if I search my index for “Soybean Farming” I get back two records:

  1. “Soybean farming, field and seed production”
  2. “Oilseed farming (except soybean), field and seed production”

I want to exclude #2 if a keyword is explicitly negated by the “except” keyword.

Anyone ever encounter a similar scenario? Any pointers for me?

Thank you in advance!

-Mike

Hello Mike,

As of today, we do not have a way to do this out of the box, so you will need to process the data on your side before uploading it to Algolia.
This approach only works if the descriptions are easy to parse (i.e. you can extract the “excepted” elements fairly easily).

If I were you, I would create records with a structure like this:

[
    {
        "name": "Soybean farming, field and seed production",
        "except": []
    },
    {
        "name": "Oilseed farming (except soybean), field and seed production",
        "except": ["soybean"]
    }
]
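
As a rough sketch of the pre-processing step, the excepted terms could be pulled out of each description with a regular expression before indexing. The helper names `extractExceptTerms` and `toRecord` are hypothetical, and the code assumes the exceptions always appear as "(except ...)" in parentheses, with multiple terms separated by commas, "and", or "or". Real NAICS descriptions may need more careful handling.

```javascript
// Hypothetical helper: collect the terms listed inside "(except ...)".
// Assumes exceptions are always written in parentheses, as in the example above.
function extractExceptTerms(description) {
  const terms = [];
  const pattern = /\(except ([^)]*)\)/gi;
  for (const match of description.matchAll(pattern)) {
    // Split lists such as "rice, wheat and soybean" into separate terms.
    for (const part of match[1].split(/,|\band\b|\bor\b/i)) {
      const term = part.trim().toLowerCase();
      if (term) terms.push(term);
    }
  }
  return terms;
}

// Hypothetical helper: build the record shape suggested above.
function toRecord(description) {
  return { name: description, except: extractExceptTerms(description) };
}
```

Note that a multi-word exception stays a single phrase in this sketch, so it would only be matched by a filter on that exact phrase, not on its individual words.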

When a user makes a query, you will then have to split it into individual words and build a search with a filter that excludes each of them:

index.search({
  query: 'Soybean Farming',
  filters: 'NOT except:soybean AND NOT except:farming'
})

This will ensure that the second record is not shown. Note that for it to work, you will have to add the except attribute to attributesForFaceting in your index settings (for example as filterOnly(except) if you only need it for filtering).
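
Building the filter string by hand for every query gets tedious, so here is a minimal sketch of how it might be generated from the user's input. The helper names `buildExceptFilter` and `searchWithExceptions` are hypothetical, and the `index.search` call assumes the same single-object form used in the example above. Values are quoted so that query words such as "and" or "not" are not read as filter operators.

```javascript
// Hypothetical helper: turn each query word into a NOT except:"word" clause.
function buildExceptFilter(query) {
  const words = query
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on anything that is not a letter or digit
    .filter((word) => word.length > 0);
  const unique = [...new Set(words)];
  return unique.map((word) => `NOT except:"${word}"`).join(' AND ');
}

// Hypothetical wrapper: assumes `index` exposes search(params) as shown above.
function searchWithExceptions(index, query) {
  const params = { query };
  const filters = buildExceptFilter(query);
  if (filters) params.filters = filters; // skip the filter for empty queries
  return index.search(params);
}
```

For the query 'Soybean Farming', this produces the same two exclusions as the hand-written filter, just with quoted values.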

I hope this answers your question. Let me know if you need more help!
