Searching for multiple matches in a single item at once

Hi, we are looking for a way to use Algolia for a use case that doesn’t seem to be supported, so I’m seeking advice. Maybe someone has already had a similar use case and found a solution.

About the data

We manage a catalog of products that are “connectors” between cylindrical tubes.
The slots for these tubes have a diameter.
Any product can have up to 10 slots.

Here’s a reduced test case of data, with 3 products having 3 slots each, and 1 product having 4 slots:

[
  {
    "objectID": "product-1",
    "D1": 4.75,
    "D2": 8,
    "D3": 4.75
  },
  {
    "objectID": "product-2",
    "D1": 8,
    "D2": 4.75,
    "D3": 8
  },
  {
    "objectID": "product-3",
    "D1": 8,
    "D2": 4.75,
    "D3": 11.2
  },
  {
    "objectID": "product-4",
    "D1": 3,
    "D2": 4.75,
    "D3": 11.2,
    "D4": 2.8
  }
]

About the search

People using the search would like to find products to connect several tubes.

Each tube has a diameter.

When searching for a connector for a given number of tubes, diameters must match, and products with more slots are fine too.

Let’s say for example we want to perform these searches:

  1. find any product that has a 11.2mm slot
  2. find any product that has a 8mm slot, and a 4.75mm slot
  3. find any product that has a 8mm slot, a 4.75mm slot, and a second 4.75mm slot
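To make the expected results explicit, here is a minimal reference sketch in plain Python, outside of Algolia. It treats each product and each search as a multiset of diameters; the `PRODUCTS` dict and the `matches` and `search` names are only illustrative, not part of any Algolia API.

```python
from collections import Counter

# Reduced test case from above: slot diameters per product.
PRODUCTS = {
    "product-1": [4.75, 8, 4.75],
    "product-2": [8, 4.75, 8],
    "product-3": [8, 4.75, 11.2],
    "product-4": [3, 4.75, 11.2, 2.8],
}

def matches(slots, tubes):
    """True if every requested tube diameter can use its own slot."""
    available = Counter(slots)
    needed = Counter(tubes)
    # A product may have extra slots, but never fewer of a given diameter.
    return all(available[diameter] >= count for diameter, count in needed.items())

def search(tubes):
    return [pid for pid, slots in PRODUCTS.items() if matches(slots, tubes)]

print(search([11.2]))           # ['product-3', 'product-4']
print(search([8, 4.75]))        # ['product-1', 'product-2', 'product-3']
print(search([8, 4.75, 4.75]))  # ['product-1']
```

This is the behaviour we would like to get from an Algolia filter.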

Attempt 0: keep data as is

Data

multiple-dimensions-0.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-0-json

Searches

Simple for search #1: {"filters":"D1=11.2 OR D2=11.2 OR D3=11.2 OR D4=11.2"}

But complex even for only 2 dimensions. For search #2: {"filters":"(D1=8 AND (D2=4.75 OR D3=4.75 OR D4=4.75)) OR (D2=8 AND (D1=4.75 OR D3=4.75 OR D4=4.75)) OR (D3=8 AND (D1=4.75 OR D2=4.75 OR D4=4.75)) OR (D4=8 AND (D1=4.75 OR D2=4.75 OR D3=4.75))"}

And Algolia does not support it: "filters: filter (X AND Y) OR Z is not allowed, only (X OR Y) AND Z is allowed."

See https://www.algolia.com/doc/api-reference/api-parameters/filters/#boolean-operators

Attempt 1: all diameters in one single data structure

Idea: the order of slot diameters (D1, D2, D3, etc.) is not important.

Data

multiple-dimensions-1.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-1-json

[
  {
    "objectID": "product-1",
    "D": [8, 4.75, 4.75]
  },
  {
    "objectID": "product-2",
    "D": [4.75, 8, 8]
  },
  {
    "objectID": "product-3",
    "D": [8, 4.75, 11.2]
  },
  {
    "objectID": "product-4",
    "D": [3, 4.75, 2.8, 11.2]
  }
]

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D=11.2"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"D=8 AND D=4.75"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.75 | {"filters":"D=8 AND D=4.75 AND D=4.75"} | :+1: | :x: | :x: | |

Legend:

  • :+1:: value returned as expected
  • :x:: returned value that should not be returned

This error is expected: nothing indicates that the two D=4.75 conditions must match “different” slots.

Attempt 2: compute possible combinations in several objects in a single field

Idea: Keep numbered slots, but compute all possible combinations in the index

Data

multiple-dimensions-2.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-2-json

[
  {
    "objectID": "product-1",
    "D": [
      {
        "D1": 8,
        "D2": 4.75,
        "D3": 4.75
      },
      {
        "D1": 4.75,
        "D2": 8,
        "D3": 4.75
      },
      {
        "D1": 4.75,
        "D2": 4.75,
        "D3": 8
      }
    ]
  },
  {
    "objectID": "product-2",
    "D": [
      {
        "D1": 4.75,
        "D2": 8,
        "D3": 8
      },
      {
        "D1": 8,
        "D2": 4.75,
        "D3": 8
      },
      {
        "D1": 8,
        "D2": 8,
        "D3": 4.75
      }
    ]
  },
  …
]

In theory, the number of combinations is the factorial of the number of dimensions.

product-4 has 24 possible combinations. But here product-1 and product-2 each have two identical values, so they have half as many combinations (3 instead of 6).
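As a quick check of these numbers, the count of distinct orderings is n! divided by the factorial of the multiplicity of each repeated value. The sketch below enumerates them and also computes the count directly; the function names are only illustrative.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def distinct_orderings(slots):
    """All distinct ways to assign the diameters to D1, D2, D3, ..."""
    return sorted(set(permutations(slots)))

def count_orderings(slots):
    """n! divided by the factorial of each value's multiplicity."""
    repeats = Counter(slots).values()
    return factorial(len(slots)) // prod(factorial(n) for n in repeats)

print(distinct_orderings([8, 4.75, 4.75]))
# [(4.75, 4.75, 8), (4.75, 8, 4.75), (8, 4.75, 4.75)]
print(count_orderings([8, 4.75, 4.75]))       # 3
print(count_orderings([3, 4.75, 11.2, 2.8]))  # 24
```

With 10 distinct slot diameters, this would already be 10! = 3,628,800 combinations for a single product.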

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D.D1=11.2"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"D.D1=8 AND D.D2=4.75"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.75 | {"filters":"D.D1=8 AND D.D2=4.75 AND D.D3=4.75"} | :+1: | :x: | :x: | |

Legend:

  • :+1:: value returned as expected
  • :x:: returned value that should not be returned

The issue here, with product-2 for example, is that D.D2=4.75 matches on { "D1": 8, "D2": 4.75, "D3": 8 } while D.D3=4.75 matches on the different { "D1": 8, "D2": 8, "D3": 4.75 }, but Algolia selects product-2 as matching anyway.

Attempt 3: compute possible combinations in several objects in dedicated fields

Data

multiple-dimensions-3.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-3-json

[
  {
    "objectID": "product-1",
    "DC1": {
      "D1": 8,
      "D2": 4.75,
      "D3": 4.75
    },
    "DC2": {
      "D1": 4.75,
      "D2": 8,
      "D3": 4.75
    },
    "DC3": {
      "D1": 4.75,
      "D2": 4.75,
      "D3": 8
    }
  },
  …
]

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"DC1.D1=11.2 OR DC2.D1=11.2 OR DC3.D1=11.2 OR DC4.D1=11.2"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"(DC1.D1=8 AND DC1.D2=4.75) OR (DC2.D1=8 AND DC2.D2=4.75) OR (DC3.D1=8 AND DC3.D2=4.75) OR (DC4.D1=8 AND DC4.D2=4.75) OR …"} | error | | | |
| 8.0, 4.75 and 4.75 | {"filters":"(DC1.D1=8 AND DC1.D2=4.75 AND DC1.D3=4.75) OR …"} | error | | | |

Search #2 would need 24 times (DCn.D1=8 AND DCn.D2=4.75), search #3 would need 24 times (DCn.D1=8 AND DCn.D2=4.75 AND DCn.D3=4.75).

It’s not supported by Algolia anyway, as in attempt #0.

Attempt 4: keep separate values, but order them by increasing value

Data

Much less data to manage: as much as the source.

multiple-dimensions-4.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-4-json

[
  {
    "objectID": "product-1",
    "D1": 4.75,
    "D2": 4.75,
    "D3": 8
  },
  …
]

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D1=11.2 OR D2=11.2 OR D3=11.2 OR D4=11.2"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"(D1=4.75 AND (D2=8 OR D3=8 OR D4=8)) OR (D2=4.75 AND (D3=8 OR D4=8)) OR (D3=4.75 AND D4=8)"} | error | | | |
| 8.0, 4.75 and 4.75 | {"filters":"…"} | error | | | |

The search string is a little simpler because values are ordered. For search #2, there is no need to check D1 and D2 for the 8 value if D3=4.75 matches.

But the same Algolia limitation applies to such a mix of AND and OR.

Attempt 5: combinations in strings, with increasing values

Data

multiple-dimensions-5.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-5-json

[
  {
    "objectID": "product-1",
    "D": "ø4.75øø4.75øø8ø"
  },
  {
    "objectID": "product-2",
    "D": "ø4.75øø8øø8ø"
  },
  {
    "objectID": "product-3",
    "D": "ø4.75øø8øø11.2ø"
  },
  {
    "objectID": "product-4",
    "D": "ø2.8øø3øø4.75øø11.2ø"
  }
]

We use ø as a separator to prevent Algolia from splitting the string, as it would on punctuation.

More compact format, but a lot less readable.

Searches

With SQL, we could use LIKE with % to match parts of the string.

But Algolia does not know how to search inside a string, even less with other characters in between.

For example, it is impossible to search for the strings ø3ø and ø11.2ø inside ø2.8øø3øø4.75øø11.2ø

See the docs:

Attempt 6?

We have no more ideas, unfortunately.

Any help is welcome. :pray:

Attempt 6: number of values with the value itself

To prevent the issue with multiple equal values we had in attempt #1, we store the number of occurrences together with the value, like "2-4.75". To allow people to search for only one slot of this value, we also add "1-4.75".

This solution was provided by @sylvain.huprelle from Algolia, I forgot about it… :man_facepalming:

Data

multiple-dimensions-6.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-6-json

[
  {
    "objectID": "product-1",
    "D": ["1-8", "1-4.75", "2-4.75"]
  },
  {
    "objectID": "product-2",
    "D": ["1-4.75", "1-8", "2-8"]
  },
  {
    "objectID": "product-3",
    "D": ["1-8", "1-4.75", "1-11.2"]
  },
  {
    "objectID": "product-4",
    "D": ["1-3", "1-4.75", "1-2.8", "1-11.2"]
  }
]
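For reference, a minimal sketch of how this `D` array could be generated at indexing time from the raw slot diameters. The `encode_slots` name is only illustrative, and the `:g` format is an assumption chosen so that 8 is written "8" rather than "8.0".

```python
from collections import Counter

def encode_slots(diameters):
    """Turn slot diameters into "n-value" tags: a value present k times
    produces the tags 1-value, 2-value, ..., k-value."""
    tags = []
    for diameter, count in Counter(diameters).items():
        for n in range(1, count + 1):
            tags.append(f"{n}-{diameter:g}")
    return tags

print(encode_slots([8, 4.75, 4.75]))       # ['1-8', '1-4.75', '2-4.75']
print(encode_slots([3, 4.75, 2.8, 11.2]))  # ['1-3', '1-4.75', '1-2.8', '1-11.2']
```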

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D:'1-11.2'"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4.75'"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4.75'"} | :+1: | | | |

It works!
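The filter strings for this attempt can be built mechanically from the list of tube diameters: count each requested diameter, and ask for the tag carrying that count. A minimal sketch, where `build_filter` and its `attribute` parameter are hypothetical names and only the resulting string is sent to Algolia:

```python
from collections import Counter

def build_filter(tubes, attribute="D"):
    """One clause per distinct diameter, using the number of times it is requested."""
    clauses = [
        f"{attribute}:'{count}-{diameter:g}'"
        for diameter, count in Counter(tubes).items()
    ]
    return " AND ".join(clauses)

print(build_filter([11.2]))           # D:'1-11.2'
print(build_filter([8, 4.75]))        # D:'1-8' AND D:'1-4.75'
print(build_filter([8, 4.75, 4.75]))  # D:'1-8' AND D:'2-4.75'
```

Only AND is used, so the filter stays within what Algolia accepts.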

So, it works for such fixed values for slot diameters.

But! What we need now is to be able to deal with two kinds of ranges:

  • if the users search for a 7 to 9mm range, they must find the 8mm slot
  • the actual products have slots that can accept a range of tubes, so for example a 4.6 to 4.8mm range for the slot can accept a 4.75mm tube
  • if the users search for a 4.4 to 4.7mm range, they should find the product with this 4.6 to 4.8mm range slot, as there’s an overlap
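All three cases reduce to the same test: two intervals are compatible when they overlap, a discrete value being an interval whose min and max are equal. A minimal sketch of that rule, assuming ranges are written as { "min": …, "max": … } objects (the function names are illustrative):

```python
def as_range(value):
    """Normalise a discrete value or a {"min", "max"} dict to a (min, max) pair."""
    if isinstance(value, dict):
        return value["min"], value["max"]
    return value, value

def compatible(slot, wanted):
    """True if the slot interval and the searched interval overlap."""
    slot_min, slot_max = as_range(slot)
    wanted_min, wanted_max = as_range(wanted)
    return slot_min <= wanted_max and wanted_min <= slot_max

print(compatible(8, {"min": 7, "max": 9}))                             # True
print(compatible({"min": 4.6, "max": 4.8}, 4.75))                      # True
print(compatible({"min": 4.6, "max": 4.8}, {"min": 4.4, "max": 4.7}))  # True
print(compatible({"min": 4.6, "max": 4.8}, 5))                         # False
```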

With ranges, a string like '2-4.75' is not possible anymore.

Let’s try to deal with discrete search values with ranges in slot diameters first.

Data

[
  {
    "objectID": "product-1",
    "D1": { "min": 4.7, "max": 4.8 },
    "D2": 8,
    "D3": 4.75
  },
  {
    "objectID": "product-2",
    "D1": 8,
    "D2": { "min": 4.7, "max": 4.8 },
    "D3": { "min": 7.5, "max": 8.5 }
  },
  {
    "objectID": "product-3",
    "D1": 8,
    "D2": 4.75,
    "D3": 11.2
  },
  {
    "objectID": "product-4",
    "D1": 3,
    "D2": 4.75,
    "D3": 11.2,
    "D4": 2.8
  }
]

Attempt 7: store rounded integers of discrete and range values in strings

We can’t store all the values a range covers, so we round each value down to an integer.

For example, the { "min": 7.5, "max": 8.5 } range gives "1-7" and "1-8".

Data

[
  {
    "objectID": "product-1",
    "D1": { "min": 4.7, "max": 4.8 },
    "D2": 8,
    "D3": 4.75,
    "D": ["1-4", "2-4", "1-8"]
  },
  {
    "objectID": "product-2",
    "D1": 8,
    "D2": { "min": 4.7, "max": 4.8 },
    "D3": { "min": 7.5, "max": 8.5 },
    "D": ["1-8", "2-8", "1-4", "1-7"]
  },
  {
    "objectID": "product-3",
    "D1": 8,
    "D2": 4.75,
    "D3": 11.2,
    "D": ["1-8", "1-4", "1-11"]
  },
  {
    "objectID": "product-4",
    "D1": 3,
    "D2": 4.75,
    "D3": 11.2,
    "D4": 2.8,
    "D": ["1-3", "1-4", "1-2", "1-11"]
  }
]

Searches

The search should return at least the right products, maybe more, which need to be filtered out in the front-end (legend :hourglass_flowing_sand:).

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D:'1-11'"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4'"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4'"} | :+1: | | | |
| 8.2, 4 | {"filters":"D:'1-8' AND D:'1-4'"} | :hourglass_flowing_sand: | :hourglass_flowing_sand: | :hourglass_flowing_sand: | |
| 8.2, 4.75 | {"filters":"D:'1-8' AND D:'1-4'"} | :hourglass_flowing_sand: | :+1: | :hourglass_flowing_sand: | |

It “works”, but filtering in the front-end means we lose the accuracy of facet counts and pagination.
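For illustration, the front-end filtering step could look like the sketch below: it checks whether each searched diameter can be given its own slot, trying every assignment by brute force. This is only a sketch under the assumption that hits still carry their raw slots (numbers or min/max objects); `accepts` and `really_matches` are hypothetical names.

```python
from itertools import permutations

def accepts(slot, tube):
    """A range slot accepts any tube within its bounds, a discrete slot only its exact value."""
    if isinstance(slot, dict):
        return slot["min"] <= tube <= slot["max"]
    return slot == tube

def really_matches(slots, tubes):
    """True if the tubes can each be assigned to a different accepting slot."""
    return any(
        all(accepts(slot, tube) for slot, tube in zip(chosen, tubes))
        for chosen in permutations(slots, len(tubes))
    )

product_1 = [{"min": 4.7, "max": 4.8}, 8, 4.75]
product_2 = [8, {"min": 4.7, "max": 4.8}, {"min": 7.5, "max": 8.5}]

print(really_matches(product_1, [8.2, 4.75]))  # False: to be filtered out
print(really_matches(product_2, [8.2, 4.75]))  # True
```

Brute force stays cheap here because a product has at most 10 slots and searches involve only a few tubes.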

Rounding values to 1/10th instead of integers would generate much more data, but less “bad” results.

Attempt 8: store rounded 1/10th of discrete and range values in strings

For example, the { "min": 7.5, "max": 8.5 } range gives "1-7.5", "1-7.6", "1-7.7", "1-7.8", "1-7.9", "1-8", "1-8.1", "1-8.2", "1-8.3", "1-8.4" and "1-8.5".

Data

[
  {
    "objectID": "product-1",
    "D1": { "min": 4.7, "max": 4.8 },
    "D2": 8,
    "D3": 4.75,
    "D": ["1-4.7", "2-4.7", "1-4.8", "1-8"]
  },
  {
    "objectID": "product-2",
    "D1": 8,
    "D2": { "min": 4.7, "max": 4.8 },
    "D3": { "min": 7.5, "max": 8.5 },
    "D": [
      "1-8",
      "1-4.7",
      "1-4.8",
      "1-7.5",
      "1-7.6",
      "1-7.7",
      "1-7.8",
      "1-7.9",
      "2-8",
      "1-8.1",
      "1-8.2",
      "1-8.3",
      "1-8.4",
      "1-8.5"
    ]
  },
  {
    "objectID": "product-3",
    "D1": 8,
    "D2": 4.75,
    "D3": 11.2,
    "D": ["1-8", "1-4.7", "1-11.2"]
  },
  {
    "objectID": "product-4",
    "D1": 3,
    "D2": 4.75,
    "D3": 11.2,
    "D4": 2.8,
    "D": ["1-3", "1-4.7", "1-2.8", "1-11.2"]
  }
]
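A minimal sketch of how these `D` arrays could be computed, for both the integer buckets of attempt #7 and the 1/10th buckets used here. It assumes values are rounded down to the bucket size, which is what the data above shows (4.75 gives "4.7"); `bucket_labels`, `encode_slots` and the `step` parameter are illustrative names.

```python
from collections import Counter
from decimal import Decimal, ROUND_FLOOR

def bucket_labels(slot, step="0.1"):
    """Labels of every bucket touched by a slot (a number or a min/max dict)."""
    size = Decimal(step)
    low, high = (slot["min"], slot["max"]) if isinstance(slot, dict) else (slot, slot)
    # Decimal(str(x)) avoids binary floating-point surprises when dividing by 0.1.
    first = int((Decimal(str(low)) / size).to_integral_value(rounding=ROUND_FLOOR))
    last = int((Decimal(str(high)) / size).to_integral_value(rounding=ROUND_FLOOR))
    return [f"{float(i * size):g}" for i in range(first, last + 1)]

def encode_slots(slots, step="0.1"):
    """Count how many slots touch each bucket, then emit 1-label ... k-label tags."""
    counts = Counter(label for slot in slots for label in bucket_labels(slot, step))
    return [f"{n}-{label}" for label, count in counts.items() for n in range(1, count + 1)]

product_1 = [{"min": 4.7, "max": 4.8}, 8, 4.75]
print(encode_slots(product_1))            # ['1-4.7', '2-4.7', '1-4.8', '1-8']
print(encode_slots(product_1, step="1"))  # ['1-4', '2-4', '1-8']
```

A smaller `step` gives fewer false positives but more tags per product.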

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11.2 | {"filters":"D:'1-11.2'"} | | | :+1: | :+1: |
| 8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4.7'"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4.7'"} | :+1: | | | |
| 8.2, 4 | {"filters":"D:'1-8.2' AND D:'1-4'"} | | | | |
| 8.2, 4.75 | {"filters":"D:'1-8.2' AND D:'1-4.7'"} | | :+1: | | |
| 8, 4.77 | {"filters":"D:'1-8' AND D:'1-4.7'"} | :+1: | :+1: | :hourglass_flowing_sand: | |

This is much better. We could use a 1/100th rounding to limit even more, if data volume is not an issue.

Let’s try with users looking for ranges.

About the data

Same as before

Attempt 9: store rounded integers of discrete and range values

Data

Same as attempt #7

Searches

| search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
|---|---|---|---|---|---|
| 11-11.4 | {"filters":"D:'1-11'"} | | | :+1: | :+1: |
| 8.0 and 4.5-5 | {"filters":"D:'1-8' AND (D:'1-4' OR D:'1-5')"} | :+1: | :+1: | :+1: | |
| 8.0, 4.75 and 4.5-5 | {"filters":"D:'1-8' AND (D:'2-4' OR (D:'1-4' AND D:'1-5'))"} | error | | | |
| 8.1-8.5, 4 | {"filters":"D:'1-8' AND D:'1-4'"} | :hourglass_flowing_sand: | :hourglass_flowing_sand: | :hourglass_flowing_sand: | |
| 8.2, 8.1-8.5 and 4.75 | {"filters":"D:'2-8' AND D:'1-4'"} | | :hourglass_flowing_sand: | | |
| 7.1-9.5 | {"filters":"D:'1-7' OR D:'1-8' OR D:'1-9'"} | :+1: | :+1: | :+1: | |

We once again get the error because {"filters":"D:'1-8' AND (D:'2-4' OR (D:'1-4' AND D:'1-5'))"} mixes OR and AND in a way Algolia doesn’t support.