Intersection-like search

Hi there! I am new to this forum.

There is a question I have been thinking about for several weeks.

Assume you have an app with a predefined list of products. Each user can select from these products and build up a set of their own.

This means that every user has a set of products. The question is whether it is possible to search for several products at once and receive the sets that have the most products in common with the query, i.e. the sets whose intersection with the query has the highest cardinality.

Example:

set_of_user_1: ['product_1', 'product_2', 'product_3'],
set_of_user_2: ['product_3', 'product_5', 'product_9'],
set_of_user_3: ['product_4', 'product_7', 'product_9']

Then I want to search for:

['product_3', 'product_2'] 

(is this possible?)
which should return

['set_of_user_1', 'set_of_user_2']

in this order. I am hoping there is a solution for this.
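
To make the desired behaviour concrete, here is a minimal plain-Python sketch of the ranking I have in mind. It does not use any search engine; the function name `rank_sets_by_overlap` is made up for illustration, and I am assuming that sets with no product in common should be dropped from the result.

```python
def rank_sets_by_overlap(user_sets, query):
    """Return set names ordered by how many queried products they contain."""
    wanted = set(query)
    # Size of the intersection between each user's set and the query.
    overlap = {
        name: len(wanted & set(products))
        for name, products in user_sets.items()
    }
    # Keep only sets sharing at least one product with the query.
    matching = [name for name in user_sets if overlap[name] > 0]
    # Largest overlap first; sorted() is stable, so ties keep insertion order.
    return sorted(matching, key=lambda name: overlap[name], reverse=True)


user_sets = {
    "set_of_user_1": ["product_1", "product_2", "product_3"],
    "set_of_user_2": ["product_3", "product_5", "product_9"],
    "set_of_user_3": ["product_4", "product_7", "product_9"],
}

result = rank_sets_by_overlap(user_sets, ["product_3", "product_2"])
print(result)  # ['set_of_user_1', 'set_of_user_2']
```

set_of_user_1 shares two products with the query and set_of_user_2 shares one, which gives the order above.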

Regards,
LJ1001

Hi,

I think the best way would be to precompute this data in your backend before you store it. Something like

users_per_product: {
    'product_2': ['set_of_user_1'],
    'product_3': ['set_of_user_1', 'set_of_user_2'],
}

Then you should be able to use users_per_product as a disjunctive facet.
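
As a rough sketch of that precomputation step, assuming your backend holds the sets as a simple mapping from set name to product list, the inversion could look like this in Python. The helper names `build_users_per_product` and `search` are hypothetical, and `search` only imitates an OR over products locally; it is not the search API itself.

```python
from collections import Counter, defaultdict


def build_users_per_product(user_sets):
    """Invert {set: [products]} into {product: [sets]}."""
    index = defaultdict(list)
    for set_name, products in user_sets.items():
        for product in products:
            index[product].append(set_name)
    return dict(index)


def search(users_per_product, query):
    """Count, per set, how many queried products it contains (OR semantics)."""
    hits = Counter()
    for product in query:
        # Products unknown to the index simply contribute nothing.
        hits.update(users_per_product.get(product, []))
    # most_common() orders by count, highest first.
    return hits.most_common()


user_sets = {
    "set_of_user_1": ["product_1", "product_2", "product_3"],
    "set_of_user_2": ["product_3", "product_5", "product_9"],
    "set_of_user_3": ["product_4", "product_7", "product_9"],
}

users_per_product = build_users_per_product(user_sets)
print(users_per_product["product_3"])  # ['set_of_user_1', 'set_of_user_2']
print(search(users_per_product, ["product_3", "product_2"]))
# [('set_of_user_1', 2), ('set_of_user_2', 1)]
```

The counts equal the intersection sizes as long as the query does not list the same product twice.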

Does that answer your question?

Hi @julienbourdeau,

I also thought of this approach. But a record is restricted by its size, right?
So if the number of sets grows the record might become too big eventually.

Also, what if the product has several other attributes which are relevant for the results and therefore need to be taken into account?

After skimming through your docs several times, I have an idea (I just wanted to hear an open-minded answer in the first place :slight_smile: ).

Taking the problem from above, but with products that now have more attributes, would it be possible to create an index like this:

"user_products" : [
{
    "name": "product_1",
    "length": 55,
    "category": "C",
    "set": "set_of_user_1"
},
{
    "name": "product_2",
    "length": 97,
    "category": "A",
    "set": "set_of_user_1"
},
// [...]
 
]

If I have understood faceting correctly, then once “set” and “name” are declared as attributes for faceting, I could do disjunctive faceting on the name attribute

"product_2" OR "product_3"

and (with the right UI) see the sets with the most matching records first, as they are more relevant.
Is there any mistake in my logic?
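
To check my own reasoning, I simulated the idea locally in Python. This is only a sketch: `facet_counts` is a made-up helper that imitates "filter on name with OR, then count per value of set", and it says nothing about how the real engine behaves. The first two records are the ones from above; the others come from the example sets in my first post, with their extra attributes left out.

```python
from collections import Counter

# One record per (product, set) pair, as in the proposed index.
records = [
    {"name": "product_1", "length": 55, "category": "C", "set": "set_of_user_1"},
    {"name": "product_2", "length": 97, "category": "A", "set": "set_of_user_1"},
    {"name": "product_3", "set": "set_of_user_1"},
    {"name": "product_3", "set": "set_of_user_2"},
    {"name": "product_5", "set": "set_of_user_2"},
    {"name": "product_9", "set": "set_of_user_2"},
]


def facet_counts(records, names, facet="set"):
    """Keep records whose name is any of `names` (OR), then count per facet value."""
    wanted = set(names)
    matching = [r for r in records if r["name"] in wanted]
    # most_common() orders by count, highest first.
    return Counter(r[facet] for r in matching).most_common()


print(facet_counts(records, ["product_2", "product_3"]))
# [('set_of_user_1', 2), ('set_of_user_2', 1)]
```

The count per set matches the size of the intersection only if each product appears at most once per set, so the records would have to be unique per (name, set) pair.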

Regards,
LJ1001