`distinct` and `facetingAfterDistinct` returning unexpected counts

We have an index containing product variants with a distinct attribute of ‘productId’.

When filtering by ‘color:black’ and ‘size:large’ in the Algolia ‘browse’ UI (i.e. we’re asking “get all product variants which are black AND large”) with distinct: true and facetingAfterDistinct: true, we get back different counts for the ‘large’ and ‘black’ facet values, whereas I’d expect them to be exactly the same.

I’m experiencing this both through the Algolia ‘browse’ UI and in our custom UI. Are you able to give me any pointers as to where I might be going wrong with this? Perhaps it’s a problem with my records?

Hi @richard.scarrott,

Thanks for contacting Algolia!

Would you be able to grant Algolia support read access at this link so that we can better see your data and configuration?

Please let us know when this is done and we are happy to take a look!

Hi @ajay.david, I’ve granted Algolia read access for 14 days – thanks!

Hi @richard.scarrott,

Thanks for granting the access. What’s happening is a result of how facetingAfterDistinct (the counting of values) works in conjunction with distinct (the de-duplication).

facetingAfterDistinct gives accurate counts only if all records with the same distinct value share the same facet values.

In your case, you can think of the clothing variants looking generally like the below example:

{"name": "t-shirt", "color": "Black", "size": "S", "productId": "12345"}
{"name": "t-shirt", "color": "Black", "size": "L", "productId": "12345"}

When facetingAfterDistinct is enabled, facet counts are computed only from the records that remain after de-duplication, i.e. one record per distinct group. In the above example, the index keeps the top record (the “better ranked” of the distinct group, based on some criteria in the index), meaning distinct=1 would return:

facets:

  • color: [“Black”]
  • size: [“S”]

This is why your index is returning lower facet counts than expected, in your case for size Large.
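To make the effect concrete, here is a toy model in Python of the two example records above. It is only a sketch of the behaviour described in this thread, not Algolia's actual engine: it assumes the list order stands in for ranking, and the helper names `apply_distinct` and `facet_counts` are made up for illustration.

```python
from collections import Counter

# Variant records modeled on the two example records above.
records = [
    {"name": "t-shirt", "color": "Black", "size": "S", "productId": "12345"},
    {"name": "t-shirt", "color": "Black", "size": "L", "productId": "12345"},
]


def apply_distinct(hits, key):
    """Keep only the first hit per distinct key (list order stands in for ranking)."""
    seen = set()
    kept = []
    for hit in hits:
        if hit[key] not in seen:
            seen.add(hit[key])
            kept.append(hit)
    return kept


def facet_counts(hits, attributes):
    """Count how many hits carry each facet value, per attribute."""
    return {attr: dict(Counter(hit[attr] for hit in hits)) for attr in attributes}


# Counting over all variants versus counting only over the de-duplicated hits.
before = facet_counts(records, ["color", "size"])
after = facet_counts(apply_distinct(records, "productId"), ["color", "size"])

print(before)  # {'color': {'Black': 2}, 'size': {'S': 1, 'L': 1}}
print(after)   # {'color': {'Black': 1}, 'size': {'S': 1}}
```

In this model the "L" variant never reaches the counting step, so size "L" gets no count at all even though a large variant exists for the product.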

Please note that this is not a bug per se - more of a side effect of how facetingAfterDistinct works.

There is no known workaround. Using facetingAfterDistinct on attributes whose values differ between records in the same distinct group can lead to counter-intuitive results.

Thanks for looking into this for me @ajay.david. I think I get it: we would be able to safely show accurate counts for facets which do not change across the same ‘productId’, e.g. ‘brand’ and ‘category’, but facets which do change across the same ‘productId’, such as ‘size’ and ‘color’, would never show accurate counts.
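As a rough way to check which facets fall into which bucket, something like the following Python sketch could be run over an export of the records. The function name `varying_attributes`, the sample records, and the ‘brand’ value are all hypothetical; it only tests whether an attribute takes more than one value within a ‘productId’ group.

```python
from collections import defaultdict


def varying_attributes(records, distinct_key, attributes):
    """Return the attributes that take more than one value within any distinct group."""
    values = defaultdict(lambda: defaultdict(set))
    for record in records:
        for attr in attributes:
            values[record[distinct_key]][attr].add(record[attr])
    return {
        attr
        for group in values.values()
        for attr, seen in group.items()
        if len(seen) > 1
    }


# Hypothetical sample data: brand is constant per product, color and size are not.
records = [
    {"productId": "12345", "brand": "Acme", "color": "Black", "size": "S"},
    {"productId": "12345", "brand": "Acme", "color": "Black", "size": "L"},
    {"productId": "67890", "brand": "Acme", "color": "White", "size": "L"},
    {"productId": "67890", "brand": "Acme", "color": "Black", "size": "L"},
]

unsafe = varying_attributes(records, "productId", ["brand", "color", "size"])
print(sorted(unsafe))  # ['color', 'size']
```

Any attribute reported here varies within at least one product, so its counts with facetingAfterDistinct would depend on which variant happens to be kept.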

It sounds like this would be a particularly hard one to solve; do you have any plans to attempt to address it? Or is there an alternative way we could structure our records – perhaps a record could represent a product (group of variants) rather than a single variant?