We are starting the effort of syncing our data to Algolia, and I need to understand how a specific query would work.
Our previous data structure had customer documents that each contained an array of addresses. Some of these documents' address arrays had thousands of entries, which made the overall size of the customer document too large for Algolia. We are considering how to restructure our data so that we can still support our edge case: generalized searches across the metadata attributes of a customer document that also match address metadata found in these potentially large customer address arrays.
Our initial approach was to split the addresses out into their own index and then attempt a multi-index search across the customer and address indices. However, we need the results of that search to return only customers (just getting back an address document from the address index in the results would not be useful).
I can find plenty of references in the documentation that describe parent-child queries or multi-index queries, but neither seems to be a fit for our edge case.
I know Algolia is not a relational database, so perhaps the solution is to use a single index: flatten the array of addresses currently contained in our customer documents and create one record per address, with each record also containing all the metadata found in the parent customer document. This would create a large number of documents that are identical except for the address attributes, but perhaps that is the preferred methodology? The drawback to this approach is that we would have to do quite a bit of re-architecture in our codebase to decompose the results back into the objects we expect.
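To make the flattening idea concrete, here is a rough sketch of what I have in mind. It is plain Python with no Algolia client calls, and the field names (`customer_id`, `addresses`, `address`) and the `objectID` format are placeholders I made up for illustration, not our real schema. The `regroup` function is the "decompose the results back" step I am worried about.

```python
from typing import Any


def flatten_customer(customer: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one record per address, each carrying a copy of the customer metadata."""
    metadata = {k: v for k, v in customer.items() if k != "addresses"}
    # A customer without addresses still gets one record so it stays searchable.
    addresses = customer.get("addresses") or [None]
    return [
        {
            **metadata,
            # Hypothetical ID scheme: parent ID plus the address position.
            "objectID": f"{customer['customer_id']}-{i}",
            "address": address,
        }
        for i, address in enumerate(addresses)
    ]


def regroup(hits: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse flattened hits back into one customer object each, in first-seen order."""
    customers: dict[str, dict[str, Any]] = {}
    for hit in hits:
        cid = hit["customer_id"]
        if cid not in customers:
            base = {k: v for k, v in hit.items() if k not in ("objectID", "address")}
            base["addresses"] = []
            customers[cid] = base
        if hit["address"] is not None:
            customers[cid]["addresses"].append(hit["address"])
    return list(customers.values())
```

Note that `regroup` can only rebuild a customer from the address records that actually came back as hits, so the regrouped object may hold just a subset of that customer's addresses.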
My preference would be to maintain our current structure, where all the metadata for a customer is found on a single document (the parent), and create a child document for each address that belongs to that customer. A query that spans both indices would then need matches from the child addresses index to return their parents from the customers index when they are the most relevant results, preferably without additional fetches or queries.
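For reference, this is roughly the client-side merge I would expect to write with the two-index structure, and it shows where the extra fetch comes from. It is only a sketch: `merge_hits` is a hypothetical helper, the hit lists stand in for whatever a search across the two indices would return, and I am assuming each address record stores its parent's ID in a `customer_id` field.

```python
from typing import Any


def merge_hits(
    customer_hits: list[dict[str, Any]],
    address_hits: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[str]]:
    """Return the customers already in hand plus the parent IDs still to be fetched."""
    customers = {hit["objectID"]: hit for hit in customer_hits}
    missing: list[str] = []
    for hit in address_hits:
        parent_id = hit["customer_id"]  # assumed back-reference to the parent record
        if parent_id not in customers and parent_id not in missing:
            missing.append(parent_id)
    return list(customers.values()), missing
```

Every ID in `missing` would need a second round trip to the customers index, which is exactly the part I would like to avoid.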
Would someone be willing to point me in the right direction?