Split records to be flattened for Query


I have long records that are split via the jekyll-algolia plugin, which creates many records for each article based on the markup. This is similar to the strategy outlined here: https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/how-to/indexing-long-documents/?language=javascript#set-attributefordistinctattributename-and-distincttrue

When searching with React InstantSearch, I’d expect the Snippet to return the relevant content from the specific Algolia object of the article that matched. However, it seems I’m consistently getting the first object back and not the relevant one.

I have faceting for content turned on and it’s searchable. I have snippeting turned on for the content attribute, etc.
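For illustration, the setup described above could correspond to index settings along these lines. The setting names are standard Algolia index settings, but the values (`title` as the distinct key, a 20-word snippet, the list of searchable attributes) are assumptions and are not taken from the actual index.

```javascript
// Hypothetical index settings matching the description above.
// Attribute names and values are assumptions for illustration only.
const indexSettings = {
  // Attributes the engine searches in.
  searchableAttributes: ['title', 'content'],
  // "Faceting for content turned on and it's searchable".
  attributesForFaceting: ['searchable(content)'],
  // Snippet the content attribute, limited to 20 words.
  attributesToSnippet: ['content:20'],
  // Return one record per article when records share this attribute.
  attributeForDistinct: 'title',
  distinct: true,
};

console.log(JSON.stringify(indexSettings, null, 2));
```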

It seems I must be missing something here to get this working as intended. Please advise, it would be much appreciated!


Hi @jsw324,

Thanks for contacting Algolia. We’re happy to try and provide general guidance and need just a bit more information.

We have a template React InstantSearch app available here with the boilerplate included: https://codesandbox.io/s/github/algolia/create-instantsearch-app/tree/templates/react-instantsearch

Can you create a fork and update it with the minimum code needed to reproduce your issue, then share the updated sandbox with us? From this we’ll also be able to see your App ID and index name as well.

In addition, can you grant Algolia support access to your app?

We also encourage you to give us any further steps to help us reproduce the issue like example queries and parameters, results you are seeing, results you would like to see (in this case the snippet and objectID you expect to see), and any other information that you think may be helpful for us to gain a better understanding of the issue.

Thanks! We look forward to your reply.

Hey, thank you for your response.

Before I make a working example, let me elaborate a little further. We’ll have a bunch of objects for the same article. The first one will have content: ' ' since it’s an image inside a <p> tag and we exclude images from Algolia. For some reason, when the query matches that specific article, we get the object with no content instead of one of the many other objects that represent the same article and do have content, even if that content is a match.

Can the CodeSandbox be private, or does it have to be public?

We are also using the jekyll-algolia plugin (https://github.com/algolia/jekyll-algolia) to publish markdown articles from Jekyll to Algolia. The default behavior of this plugin is to split a single markdown article into multiple records. For example, take this markdown article:

title: Some title

<p>First paragraph</p>
<p>Second paragraph</p>
<p>Third paragraph</p>

When the jekyll-algolia plugin syncs this markdown article to Algolia, it creates 3 records:

{ title: 'some title', html: '<p>First paragraph</p>', custom_ranking: { position: 0 } } // 1st record
{ title: 'some title', html: '<p>Second paragraph</p>', custom_ranking: { position: 1 } } // 2nd record
{ title: 'some title', html: '<p>Third paragraph</p>', custom_ranking: { position: 2 } } // 3rd record

The issue is that the first record is always the one displayed in the search results, even if the user searched for something that matched the 2nd or 3rd record. For example, if the user searches for "Second paragraph", the returned object is the "First paragraph" record.

Is there a way to get back the record that actually ranks best for the user's search?
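The expected behavior can be sketched with the three example records above. This is only a toy model, not Algolia's actual ranking: the helpers `matchScore` and `bestRecordPerArticle` are hypothetical names, matching is a plain word count on the html attribute, and ties are broken by custom_ranking.position.

```javascript
// Toy model of "one record per article": NOT Algolia's real ranking.
const records = [
  { title: 'some title', html: '<p>First paragraph</p>', custom_ranking: { position: 0 } },
  { title: 'some title', html: '<p>Second paragraph</p>', custom_ranking: { position: 1 } },
  { title: 'some title', html: '<p>Third paragraph</p>', custom_ranking: { position: 2 } },
];

// Hypothetical helper: count how many query words occur in the
// record's html once the tags are stripped.
function matchScore(record, query) {
  const text = record.html.replace(/<[^>]*>/g, ' ').toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word && text.includes(word)).length;
}

// Hypothetical helper: keep one record per title, preferring the
// highest score and, on a tie, the lowest position.
function bestRecordPerArticle(allRecords, query) {
  const best = new Map();
  for (const record of allRecords) {
    const score = matchScore(record, query);
    if (score === 0) continue; // record does not match at all
    const current = best.get(record.title);
    const isBetter =
      !current ||
      score > current.score ||
      (score === current.score &&
        record.custom_ranking.position < current.record.custom_ranking.position);
    if (isBetter) best.set(record.title, { record, score });
  }
  return [...best.values()].map((entry) => entry.record);
}

const hits = bestRecordPerArticle(records, 'Second Paragraph');
console.log(hits[0].html); // prints "<p>Second paragraph</p>"
```

In this sketch the query "Second Paragraph" matches two words in the 2nd record and only one in the others, so the 2nd record is the one kept for the article.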


Disregard, this has been solved :grin: