Here's my index, how can I still optimize this?

Good day!

I’m looking for advice from the Algolia experts. Here is my index; how can I optimize it? A single result is normally about 25KB, which exceeds the 10KB soft limit, and I’d like to know how to reduce it further.

Please note that each record is a show object, and each show object has multiple episode objects, usually 100 to 500 of them. Following earlier advice, I indexed duplicated shows with one episode each and will use distinct at search time. Even so, each record still exceeds the soft limit. I also read that I can split blurbs across duplicate records; how do I make that work when I have already duplicated the show object per episode?

Here is the index (Shows):

{
“description”: “Scorpio Nights 1”,
“imageThumbnail”: “/images/categoryimages/864/ScorpioNights1_Thumbnail_338x468-copy.jpg”,
“type”: “movie”,
“blurb”: “A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love. Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night. This leads to a dangerous and torrid affair with tragic consequences.”,
“metadata”: null,
“startDate”: “1/30/2012 12:00:00 AM”,
“endDate”: “12/31/2099 12:00:00 AM”,
“slug”: “scorpio-nights-1-2012”,
“genres”: [],
“celebrities”: [],
“episode”: {
“episodeName”: “Scorpio Nights 1 February 23 2012”,
“description”: “scorpionights”,
“synopsis”: “A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love. Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night. This leads to a dangerous and torrid affair with tragic consequences.”,
“metadata”: “orestes ojeda,anna marie gutierrez,daniel fernando”,
“episodeNumber”: 0,
“onlineStartDate”: “02/23/2012”,
“mobileStartDate”: “02/23/2012”,
“onlineEndDate”: “12/31/2099”,
“mobileEndDate”: “12/31/2099”,
“imageThumbnail1”: “”,
“imageThumbnail2”: “”,
“imageThumbnail3”: “”,
“slug”: “scorpio-nights-1-february-23-2012”
},
“countryRestrictions”: [
{
“countryCode”: “MY”,
“restrictionTypeId”: 2
},
{
“countryCode”: “–”,
“restrictionTypeId”: 4
}
],
“objectID”: “21621”,
“_highlightResult”: {
“description”: {
“value”: “Scorpio Nights 1”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“imageThumbnail”: {
“value”: “/images/categoryimages/864/ScorpioNights1_Thumbnail_338x468-copy.jpg”,
“matchLevel”: “none”,
“matchedWords”: []
},
“type”: {
“value”: “movie”,
“matchLevel”: “none”,
“matchedWords”: []
},
“blurb”: {
“value”: “A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love. Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night. This leads to a dangerous and torrid affair with tragic consequences.”,
“matchLevel”: “none”,
“matchedWords”: []
},
“startDate”: {
“value”: “1/30/2012 12:00:00 AM”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“endDate”: {
“value”: “12/31/2099 12:00:00 AM”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“slug”: {
“value”: “scorpio-nights-1-2012”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“episode”: {
“episodeName”: {
“value”: “Scorpio Nights 1 February 23 2012”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“description”: {
“value”: “scorpionights”,
“matchLevel”: “none”,
“matchedWords”: []
},
“synopsis”: {
“value”: “A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love. Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night. This leads to a dangerous and torrid affair with tragic consequences.”,
“matchLevel”: “none”,
“matchedWords”: []
},
“metadata”: {
“value”: “orestes ojeda,anna marie gutierrez,daniel fernando”,
“matchLevel”: “none”,
“matchedWords”: []
},
“onlineStartDate”: {
“value”: “02/23/2012”,
“matchLevel”: “none”,
“matchedWords”: []
},
“mobileStartDate”: {
“value”: “02/23/2012”,
“matchLevel”: “none”,
“matchedWords”: []
},
“onlineEndDate”: {
“value”: “12/31/2099”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“mobileEndDate”: {
“value”: “12/31/2099”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“imageThumbnail1”: {
“value”: “”,
“matchLevel”: “none”,
“matchedWords”: []
},
“imageThumbnail2”: {
“value”: “”,
“matchLevel”: “none”,
“matchedWords”: []
},
“imageThumbnail3”: {
“value”: “”,
“matchLevel”: “none”,
“matchedWords”: []
},
“slug”: {
“value”: “scorpio-nights-1-february-23-2012”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
}
},
“countryRestrictions”: [
{
“countryCode”: {
“value”: “MY”,
“matchLevel”: “none”,
“matchedWords”: []
}
},
{
“countryCode”: {
“value”: “–”,
“matchLevel”: “none”,
“matchedWords”: []
}
}
]
},
“_rankingInfo”: {
“nbTypos”: 0,
“firstMatchedWord”: 2,
“proximityDistance”: 0,
“userScore”: 5676,
“geoDistance”: 0,
“geoPrecision”: 1,
“nbExactWords”: 0,
“words”: 1,
“filters”: 0
}
},
{
“description”: “Babe I Love You”,
“imageThumbnail”: “/images/categoryimages/934/BabeILoveYou_StandardThumbnail_338x468px.jpg”,
“type”: “movie”,
“blurb”: “(This movie is available until January 16, 2013) Esteemed and highly-conservative Nico Borromeo meets quirky and unconventional Sasa Sanchez who turns his ordinary life into one crazy, wonderful world.\r\nStarring Anne Curtis and Sam Milby\r\n”,
“metadata”: null,
“startDate”: “5/8/2015 12:00:00 AM”,
“endDate”: “12/31/2099 12:00:00 AM”,
“slug”: “babe-i-love-you-2015”,
“genres”: [],
“celebrities”: [],
“episode”: {
“episodeName”: “Babe I Love You May 08 2015”,
“description”: “Babe, I Love You\r\n”,
“synopsis”: “In the academe, Niccolo “Nico” Veneracion is a highly-esteemed History of Architecture professor who is on his way to becoming the next Vice Dean of the Department. He knows that achieving this would finally make his mother proud of him and for him for indirectly causing his father’s death. And yet, when he meets an unconventional girl named Sandra “Sasa” Sanchez, his world turns upside down. He never thought that he could fall in love with someone who works as a promo-girl and is obviously unacceptable in his life. Will this work? Will everything Nico rebuilt since a tragedy struck his life be threatened with Sasa in his life? Will their beliefs and principles stand a common ground for their love to prosper?”,
“metadata”: “Anne Curtis,Sam Milby,Babe I Love You”,
“episodeNumber”: 1,
“onlineStartDate”: “05/08/2015”,
“mobileStartDate”: “05/08/2015”,
“onlineEndDate”: “12/31/2099”,
“mobileEndDate”: “12/31/2099”,
“imageThumbnail1”: “/images/episodeimages/24560/BabeILoveYou-289x400.jpg”,
“imageThumbnail2”: “”,
“imageThumbnail3”: “”,
“slug”: “babe-i-love-you-may-08-2015”
},
“countryRestrictions”: [],
“objectID”: “24560”,
“_highlightResult”: {
“description”: {
“value”: “Babe I Love You”,
“matchLevel”: “none”,
“matchedWords”: []
},
“imageThumbnail”: {
“value”: “/images/categoryimages/934/BabeILoveYou_StandardThumbnail_338x468px.jpg”,
“matchLevel”: “none”,
“matchedWords”: []
},
“type”: {
“value”: “movie”,
“matchLevel”: “none”,
“matchedWords”: []
},
“blurb”: {
“value”: “(This movie is available until January 16, 2013) Esteemed and highly-conservative Nico Borromeo meets quirky and unconventional Sasa Sanchez who turns his ordinary life into one crazy, wonderful world.\r\nStarring Anne Curtis and Sam Milby\r\n”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“startDate”: {
“value”: “5/8/2015 12:00:00 AM”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“endDate”: {
“value”: “12/31/2099 12:00:00 AM”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“slug”: {
“value”: “babe-i-love-you-2015”,
“matchLevel”: “none”,
“matchedWords”: []
},
“episode”: {
“episodeName”: {
“value”: “Babe I Love You May 08 2015”,
“matchLevel”: “none”,
“matchedWords”: []
},
“description”: {
“value”: “Babe, I Love You\r\n”,
“matchLevel”: “none”,
“matchedWords”: []
},
“synopsis”: {
“value”: “In the academe, Niccolo “Nico” Veneracion is a highly-esteemed History of Architecture professor who is on his way to becoming the next Vice Dean of the Department. He knows that achieving this would finally make his mother proud of him and for him for indirectly causing his father’s death. And yet, when he meets an unconventional girl named Sandra “Sasa” Sanchez, his world turns upside down. He never thought that he could fall in love with someone who works as a promo-girl and is obviously unacceptable in his life. Will this work? Will everything Nico rebuilt since a tragedy struck his life be threatened with Sasa in his life? Will their beliefs and principles stand a common ground for their love to prosper?”,
“matchLevel”: “none”,
“matchedWords”: []
},
“metadata”: {
“value”: “Anne Curtis,Sam Milby,Babe I Love You”,
“matchLevel”: “none”,
“matchedWords”: []
},
“onlineStartDate”: {
“value”: “05/08/2015”,
“matchLevel”: “none”,
“matchedWords”: []
},
“mobileStartDate”: {
“value”: “05/08/2015”,
“matchLevel”: “none”,
“matchedWords”: []
},
“onlineEndDate”: {
“value”: “12/31/2099”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“mobileEndDate”: {
“value”: “12/31/2099”,
“matchLevel”: “full”,
“fullyHighlighted”: false,
“matchedWords”: [
“1”
]
},
“imageThumbnail1”: {
“value”: “/images/episodeimages/24560/BabeILoveYou-289x400.jpg”,
“matchLevel”: “none”,
“matchedWords”: []
},
“imageThumbnail2”: {
“value”: “”,
“matchLevel”: “none”,
“matchedWords”: []
},
“imageThumbnail3”: {
“value”: “”,
“matchLevel”: “none”,
“matchedWords”: []
},
“slug”: {
“value”: “babe-i-love-you-may-08-2015”,
“matchLevel”: “none”,
“matchedWords”: []
}
}
},
“_rankingInfo”: {
“nbTypos”: 0,
“firstMatchedWord”: 42,
“proximityDistance”: 0,
“userScore”: 6085,
“geoDistance”: 0,
“geoPrecision”: 1,
“nbExactWords”: 0,
“words”: 1,
“filters”: 0
}
}

Hello There,

You could split your synopsis by paragraph and add an “order” counter so that you can reconstruct the whole text. You could also consider an index containing just the split synopsis and a reference to the episode metadata.

This will roughly translate to the following:

[
  {
    "episode_id": 1,
    "order": 1,
    "synopsis": "A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love."
  },
  {
    "episode_id": 1,
    "order": 2,
    "synopsis": "Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night."
  },
  {
    "episode_id": 1,
    "order": 3,
    "synopsis": "This leads to a dangerous and torrid affair with tragic consequences."
  }
]
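
As a rough illustration, the splitting and the later reconstruction could be sketched like this. It splits on sentence boundaries, as the sample records do, rather than on paragraphs. The function names `splitSynopsis` and `joinSynopsis` are hypothetical, and the sentence-detection rule is an assumption that will not suit every text.

```javascript
// Split a long text into sentence-sized chunks, each tagged with the id of
// the episode it belongs to and an "order" counter.
// Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
function splitSynopsis(episodeId, synopsis) {
  return synopsis
    .split(/(?<=[.!?])\s+/)
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0)
    .map((sentence, i) => ({
      episode_id: episodeId,
      order: i + 1,
      synopsis: sentence,
    }));
}

// Rebuild the full text from its chunks, whatever order they arrive in.
function joinSynopsis(chunks) {
  return chunks
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.order - b.order)
    .map((chunk) => chunk.synopsis)
    .join(' ');
}

const text =
  "A young voyeur makes a habit of watching the couple living downstairs. " +
  "The voyeur takes over the husband's place one night. " +
  "This leads to a dangerous affair with tragic consequences.";

const chunks = splitSynopsis(1, text);
console.log(chunks);
console.log(joinSynopsis(chunks));
```

Each chunk would then be pushed to the index as its own record.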

As you already said, you can use distinct to perform your search afterwards.

For a complete reference have a look here:

https://www.algolia.com/doc/guides/ranking/distinct/#distinct-to-index-large-records
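
For illustration only, here is a sketch of the settings involved and of what distinct does to a ranked list of hits. `attributeForDistinct` and `distinct` are Algolia index settings; the helper names `distinctSettings` and `keepFirstPerGroup` are hypothetical, and the second one is just a local emulation of the idea, not how the engine is implemented.

```javascript
// Build the index settings that group chunk records by a shared attribute.
// With a real client this object would be passed to index.setSettings().
function distinctSettings(groupingAttribute) {
  return {
    attributeForDistinct: groupingAttribute,
    distinct: true,
  };
}

// Local emulation of distinct: hits arrive best-first, and only the first
// hit of each group is kept.
function keepFirstPerGroup(hits, attribute) {
  const seen = new Set();
  return hits.filter((hit) => {
    if (seen.has(hit[attribute])) return false;
    seen.add(hit[attribute]);
    return true;
  });
}

const settings = distinctSettings('episode_id');

// Hypothetical ranked hits: two chunks of episode 1, one chunk of episode 2.
const rankedHits = [
  { episode_id: 1, order: 2 },
  { episode_id: 1, order: 1 },
  { episode_id: 2, order: 1 },
];

console.log(settings);
console.log(keepFirstPerGroup(rankedHits, 'episode_id'));
```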

Let me know if this helps you out!

Hi Gianluca,

Sorry if the sample index I provided wasn’t in the correct format.

I actually read this before and will definitely apply it, but my concern is what happens if I have to do this several times within one record. Each record also contains an Episode object (every record corresponds to multiple episodes), and I have to split its synopsis field. Is it good practice to have duplicate records for each split synopsis chunk in each episode? I also have to split the blurb field in the record, so to keep the size down I was thinking of something like this:

[
  {
   "description": "Ang Probinsyano",
   "blurb": "A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love.",
   "celebrities": null,
   "episodes": null,
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": "Buda-Pesth seems a wonderful place, from the glimpse which I got of it from the train and the little I could walk through the streets.",
   "celebrities": null,
   "episodes": null,
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": "I feared to go very far from the station, as we had arrived late and would start as near the correct time as possible.",
   "celebrities": null,
   "episodes": null,
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": null,
   "celebrities": {
          "celebrityName": "Charo Santos",
          "description": "Charo Santos, who is best known these days as a formidable leader in Philippine Media, has roots that have long been ingrained in the field of entertainment."
        },
   "episodes": null,
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": null,
   "celebrities": {
          "celebrityName": "Charo Santos",
          "description": "Her history in ABS-CBN, which culminated to her post as the network’s President and Chief Operating Officer, started in the late 70s when she worked here as a production assistant."
        },
   "episodes": null,
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": null,
   "celebrities": null,
   "episodes": {
       "episodeName": "Ang Probinsyano August 3 2017",
       "synopsis": "The peace and order of a small town in the Cordilleras are threatened with the coming of a rich and powerful mining baron."
    },
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  },
  {
   "description": "Ang Probinsyano",
   "blurb": null,
   "celebrities": null,
   "episodes": {
       "episodeName": "Ang Probinsyano August 3 2017",
       "synopsis": "With his army of goons and and thugs, no one can stand in his way as he opens up a mine and desecrates the sacred ancestral land of the native Maranggani tribe."
    },
   "countryRestriction": [
        {"countryCode": "PH", "restrictionType": 1},
        {"countryCode": "US", "restrictionType": 2}
    ]
  }
]

As you can see, this is an array of records. The fields that are candidates for splitting are blurb, the celebrities object, and the episodes object. When the blurb field is present, the celebrities and episodes objects are null; when the episodes object has a value, blurb and celebrities are null; and when the celebrities object has a value, blurb and episodes are null.
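
To make the idea concrete, here is a minimal sketch of how such records could be generated from one show. The input shape (a show with `celebrities` and `episodes` arrays) and the names `sentences` and `explodeShow` are assumptions for illustration, not my actual indexing script.

```javascript
// Sentence splitter; assumes a sentence ends with '.', '!' or '?'.
function sentences(text) {
  return (text || '')
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter(Boolean);
}

// Turn one show into many small records. Every record repeats the shared
// fields and fills in exactly one of blurb / celebrities / episodes.
function explodeShow(show) {
  const shared = {
    description: show.description,
    countryRestriction: show.countryRestriction,
  };
  const empty = { blurb: null, celebrities: null, episodes: null };
  const records = [];

  for (const blurb of sentences(show.blurb)) {
    records.push({ ...shared, ...empty, blurb });
  }
  for (const celeb of show.celebrities || []) {
    for (const description of sentences(celeb.description)) {
      records.push({
        ...shared,
        ...empty,
        celebrities: { celebrityName: celeb.celebrityName, description },
      });
    }
  }
  for (const ep of show.episodes || []) {
    for (const synopsis of sentences(ep.synopsis)) {
      records.push({
        ...shared,
        ...empty,
        episodes: { episodeName: ep.episodeName, synopsis },
      });
    }
  }
  return records;
}

// Hypothetical input, shortened for the example.
const show = {
  description: 'Ang Probinsyano',
  blurb: 'First blurb sentence. Second blurb sentence.',
  countryRestriction: [{ countryCode: 'PH', restrictionType: 1 }],
  celebrities: [{ celebrityName: 'Charo Santos', description: 'One sentence.' }],
  episodes: [
    { episodeName: 'Ang Probinsyano August 3 2017', synopsis: 'Part one. Part two.' },
  ],
};

const records = explodeShow(show);
console.log(records.length);
```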

Is this a good approach for keeping the size down? Also, countryRestriction is populated in every record: some records must not be shown in certain countries, so I’ll have to filter them out using conjunctions and disjunctions (AND, OR) in the filters.

The question is: is this the best approach, or can you recommend a better one? Thanks!

Hi,

I think you’re on the right track: you should already see improvements in the object size by duplicating records, and Algolia is really fast at indexing and serving small chunks of text.

That said, the structure of your data also depends on which information you want to provide and how important it is. In my opinion, you could handle this with separate indices for shows (split by blurb), episodes, and celebrities:

Show Index

[
  {
    "objectID": "ang-probinsyano",
    "description": "Ang Probinsyano",
    "blurb": "A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love.",
    "order": 1,
    "countryRestriction": [
      {
        "countryCode": "PH",
        "restrictionType": 1
      },
      {
        "countryCode": "US",
        "restrictionType": 2
      }
    ]
  },
  {
    "description": "Ang Probinsyano",
    "blurb": "Buda-Pesth seems a wonderful place, from the glimpse which I got of it from the train and the little I could walk through the streets.",
    "order": 2,
    "countryRestriction": [
      {
        "countryCode": "PH",
        "restrictionType": 1
      },
      {
        "countryCode": "US",
        "restrictionType": 2
      }
    ]
  },
  ...
]

Episode Index

[
  {
    "show_id": "ang-probinsyano",
    "order": 1,
    "synopsis": "A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love."
  },
  {
    "show_id": "ang-probinsyano",
    "order": 2,
    "synopsis": "Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night."
  }
]

Celebrity Index

[
  {
    "appears_in": ["ang-probinsyano"],
    "celebrityName": "Charo Santos",
    "description": "Charo Santos, who is best known these days as a formidable leader in Philippine Media, has roots that have long been ingrained in the field of entertainment."
  },
  ...
]

This structure will keep the object size low but, as a trade-off, you will have to use multi-queries to send the same query to three indices:

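// Assumes the algoliasearch JavaScript client with its callback-style API.
// In Node.js it is loaded with require; in the browser, via a script tag.
var algoliasearch = require('algoliasearch');

var client = algoliasearch('appId', 'apiKey');
var query = 'couple';

// Send the same query to each of the three indices in a single request.
var queries = [{
  indexName: 'episodes',
  query: query
}, {
  indexName: 'celebrities',
  query: query
}, {
  indexName: 'shows',
  query: query
}];

client.search(queries, function(err, results) {
  if (err) {
    console.error(err);
    return;
  }
  console.log(results);
});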

This will be counted as 3 queries, so it will have an impact on your operation quota; that said, it’s usually best to find a trade-off between duplication (number of records) and number of operations.

You can find more information about multiple queries here: https://www.algolia.com/doc/api-client/javascript/search/#search-multiple-indices
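
Once the three result sets come back, you would probably want to regroup the hits by show on the client. A minimal sketch, assuming the response has the shape `{ results: [{ hits: [...] }, ...] }` in the same order as the queries (episodes, celebrities, shows) and that records use the fields from the sample indices. `groupHitsByShow` is a hypothetical helper. It also assumes each show hit exposes the show key as `objectID`, as the first sample record does; since objectIDs must be unique per record, the split blurb chunks would likely need a dedicated attribute for that key instead.

```javascript
// Group hits from a multi-query response by the show they belong to.
function groupHitsByShow(response) {
  const [episodes, celebrities, shows] = response.results.map((r) => r.hits);
  const groups = new Map();

  // Return the bucket for a show, creating it on first use.
  const groupFor = (showId) => {
    if (!groups.has(showId)) {
      groups.set(showId, { shows: [], episodes: [], celebrities: [] });
    }
    return groups.get(showId);
  };

  for (const hit of shows) groupFor(hit.objectID).shows.push(hit);
  for (const hit of episodes) groupFor(hit.show_id).episodes.push(hit);
  for (const hit of celebrities) {
    // A celebrity can appear in several shows, so it may land in several buckets.
    for (const showId of hit.appears_in) groupFor(showId).celebrities.push(hit);
  }
  return groups;
}

// Hypothetical response, shortened for the example.
const response = {
  results: [
    { hits: [{ show_id: 'ang-probinsyano', order: 1, synopsis: 'Part one.' }] },
    { hits: [{ appears_in: ['ang-probinsyano'], celebrityName: 'Charo Santos' }] },
    { hits: [{ objectID: 'ang-probinsyano', description: 'Ang Probinsyano' }] },
  ],
};

const grouped = groupHitsByShow(response);
console.log(grouped.get('ang-probinsyano'));
```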

Let me know what you think about it.

Cheers!

Hi Gianluca,

Thank you for this; I’m a few steps from completing the structure of my index. I wrote the script to index the records as described in my second comment. My one concern: will multi-query be faster than the approach I initially took? If there is a big difference in performance, I suppose we can consider spending more of the operation quota.

Also, I suppose both approaches will have nearly the same number of records, since celebrities appear in multiple shows and would have to be duplicated as well.

Thank you very much for the usual support!

Hi,

Glad you’re coming close to completing your structure :raised_hands:

For the performance question, it’s not easy to predict which will be best for your use case: on one side you have a single index with lots of items (but potentially lots of duplicate records), on the other two or three indices with fairly small records.

If you manage to split the duplicates so that every record stays under 10KB in a single index, you’ll have the best performance at both indexing and query time, so I’d suggest trying that before splitting the data into multiple indices.
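
A quick way to check this before indexing is to measure each record locally. This is only a sketch: it assumes Node.js, reads 10KB as 10,000 bytes, and measures the UTF-8 size of the serialized JSON, which may not match exactly how Algolia counts. The helper names are hypothetical.

```javascript
// Assumption: the 10KB soft limit is read here as 10,000 bytes.
const SOFT_LIMIT_BYTES = 10 * 1000;

// UTF-8 size of the record once serialized to JSON.
function recordSizeInBytes(record) {
  return Buffer.byteLength(JSON.stringify(record), 'utf8');
}

// Records that would still need to be split further.
function oversizedRecords(records, limit = SOFT_LIMIT_BYTES) {
  return records.filter((record) => recordSizeInBytes(record) > limit);
}

const candidates = [
  { description: 'Ang Probinsyano', blurb: 'A short blurb.' },
  { description: 'Ang Probinsyano', blurb: 'x'.repeat(20000) },
];

for (const record of candidates) {
  console.log(record.description, recordSizeInBytes(record), 'bytes');
}
console.log(oversizedRecords(candidates).length, 'record(s) over the limit');
```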

If the size of your text attributes is unpredictable, then moving to multi-queries will help keep your data structures small and optimized for Algolia, but query-time performance will suffer, since you’ll have to query N indices and wait for the API to aggregate the results.

Hope this helps you out!
