Duplicate records in Algolia for updates

Hi,

I followed the guide in the link https://www.algolia.com/doc/guides/ranking/distinct/#distinct-to-index-large-records.

So, having split a large field such as the blurb/synopsis, I ended up with duplicated records in the index. What happened was:

[{
   showId: 1,
   title: 'Ang Probinsyano',
   blurb: 'A young voyeur (DANIEL FERNANDO) makes a habit of watching the couple living downstairs make love.',
   objectID: 1000000
 },
 {
   showId: 1,
   title: 'Ang Probinsyano',
   blurb: 'Closely studying the sex routine of the security guard (ORESTES OJEDA) and his wife (ANA MARIE GUTIERREZ), the voyeur takes over the husband’s place one night.',
   objectID: 1000001
 }]

What happened was that I was no longer able to declare my own objectID: the engine created its own, and I’m not able to retrieve it, so I’m having trouble performing an update later on, for example when a blurb is changed in the back-end.

Another question, regarding the link provided above: is the 10KB soft limit applicable to EACH record or to EACH object INSIDE a record?

Looking forward to your usual assistance! Thanks!

Hi @markdaniel_tamayo, you’re almost there!

ObjectIDs are assigned automatically by the engine, so you don’t need to declare your own!
As the guide you linked shows in its example, you just need to give the split records a common attribute to filter on, like book_id in the example records. If you want to retrieve them in order, you can also add an order attribute, as shown in the example: this lets you retrieve all records for book_id=42 sorted in ascending order, after which you can perform the update.
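
For instance, here’s what the split records could look like when you create them. This is just a rough sketch with the JavaScript client (v4); the credentials, index name, and blurb text are placeholders, and I’ve reused your showId instead of book_id:

// Rough sketch only: placeholder credentials, index name, and blurb text.
const algoliasearch = require('algoliasearch');

const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const index = client.initIndex('shows');

// Each split record shares the same showId and carries an order attribute.
const records = [
  { showId: 1, title: 'Ang Probinsyano', order: 0, blurb: 'First part of the blurb…' },
  { showId: 1, title: 'Ang Probinsyano', order: 1, blurb: 'Second part of the blurb…' },
];

// Let the engine generate the objectIDs; you can find these records again later
// by filtering on showId.
index
  .saveObjects(records, { autoGenerateObjectIDIfNotExist: true })
  .then(({ objectIDs }) => console.log(objectIDs));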

Regarding your other question, the 10KB limit applies to each record as a whole. This limit is a tradeoff to ensure the performance of your search interface :slight_smile:

Let me know if you have more questions!

Hi, @pln

Thank you for the response!

Can you enlighten me as to how I can partially update a record without knowing its objectID? Also, as per the example given in the tutorial, does it mean that splitting the blurbs will result in multiple/redundant Algolia records? So basically, my supposed objectID, which is the showId and which I would use for updates, will appear in multiple records.

Also, for now I decided to use the showId as the objectID, so I only have one record for each show; what I did with the blurb is make it an array of strings. My question is: will splitting each blurb into an array within the same record improve my search performance? If not, I may have to split this into multiple records and proceed with my first scenario.

Thank you very much!

Hey, you’re welcome!

How to partially update records

I would do it in two passes:

  • First, get the list of records for your objects (e.g. searching with book_id=42 to get all records related to that book)
  • Then, use partialUpdateObjects with each record’s objectID to update each one (see the sketch below)
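
In code, the two passes could look roughly like this (JavaScript v4 client; the credentials, index name, and newBlurbs array are placeholders, and this assumes the number of blurbs hasn’t changed and that each record has the order attribute from the guide):

// Rough sketch: placeholder credentials, index name, and newBlurbs array.
const algoliasearch = require('algoliasearch');

const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const index = client.initIndex('books');

async function updateBlurbs(bookId, newBlurbs) {
  // Pass 1: fetch every record belonging to this book.
  const { hits } = await index.search('', {
    filters: `book_id=${bookId}`,
    hitsPerPage: 1000,
  });

  // Sort by the order attribute so each new blurb maps to the right record.
  hits.sort((a, b) => a.order - b.order);

  // Pass 2: partially update each record, reusing the objectIDs returned by the search.
  const updates = hits.map((hit, i) => ({
    objectID: hit.objectID,
    blurb: newBlurbs[i],
  }));
  await index.partialUpdateObjects(updates);
}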

Would splitting the blurb improve search performance?

As long as the record size stays roughly the same, the performance of the engine won’t change whether the blurb is a single string or an array of strings.

Hi, @pln !

Thank you very much once again for the response!

Also, I guess I have one last concern regarding this. If I were to separate my records for each blurb and, for example, the blurb field is updated so that the number of new records no longer equals the previous count, would it be a better idea to delete those with, say, book_id=42 and insert the new records, or is there a better way to do it?

Thanks! I really appreciate your help!

Hi @markdaniel_tamayo, you’re welcome :slight_smile:

If you had 5 blurbs and now have 10, you have a choice between:

  • Deleting the 5 old records, then creating the 10 new ones: two simple API calls, 15 indexing operations
  • Updating the first 5, then creating the remaining 5: two API calls with some extra processing, 10 indexing operations

If you care more about the development time than the number of operations, it will be simpler for you to delete the old ones and recreate the blurbs altogether.

On the other hand, it might be worth the extra coding effort to save some indexing operations by updating the existing records instead of deleting and recreating them :wink:
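
For the simpler option, a rough sketch of the delete-then-recreate path could be (JavaScript v4 client; credentials, index name, and the newBlurbs array are placeholders):

// Rough sketch of "delete then recreate": placeholder credentials, index name,
// and newBlurbs array.
const algoliasearch = require('algoliasearch');

const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const index = client.initIndex('books');

async function replaceBlurbs(bookId, title, newBlurbs) {
  // Delete every existing record for this book in one call.
  await index.deleteBy({ filters: `book_id=${bookId}` });

  // Recreate one record per blurb, letting the engine assign the objectIDs.
  const records = newBlurbs.map((blurb, order) => ({ book_id: bookId, title, order, blurb }));
  await index.saveObjects(records, { autoGenerateObjectIDIfNotExist: true });
}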

Hi @pln!

Thanks! Just a follow-up question for your previous reply.

Will updating the first 5 and creating the remaining 5 work if, as I said, all the blurbs are contained under one single book_id and all the records have unique objectIDs? From what I’m thinking now, I first have to get all the objectIDs based on the book_id, then build the logic from those? Though both approaches would be applicable for the scenarios you’d given. Or is there a better way to do it?

Thank you Paul!

You’re welcome!

First, let me clear up a misunderstanding: the objectID is a unique identifier for each record. You should never have the same objectID on several records (the engine would prevent you from creating a duplicate anyway, since creating a record with an existing objectID overwrites the existing record). Feel free to read more about the objectID in our doc.

With that in mind, you cannot have all the records share a single objectID. You can, however, give them a common identifier to find all the records holding a set of blurbs (the book_id in the example).

To update the blurbs for a given book, you would need to search with book_id=42 and a high enough hitsPerPage value, which will return all the records for book 42.
Once you have this list, you would update the existing records (using their respective objectIDs) with their new blurb values; delete the leftover records if you have fewer new blurbs than old ones; or create the missing records if you have more new blurbs to add!
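
Putting it all together, a sketch of that flow could look like this (JavaScript v4 client; credentials, index name, and the newBlurbs array are placeholders):

// Rough sketch of the full reconciliation: update what exists, delete the extras,
// create what is missing. Placeholder credentials, index name, and newBlurbs array.
const algoliasearch = require('algoliasearch');

const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const index = client.initIndex('books');

async function syncBlurbs(bookId, title, newBlurbs) {
  // Fetch every existing record for this book.
  const { hits } = await index.search('', {
    filters: `book_id=${bookId}`,
    hitsPerPage: 1000,
  });
  hits.sort((a, b) => a.order - b.order);

  // Update the records that already exist, reusing their objectIDs.
  const toUpdate = hits.slice(0, newBlurbs.length).map((hit, i) => ({
    objectID: hit.objectID,
    blurb: newBlurbs[i],
  }));
  if (toUpdate.length) await index.partialUpdateObjects(toUpdate);

  // Delete the leftovers if there are now fewer blurbs than records.
  const toDelete = hits.slice(newBlurbs.length).map((hit) => hit.objectID);
  if (toDelete.length) await index.deleteObjects(toDelete);

  // Create new records for any extra blurbs, letting the engine assign objectIDs.
  const toCreate = newBlurbs.slice(hits.length).map((blurb, i) => ({
    book_id: bookId,
    title,
    order: hits.length + i,
    blurb,
  }));
  if (toCreate.length) {
    await index.saveObjects(toCreate, { autoGenerateObjectIDIfNotExist: true });
  }
}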

Hi @pln!

Thank you very much for this clarification; it is now clear how I’ll formulate my logic for updating the values of specific records in Algolia.

Also, for the record, I may not have conveyed well what I was referring to in my previous reply: I did not mean to duplicate the objectIDs across my several records; what I meant to say was to select all the objectIDs based on a specific book_id. However, you still explained very well what I should do with my index.

Thank you very much Paul! :raised_hands:


Hi @markdaniel_tamayo, indeed I may have misunderstood this part!

Regardless, you’re welcome and let me know if you have other questions :slightly_smiling_face: