Updates for split records: How does Algolia handle conflicting requests?

Suppose I issue a delete_by request on my index’s distinct_key for distinct_key:1234, which currently has records with objectIDs ["A", "B", "C", "D"], and then issue a save_objects request with objectIDs ["C", "D", "E"], all of which also have distinct_key:1234.

How does Algolia handle these two requests? My desired outcome is for objectIDs ["C", "D", "E"] to be saved and for objectIDs ["A", "B"] to be deleted.

More context:

The “Automatic update for split records” section in the WordPress “Splitting Large Records” integration article here got me curious about how Algolia handles asynchronous deletes and updates on the same objectIDs.

I have a large object that is split into several records in Algolia. Updating this object requires not just sending the newly generated records, but also deleting the old ones in case the way the object is chunked into records changes.

Currently for updating these objects we do the following:

  1. Synchronously retrieve the list of existing objectIDs in Algolia corresponding to the object (our distinct_key)
  2. Generate split records w/ objectIDs for the object
  3. Issue a delete_objects request to delete existing objectIDs that don’t exist in our generated records’ objectIDs (without waiting for the task to complete)
  4. Issue a save_objects request to save the generated records (without waiting for the task to complete)
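
For reference, the diff in step 3 can be sketched as below. This is only an illustration of the current approach: `stale_object_ids` is a hypothetical helper name, and the records are assumed to be plain dicts carrying an `objectID` key.

```python
def stale_object_ids(existing_ids, new_records):
    """Return the objectIDs that exist in the index but not in the new records."""
    # Hypothetical helper: assumes each record is a dict with an "objectID" key.
    new_ids = {record["objectID"] for record in new_records}
    return sorted(set(existing_ids) - new_ids)


# Example using the IDs from the question above.
existing = ["A", "B", "C", "D"]
generated = [{"objectID": "C"}, {"objectID": "D"}, {"objectID": "E"}]
to_delete = stale_object_ids(existing, generated)
```

With the IDs from the question, only "A" and "B" would be passed to delete_objects. The list of existing objectIDs is exactly what has to be fetched synchronously in step 1.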

The need to request objectIDs from Algolia before the requests can be issued is a bottleneck. However, the article I mentioned makes me think that restricting the delete to only the objectIDs that should be removed may not be necessary:

"When updating a post, it can potentially become shorter and take fewer records. This means you need to delete old records for a given post before indexing the new ones. You can delete all records for a given post by using the deleteBy method on the distinct_key attribute."

Hi there,

The methods are asynchronous. What you are actually doing when calling these methods is adding a new job to a queue: it is this job, and not the method, that actually performs the desired action. In most cases, the job is executed within seconds if not milliseconds. But it all depends on what is in the queue: if the queue has many pending tasks, the new job will need to wait its turn.

To help manage this asynchronicity, each method returns a unique task id which you can use with the waitTask method. Using the waitTask method guarantees that the job has finished before proceeding with your new requests. You will want to use this to manage dependencies, for example, when deleting an index before creating a new index with the same name, or clearing an index before adding new objects.

For your specific use-case what you can do is:

taskID = delete(["A", "B", "C", "D"])
wait_task(taskID)
save_objects(["C", "D", "E"])
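
A slightly fuller Python sketch of the same delete, wait, save sequence, using delete_by on the distinct_key so the existing objectIDs do not need to be fetched first. Everything here is an assumption for illustration: `FakeIndex` is an in-memory stand-in, not the Algolia client, and its method names and the `taskID` return shape simply mirror the names used in this thread. Please check the API client documentation for the exact signatures in your language and client version.

```python
class FakeIndex:
    """In-memory stand-in for an index client (hypothetical, for illustration only).

    It applies every operation immediately, so it does not reproduce the real
    task queue; it only records which task IDs were waited on.
    """

    def __init__(self, records):
        self.records = {r["objectID"]: r for r in records}
        self.waited = []
        self._next_task_id = 0

    def _new_task(self):
        self._next_task_id += 1
        return {"taskID": self._next_task_id}

    def delete_by(self, params):
        # Only supports a single "attribute:value" filter in this sketch.
        attribute, value = params["filters"].split(":", 1)
        self.records = {
            oid: r for oid, r in self.records.items()
            if str(r.get(attribute)) != value
        }
        return self._new_task()

    def wait_task(self, task_id):
        self.waited.append(task_id)

    def save_objects(self, objects):
        for r in objects:
            self.records[r["objectID"]] = r
        return self._new_task()


def replace_split_records(index, distinct_key, new_records):
    """Delete every record for distinct_key, wait, then save the new records."""
    delete_task = index.delete_by({"filters": f"distinct_key:{distinct_key}"})
    # Block until the delete job has run, so it cannot affect the new records.
    index.wait_task(delete_task["taskID"])
    return index.save_objects(new_records)


old = [{"objectID": oid, "distinct_key": "1234"} for oid in "ABCD"]
new = [{"objectID": oid, "distinct_key": "1234"} for oid in "CDE"]
index = FakeIndex(old)
replace_split_records(index, "1234", new)
remaining = sorted(index.records)
```

After the call, the stand-in index holds only "C", "D" and "E". The trade-off is that the records for that distinct_key are briefly absent from the index between the delete and the save.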

Hope this helps.
Happy coding with Algolia!