"How can I leverage search results to enhance data quality?"

Hello everyone! I’ve been reviewing the documentation and the analytics API/dashboard, but I’m uncertain about the approach to take.

My company generates data (descriptions and keywords) that is fed to Algolia. We produce it with ChatGPT and other models, and the output is sometimes misleading. I’ve been thinking about how to get Algolia to indicate which records need manual refinement, but I haven’t found a method that clearly says ‘Product X has faulty data.’

One approach I’m considering (and I’d appreciate your suggestions!) is to track positions in search results to prioritize records in the following ways:

  1. If a record is highly ranked but rarely clicked, it may indicate that the data is not useful and is adding noise.
  2. If a record is ranked low but frequently clicked, it may suggest that the data needs improvement to better match user queries.

How would you recommend approaching this? My initial idea is to send every query along with its corresponding result (objectID + position) to help prioritize specific records for curation.
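
To make the idea concrete, here is a rough sketch of how I imagine the prioritization working once those query/result rows are logged somewhere. It is plain Python and does not call Algolia at all. The `Impression` row, the `flag_records` helper, and every threshold (`top_n`, `low_ctr`, `high_ctr`, `min_impressions`) are my own assumptions and would need tuning on real traffic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Impression:
    """One record shown for one query (hypothetical log row, not an Algolia type)."""
    query: str
    object_id: str
    position: int   # 1-based rank in the result list
    clicked: bool


def flag_records(impressions, top_n=3, low_ctr=0.05, high_ctr=0.30, min_impressions=20):
    """Return {object_id: reason} for records matching either heuristic.

    All thresholds are illustrative guesses, not recommended values.
    """
    # Per record, count [times shown, times clicked] separately for
    # top positions and for deeper positions.
    stats = defaultdict(lambda: {"top": [0, 0], "deep": [0, 0]})
    for imp in impressions:
        bucket = "top" if imp.position <= top_n else "deep"
        counts = stats[imp.object_id][bucket]
        counts[0] += 1
        counts[1] += int(imp.clicked)

    flagged = {}
    for object_id, buckets in stats.items():
        top_shown, top_clicks = buckets["top"]
        deep_shown, deep_clicks = buckets["deep"]
        # Heuristic 1: shown near the top often, but almost never clicked.
        if top_shown >= min_impressions and top_clicks / top_shown < low_ctr:
            flagged[object_id] = "ranked high, rarely clicked"
        # Heuristic 2: shown further down, yet clicked a lot.
        elif deep_shown >= min_impressions and deep_clicks / deep_shown > high_ctr:
            flagged[object_id] = "ranked low, frequently clicked"
    return flagged
```

The `min_impressions` floor is there so a record seen only a handful of times is not flagged on noise. Click-through naturally drops with position, so comparing each record against the average rate for its position would probably be fairer than fixed cutoffs.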