Data Collection of Users

As we’re implementing Algolia on live client websites, and given the strict regulations in Germany (and the rest of the EU), I’d like to know what information is collected about the users who actually use InstantSearch. I don’t mean my own data as an admin or developer with access to the dashboard, but the data of actual visitors to a website who enter a search term or use filters with InstantSearch.

I could not find this information in the privacy policy. It only mentions my data – but the user is not visiting the Algolia website.

Any feedback or insights are much appreciated.

Hi Andreas,

When a user queries your index using InstantSearch, they can retrieve anything from this index. This includes:

  • Your application ID, used on the front end, visible in code and network requests
  • Your search API key, used on the front end, visible in code and network requests
  • The index name, used on the front end, visible in code and network requests
  • The index content (records) returned from the API, visible in network requests

So, for example, let’s say you’re indexing users of a forum, with personal information such as their IP address. If you provide a search experience for users to search through other users, they can access the IP address, even if you don’t display it on the UI. This is why it’s crucial for you to ensure that anything you index in Algolia isn’t sensitive or confidential (or you need to use unretrievableAttributes, see below).

Note that even when an attribute in your records isn’t searchable, it still appears in the API response. If you want to use attributes but make them unretrievable (e.g., have an attribute total_number_of_sales for ranking purposes but hide it from the API response), you can use unretrievableAttributes at indexing time.
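
To illustrate the effect, here is a minimal local sketch in Python. It does not call the Algolia API. The strip_unretrievable helper and the sample record are hypothetical; only the unretrievableAttributes setting name and the total_number_of_sales attribute come from the explanation above.

```python
from typing import Any

# Settings payload you would apply to the index. The setting name is real;
# the attribute is the ranking-only example mentioned above.
index_settings = {"unretrievableAttributes": ["total_number_of_sales"]}


def strip_unretrievable(record: dict[str, Any], settings: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: mimic what a front-end search response would contain.

    Returns a copy of the record without the attributes listed under
    unretrievableAttributes. This is a local simulation, not Algolia's code.
    """
    hidden = set(settings.get("unretrievableAttributes", []))
    return {key: value for key, value in record.items() if key not in hidden}


# Sample record as stored in the index (made-up values).
record = {"objectID": "42", "name": "Blue mug", "total_number_of_sales": 1830}

# What a visitor would see in the network response under this simulation.
visible = strip_unretrievable(record, index_settings)
print(visible)  # {'objectID': '42', 'name': 'Blue mug'}
```

The stored record is left unchanged, so the attribute stays available for ranking while being absent from what the visitor can read.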

As a rule of thumb, unless a sensitive piece of information is necessary for search (as in the example above), you shouldn’t index it.

For more best practices, you can head over to our documentation:

Hi Sarah,

Thanks for the quick response, but that was actually not what I meant. :slight_smile:

Let’s assume I visit a website that implements Algolia InstantSearch. I type in a search query and also use some search filters. Algolia uses some of my personal data (i.e., data other than what I search for) for analytics and other purposes, right? For example, my IP address, to determine where I am located.

Strictly speaking, this is a GDPR question. Basically, we need to tell our users what kind of information Algolia collects about them when they use InstantSearch.

Hi Andreas, and sorry for not understanding your initial question.

When a user makes a search request, we log two things:

  • IP address
  • Whole search request (including query, parameters, headers, etc.)

We process this data internally for services like monitoring or analytics, as well as geo search if you implement it. If you have analytics turned off, we still keep the raw data, but won’t send it to the Analytics API. All logs are deleted after 90 days. When you delete your account, you can expect all your query logs to disappear after that delay.

Source: How to delete my user’s data?

When you use Click and Conversion Analytics, you need to provide a userToken to uniquely identify users. If you don’t provide it, we fall back on the IP address (but we strongly recommend using a custom userToken instead). This personal information is therefore collected by Algolia, since you’re sending it.
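
One possible way to build such a custom userToken, sketched in Python: derive it from your internal user ID with a keyed hash, so the value you send contains neither the IP address nor the raw account ID. The make_user_token function, the secret, and the sample ID are all hypothetical, and the 32-character hex format is an assumption, so check the allowed userToken format in the documentation before relying on it.

```python
import hashlib
import hmac


def make_user_token(internal_user_id: str, secret: bytes, length: int = 32) -> str:
    """Hypothetical helper: derive a stable, pseudonymous token from a user ID.

    Uses HMAC-SHA256 so the token cannot be recomputed without the secret,
    then truncates the hex digest to `length` characters.
    """
    digest = hmac.new(secret, internal_user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


# The secret must stay on your server (placeholder value for illustration).
secret = b"replace-with-a-server-side-secret"

# The same user always maps to the same token, so events can still be correlated.
token = make_user_token("customer-1234", secret)
print(token)
```

Because the token is computed on your side, you decide what it is derived from and can rotate the secret if you ever need to break the link to past events.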

Does this help?

Hello Sarah, that answered my question, yes. Thank you very much. :slight_smile:
