How to ignore URLs when matching a search query?

Suppose there are 2 searchable documents, one of which contains a URL:

1:

Lorem ipsum dolor, https://www.facebook.com, elit wisi imperdiet integer, vitae nam lobortis.

2:

Facebook lorem ipsum dolor, elit wisi imperdiet integer, vitae nam lobortis.

And one searches for “facebook”.
How can I set up the rules so that only document 2 is returned (because the first one has facebook only in the URL)?

Hi @ddl449,

Thanks for contacting Algolia support and for helping the community by posting your question to discourse!

I want to ask some clarifying questions to make sure that we can answer your question as best we can.

When you say “searchable documents”, are you using our DocSearch?
If not DocSearch, would you be able to clarify the structure of your record? Is it something like this:

{ title: "lorem ipsum dolor", url: "https://www.facebook.com", body1: "elit wisi imperdiet integer", body2: "vitae nam lobortis" }

We look forward to hearing back!

Hi @ajay.david,

Thanks for the reply!

By documents I mean records in an index, sorry for that.

{ title: "title1", description: "https://www.facebook.com lorem ipsum dolor."}
{ title: "title2", description: "Facebook lorem ipsum dolor."}

The only searchable attribute is description. If I search “facebook” we will get 2 hits by default. Is it possible to identify that facebook inside record no. 1 is part of a URL, so that the record is not returned?

In other words: I search for “facebook” and only record no. 2 is returned.

Cheers,
Dio

Hi @ddl449,

Thanks for the additional information on your record structure.

There is no way to automatically exclude a URL from being searched if it is part of the text of an attribute listed in searchableAttributes, in this case your description.

Based on the sample records, to exclude record 1 (the one containing the URL) from a search for “facebook”, you would have to restructure your data or add a query rule.

Restructure Data: One approach is to store a copy of the description field with the URLs stripped out and call it description_without_url. Then you can list description_without_url in searchableAttributes and the original description in attributesToHighlight.
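To make the restructuring idea concrete, here is a minimal sketch of how the extra field could be built before indexing. The regular expression is only a rough assumption about what counts as a URL (an http(s) scheme or a leading "www."), and the helper names strip_urls and add_url_free_description are made up for illustration. The settings dictionary simply restates the two settings mentioned above.

```python
import re

# Rough URL pattern (an assumption, not an exhaustive matcher):
# an http(s) scheme or a leading "www.", up to the next whitespace.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def strip_urls(text: str) -> str:
    """Return text with URL-looking tokens removed and whitespace collapsed."""
    without_urls = URL_PATTERN.sub(" ", text)
    return " ".join(without_urls.split())


def add_url_free_description(record: dict) -> dict:
    """Return a copy of the record with a description_without_url field added."""
    enriched = dict(record)
    enriched["description_without_url"] = strip_urls(record.get("description", ""))
    return enriched


# The two sample records from this thread.
records = [
    {"title": "title1", "description": "https://www.facebook.com lorem ipsum dolor."},
    {"title": "title2", "description": "Facebook lorem ipsum dolor."},
]
prepared = [add_url_free_description(r) for r in records]

# Index settings to pair with the new field: search the URL-free copy,
# keep the original description available for highlighting.
index_settings = {
    "searchableAttributes": ["description_without_url"],
    "attributesToHighlight": ["description"],
}
```

With records prepared this way, a search for “facebook” should only match record 2, because the URL-free copy of record 1 no longer contains the word. Note that a URL followed directly by punctuation (such as a trailing comma) loses that punctuation too, since the pattern runs to the next whitespace.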

Query Rule: This approach assumes you know in advance every word you want to exclude. You can use a query rule that, whenever the query contains facebook, adds a filter excluding the objectID of the specific record.
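As a rough sketch of what such a rule could look like, the snippet below builds the rule body as a plain Python dictionary. The field names (conditions, pattern, anchoring, consequence, params, filters) follow the Algolia Rules schema as I understand it, so please check them against the current Rules documentation before relying on them. The rule ID and the objectID "record1" are hypothetical, since the sample records in this thread do not show their objectIDs.

```python
def build_exclusion_rule(rule_id: str, trigger_word: str, excluded_object_id: str) -> dict:
    """Build a rule body that filters out one record when the query contains a word.

    This only constructs the data; saving it to an index is left to whichever
    API client or dashboard you use.
    """
    return {
        "objectID": rule_id,
        # Trigger whenever the query contains the given word.
        "conditions": [{"pattern": trigger_word, "anchoring": "contains"}],
        # Add a filter that removes the specific record from the results.
        "consequence": {"params": {"filters": f"NOT objectID:{excluded_object_id}"}},
    }


# Hypothetical IDs for illustration only.
rule = build_exclusion_rule("hide-facebook-url-record", "facebook", "record1")
```

One rule like this is needed per word and record combination, which is why this approach only scales if the set of words to exclude is small and known up front.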

I do want to note that, without implementing anything extra, the standard Algolia relevance setup would rank record 2 ahead of record 1, as the more relevant hit.

I hope this helps! Happy coding.