Crawler skipping my website for reason "domain_not_allowed"

My project has just joined DocSearch (thanks Algolia!). Note that I have not changed any settings from those set up by default when the account was created by the DocSearch admins.

The crawler is unable to crawl my website: it reports 1 ignored website and no records found. When I run the URL tester, it reports This page had a canonical URL (, which was skipped for this reason: "domain_not_allowed" as shown in the screenshot below:

My searching suggested that this means I need to verify my domain. At it shows that my domain is verified and that I do not have the authority to verify domains (presumably because I am a free DocSearch user):

I did go ahead and follow the linked "How to verify domain" guide and added a robots.txt file to my website, but that had no effect on the crawler error.

A coworker took a look and figured out the issue.

The key observation is that my domain without the www prefix is not the same thing as my domain with the www prefix. The latter is the real link, but my Docusaurus configuration was using the former, which resulted in the HTML element:

<link data-rh="true" rel="canonical" href="">

Thus the crawler would visit the www version of the site, be told that the canonical URL is the version without www, and then report the error because only the version including www is verified.
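For anyone who wants to check their own pages for the same mismatch, here is a minimal sketch. It is not how the Algolia crawler works internally; it only compares the hostname of a page with the hostname in its canonical link. The function names are hypothetical and example.com is a placeholder domain, not my real one.

```typescript
// Hypothetical helper: pulls the href out of a <link rel="canonical"> tag, if present.
// A regex is enough for a quick check; a real HTML parser would be more robust.
function findCanonical(html: string): string | null {
  const tag = html.match(/<link\b[^>]*\brel=["']canonical["'][^>]*>/i);
  if (!tag) return null;
  const href = tag[0].match(/\bhref=["']([^"']*)["']/i);
  return href && href[1] !== "" ? href[1] : null;
}

// Returns true when the canonical URL points at a different hostname than the page itself.
function canonicalHostMismatch(pageUrl: string, html: string): boolean {
  const canonical = findCanonical(html);
  if (canonical === null) return false;
  // Resolve relative canonical URLs against the page URL before comparing.
  return new URL(canonical, pageUrl).hostname !== new URL(pageUrl).hostname;
}

// Placeholder markup mirroring the situation above: page served from www, canonical without it.
const html = '<link data-rh="true" rel="canonical" href="https://example.com/docs/">';
console.log(canonicalHostMismatch("https://www.example.com/docs/", html)); // true
```

If this prints true for a page on your site, the canonical link points at a different hostname than the one being crawled, which appears to be the situation that led to "domain_not_allowed" in my case.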

Easy enough fix to my Docusaurus configuration! I hope this helps someone else in the future.
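For reference, the change looks roughly like the fragment below. This is a sketch that assumes a TypeScript config file (docusaurus.config.ts); the title and example.com are placeholders, so substitute your own values. The relevant part is the `url` field, which Docusaurus uses when it generates canonical links.

```typescript
// docusaurus.config.ts (fragment; all values are placeholders)
const config = {
  title: "My Project",
  // Use the exact hostname that is verified for the crawler, including "www".
  url: "https://www.example.com",
  baseUrl: "/",
};

export default config;
```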