I’m trying out Algolia search as part of my company’s documentation migration to Docusaurus. As we don’t qualify for DocSearch, I’ve been configuring things through Algolia directly.
Before we start a subscription to Algolia to access their crawler, I wanted to establish a working test case using an index I created (I’ve tried using other crawlers and keep running into problems). I also created the “demo_media” index from the tutorial. But when I try to search either index from Docusaurus using my App ID and API key, I get no results.
If I put DocSearch’s tutorial credentials and their “docsearch” index into the docusaurus.config.js file, it all works.
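One thing that may be worth ruling out before blaming the record format: as far as I know, the Docusaurus Algolia integration can apply "contextual search", which adds facet filters on attributes such as `language` and `docusaurus_tag`. An index that lacks those attributes (the tutorial's "demo_media" index would be one) could then return nothing even though the credentials are fine. Below is a minimal sketch of the relevant part of docusaurus.config.js with that option switched off for testing. The App ID, API key, and index name are placeholders, not real values.

```javascript
// docusaurus.config.js (fragment): a diagnostic sketch, not a recommended final setup.
// All credential values below are hypothetical placeholders.
const config = {
  themeConfig: {
    algolia: {
      appId: "YOUR_APP_ID",            // placeholder
      apiKey: "YOUR_SEARCH_ONLY_KEY",  // placeholder; use a search-only key
      indexName: "your_test_index",    // placeholder
      // Assumption: with contextual search on, queries are filtered by
      // language and docusaurus_tag facets, which a hand-built index
      // may not have. Turning it off removes that variable while testing.
      contextualSearch: false,
    },
  },
};

module.exports = config;
```

If results appear with `contextualSearch: false` but not with it enabled, the missing facet attributes are the more likely culprit than the overall record format.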
I suspect the problem is simply that the indexes I’m trying to use aren’t formatted properly, but I don’t want to pay for access to the crawler and risk having that fail as well because it isn’t actually the problem.
Am I correct in thinking the index has to be formatted a specific way to return results in Docusaurus? Or might there be something else causing the problem?
DocSearch for Docusaurus looks for metadata within your Docusaurus pages when indexing. If you don’t qualify for the DocSearch program, you can run the OSS version of the crawler against your site to generate an index with the right shape:
You can either continue to run this scraper yourself, or use that shape to build your own indexing tool.
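To make "the right shape" a little more concrete, here is a rough sketch of what a single record tends to look like, plus a small check you could run over your own records before uploading them. This is not actual crawler output: every value is made up, the field list reflects my understanding of what the DocSearch UI and Docusaurus read, and `findShapeProblems` is a hypothetical helper, not part of any Algolia library.

```javascript
// Hypothetical sample record; all values are invented for illustration.
const sampleRecord = {
  objectID: "0-https://docs.example.com/docs/intro#installation",
  url: "https://docs.example.com/docs/intro#installation",
  url_without_anchor: "https://docs.example.com/docs/intro",
  anchor: "installation",
  type: "lvl2",   // "lvl0".."lvl6" for headings, "content" for body text
  content: null,  // only filled when type is "content"
  hierarchy: {
    lvl0: "Documentation",
    lvl1: "Introduction",
    lvl2: "Installation",
    lvl3: null, lvl4: null, lvl5: null, lvl6: null,
  },
  language: "en",                          // assumed to be used for faceting
  docusaurus_tag: "docs-default-current",  // assumed to be used for faceting
  weight: { page_rank: 0, level: 80, position: 1 },
};

const LEVELS = ["lvl0", "lvl1", "lvl2", "lvl3", "lvl4", "lvl5", "lvl6"];

// Returns a list of human-readable problems; an empty list means the
// record passes these (deliberately minimal) checks.
function findShapeProblems(record) {
  const problems = [];
  if (typeof record.url !== "string") {
    problems.push("url must be a string");
  }
  if (![...LEVELS, "content"].includes(record.type)) {
    problems.push('type must be "lvl0".."lvl6" or "content"');
  }
  const hierarchy = record.hierarchy;
  if (hierarchy === null || typeof hierarchy !== "object") {
    problems.push("hierarchy must be an object");
  } else {
    if (typeof hierarchy.lvl0 !== "string") {
      problems.push("hierarchy.lvl0 must be a string");
    }
    // A heading record should carry text at its own level.
    if (LEVELS.includes(record.type) && typeof hierarchy[record.type] !== "string") {
      problems.push(`hierarchy.${record.type} must be a string`);
    }
  }
  if (record.type === "content" && typeof record.content !== "string") {
    problems.push('content must be a string when type is "content"');
  }
  return problems;
}

console.log(findShapeProblems(sampleRecord));
```

A record that only has, say, `objectID` and `title` would fail several of these checks, which is consistent with a generic tutorial index showing nothing in the DocSearch UI.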
I tried going that route but was unable to make it work. The documentation you linked is… less than clear on some steps, including how to set up the Docker image. It also asks me to rely on a Python library that doesn’t appear to be usable on anything newer than Python 3.6.
At this point, I’m hoping to find someone who can just give me an example output of a Docusaurus crawl, so I know what the formatting needs to look like.
I have tried reverse-engineering the shape from other scraper config files, but it hasn’t gotten me anywhere. I made a few tweaks to the ordering so it looks closer to what you provided and created a new index from that file, with the same results. Here’s a sanitized record from that output:
Part of my issue is that, while I can see from the scraper what keys it’s looking for and what keys it will add to the record, it’s less obvious what the VALUES should look like and whether that makes a difference in this case.
Making sample records available for test cases like mine would be a huge help. DocSearch providing credentials for a sample index is great for making sure the search bar UI can fetch from well-formed apps and indexes. Sample records would help verify that things work in the other direction: that a custom index is well-formed enough for the UI to read.