I’m using docsearch to crawl our website substrate.dev . It’s configuration is here.
Many of our docs are hosted in a docusaurus installation, and they are being indexed perfectly. Some of our docs are not in docusaurus (example1 example2), and we would like them to be indexed as well.
My first attempt was https://github.com/algolia/docsearch-configs/pull/894 . When this fix didn’t work I realized I would need a better iterating cycle than attempting changes, getting the PR merged, and waiting for a crawl.
I decided to run the scraper myself, but the docker container gives me a python traceback. https://gist.github.com/JoshOrndorff/fbb37b8f8e17deee75f815d319b239da
Any help with the primary issue of crawling out entire site, or the secondary issue of running the scraper are greatly appreciated. For cross reference, we’re tracking this issue internally at https://github.com/substrate-developer-hub/substrate-developer-hub.github.io/issues/145