Partially indexing with docsearch-scraper?

We use the DocSearch Docker container to crawl and index our websites with Algolia. The version array under ‘variables’ in the config changes frequently. Each version value maps to a different endpoint on our website.

If we add just one new version to the array, all versions are scraped again.
This behaviour is too costly for us, because each run executes around 100K operations.

Is there an option to just partially scrape our website and keep the indices which were scraped before?

We tried listing only one version in the array, but that deletes the index entries for the versions that were scraped before.

The config:

{
    "index_name": "ourproduct_documentation",
    "start_urls": [
        {
            "url": "https://ourwebsite.com/(?P<version>.*?)/",
            "variables": {
                "version": [
                    "0.10.0",
                    "0.11.0",
                    "0.12.0",
                    "latest",
                    "0.9",
                    "0.9.1"
                ]
            }
        }
    ],
    "sitemap_urls": [
        "https://ourwebsite.com/latest/sitemap.xml"
    ],
    "stop_urls": [
        "/_"
    ],
    "selectors": {
        "lvl0": ".section h1",
        "lvl1": ".section h2",
        "lvl2": ".section h3",
        "lvl3": ".section h4",
        "lvl4": ".section h5",
        "lvl5": ".section dt code",
        "text": ".document p, .document li, .section dt"
    },
    "custom_settings": {
        "separatorsToIndex": "_"
    },
    "nb_hits": 51
}

Hi Julian,

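You could create one index for your older versions and another one for the latest. Each index gets its own config, so running the scraper for one of them leaves the other index untouched.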
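
As a rough illustration of that split, here is a minimal Python sketch that derives two configs from the one posted above. The helper name `split_config` and the `_archive` index suffix are made up for this example and are not part of docsearch-scraper; the base config is trimmed to the two keys that matter here. It assumes the archived versions rarely change, so only the config for "latest" needs to be run regularly.

```python
import copy
import json


def split_config(base, frozen_versions, moving_versions, frozen_suffix="_archive"):
    """Return (frozen_cfg, moving_cfg), two copies of `base` that differ only
    in `index_name` and in the `version` list of each start URL.

    Hypothetical helper for illustration; the suffix is an arbitrary choice.
    """
    def with_versions(index_name, versions):
        cfg = copy.deepcopy(base)  # never mutate the caller's config
        cfg["index_name"] = index_name
        for entry in cfg["start_urls"]:
            # start_urls entries may be plain strings; only touch dict
            # entries that actually declare a "version" variable
            if isinstance(entry, dict) and "version" in entry.get("variables", {}):
                entry["variables"]["version"] = list(versions)
        return cfg

    frozen = with_versions(base["index_name"] + frozen_suffix, frozen_versions)
    moving = with_versions(base["index_name"], moving_versions)
    return frozen, moving


# Trimmed copy of the config from the question
base = {
    "index_name": "ourproduct_documentation",
    "start_urls": [
        {
            "url": "https://ourwebsite.com/(?P<version>.*?)/",
            "variables": {
                "version": ["0.10.0", "0.11.0", "0.12.0", "latest", "0.9", "0.9.1"]
            },
        }
    ],
}

archive_cfg, latest_cfg = split_config(
    base,
    frozen_versions=["0.9", "0.9.1", "0.10.0", "0.11.0", "0.12.0"],
    moving_versions=["latest"],
)

# Each dict can be written to its own config file and scraped separately
print(json.dumps(latest_cfg, indent=4))
```

With this layout the search frontend has to query the index that matches the version the user is browsing, so the index name becomes part of the version switch.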