Exclude an entire collection directory within Jekyll's collections_dir

Hi there folks,

I am using jekyll-algolia to index my Jekyll-based documentation site. Jekyll lets you keep your collections in a separate directory, set by “collections_dir” in your _config.yml file (see the simplified site structure I’ve provided below).

I have been unable to successfully exclude the entirety of the limited-release and general-release collections from my Algolia index.

Here is the files_to_exclude section of my _config.yml file. It lists every path that I tried to exclude. I tried both the entire list and subsets of it (different glob patterns, only HTML or only MD extensions, etc.).

Can anyone tell what I’m doing wrong?

files_to_exclude:
- index.html
- index.md
- _general-release/.md
- _limited-release/.md
- _general-release/.html
- _limited-release/.html
- _maintenance-releases/.md
- _maintenance-releases/.html
- search.html
- collections/_general-release/**
- collections/_limited-release/**
- rn/general-release/**
- rn/limited-release/**
- rn/general-release/**
- rn/limited-release/**
- collections/_maintenance-releases/**
- rn/maintenance-releases/**
- rn/maintenance-releases/**
- collections/_general-release/.md
- collections/_limited-release/.md
- rn/general-release/.html
- rn/limited-release/.html
- rn/general-release/.html
- rn/limited-release/.html
- collections/_maintenance-releases/.md
- rn/maintenance-releases/.html
- rn/maintenance-releases/.html
- _maintenance-releases/.md
- _maintenance-releases/**.html

Versions
Jekyll v3.8.4
Ruby v2.4.4
Jekyll-Algolia v1.4.7

Site Structure

_data
_includes
_layouts
_sass
css
js
img
collections_dir
  _limited-release (permalink: /rn/limited-release/(sub-folder)/filename/ )
  _general-release (permalink: /rn/general-release/(sub-folder)/filename/ )
  _user-help
  _admin-help

Hello,

Tim here, creator of the jekyll-algolia plugin. I’ll try to help you figure this out.

From what I can understand of your site structure, I would have added ./collections_dir/_limited-release/ and ./collections_dir/_general-release/ to files_to_exclude, but it seems you already did that and it didn’t work.

Would you have a link to a GitHub repository where I could reproduce the issue? This would make things easier to debug. If you do, could you also post an issue on https://github.com/algolia/jekyll-algolia/issues (there are a few issues piling up there, but I’ll handle them all, don’t worry).

In the meantime, a workaround that would also help you debug this is the should_be_excluded? hook. This method is called for each page with the filepath to that page, and is expected to return true if the page should not be indexed. Using it, you’ll see exactly what filepaths your collections are generating.
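
A minimal sketch of such a hook, assuming it lives in a file inside your _plugins folder (the file name algolia_hooks.rb and the EXCLUDED_COLLECTIONS constant are made up for illustration, and the directory names are taken from the site structure above):

```ruby
# _plugins/algolia_hooks.rb (hypothetical file name)
module Jekyll
  module Algolia
    module Hooks
      # Assumption: these are the collection directories to keep out of the index.
      EXCLUDED_COLLECTIONS = %w[_limited-release _general-release].freeze

      # Called once per candidate file; returning true skips indexing it.
      def self.should_be_excluded?(filepath)
        # Log every path to stderr so you can see what the plugin really receives.
        warn "[algolia] candidate: #{filepath}"

        # Exclude the file if any segment of its path is one of the directories.
        segments = filepath.split('/')
        EXCLUDED_COLLECTIONS.any? { |dir| segments.include?(dir) }
      end
    end
  end
end
```

Matching on path segments rather than on a fixed prefix means the check does not depend on whether the filepath is absolute or relative. Once the logged paths show what the patterns need to look like, the logging line can be removed.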

Hope that helps

Hi there Tim,

Thanks so much for responding! This is the suggestion I got via email from Sarah (Algolia):

files_to_exclude:
- _general-release

That didn’t work. What did end up working for me was this:

  files_to_exclude:
   - collections/_general-release/*
   - collections/_general-release/*/*
   - collections/_general-release/*/index.md
   - collections/_limited-release/*
   - collections/_limited-release/*/*
   - collections/_limited-release/*/index.md
   - collections/_maintenance-releases/*
   - collections/_maintenance-releases/index.md
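
For anyone wondering why one pattern per directory level is needed, here is a rough sketch of how single-level and recursive globs behave in plain Ruby. It assumes the patterns are compared with File.fnmatch and the File::FNM_PATHNAME flag; whether jekyll-algolia matches files_to_exclude exactly this way is not confirmed anywhere in this thread. The glob_results helper and the v1 sub-folder are made up for illustration.

```ruby
# Hypothetical helper: report which glob patterns match a given path.
# FNM_PATHNAME stops "*" from matching across "/" separators.
def glob_results(patterns, path)
  patterns.map do |pattern|
    [pattern, File.fnmatch(pattern, path, File::FNM_PATHNAME)]
  end.to_h
end

# A file nested one sub-folder deep (made-up path).
path = 'collections/_general-release/v1/index.md'

patterns = [
  'collections/_general-release/*',    # direct children only
  'collections/_general-release/**',   # a bare "**" behaves like "*"
  'collections/_general-release/*/*',  # exactly one sub-folder deep
  'collections/_general-release/**/*'  # "**/" recurses into sub-folders
]

glob_results(patterns, path).each do |pattern, matched|
  puts format('%-40s %s', pattern, matched)
end
```

Under these assumptions, only the last two patterns match the nested file, which is consistent with needing both the /* and /*/* entries above.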

Unfortunately, this is stored in a repo on my company’s GitLab instance, and so I can’t share it with you.