Exclude certain element contents

Hello,

I’m having a problem with the Algolia Crawler. My documentation website has a left and a right sidebar. The problem is that Algolia indexes their contents and shows them in the search results.

Another issue is that the results link to the article id instead of just linking to the main page.

I’m using the Crawler at https://crawler.algolia.com/. Here is my code:

```js
new Crawler({
  rateLimit: 8,
  maxDepth: 10,
  startUrls: ["https://rayfield.dev"],
  renderJavaScript: false,
  sitemaps: ["https://rayfield.dev/sitemap-index.xml"],
  ignoreCanonicalTo: true,
  schedule: "at 5:05 PM on Thursday",
  actions: [
    {
      indexName: "rayfield",
      pathsToMatch: ["https://rayfield.dev/**/**"],
      recordExtractor: ({ helpers }) => {
        return helpers.docsearch({
          recordProps: {
            lvl1: ["header h1", "article h1", "main h1", "h1", "head > title"],
            content: ["article p, article li", "main p, main li", "p, li"],
            lvl0: {
              selectors: "",
              defaultValue: "Documentation",
            },
            lvl2: ["article h2", "main h2", "h2"],
            lvl3: ["article h3", "main h3", "h3"],
            lvl4: ["article h4", "main h4", "h4"],
            lvl5: ["article h5", "main h5", "h5"],
            lvl6: ["article h6", "main h6", "h6"],
          },
          aggregateContent: true,
          recordVersion: "v3",
        });
      },
    },
  ],
  initialIndexSettings: {
    rayfield: {
      attributesForFaceting: ["type", "lang"],
      attributesToRetrieve: [
        "hierarchy",
        "content",
        "anchor",
        "url",
        "url_without_anchor",
        "type",
      ],
      attributesToHighlight: ["hierarchy", "content"],
      attributesToSnippet: ["content:10"],
      camelCaseAttributes: ["hierarchy", "content"],
      searchableAttributes: [
        "unordered(hierarchy.lvl0)",
        "unordered(hierarchy.lvl1)",
        "unordered(hierarchy.lvl2)",
        "unordered(hierarchy.lvl3)",
        "unordered(hierarchy.lvl4)",
        "unordered(hierarchy.lvl5)",
        "unordered(hierarchy.lvl6)",
        "content",
      ],
      distinct: true,
      attributeForDistinct: "url",
      customRanking: [
        "desc(weight.pageRank)",
        "desc(weight.level)",
        "asc(weight.position)",
      ],
      ranking: [
        "words",
        "filters",
        "typo",
        "attribute",
        "proximity",
        "exact",
        "custom",
      ],
      highlightPreTag: '<span class="algolia-docsearch-suggestion--highlight">',
      highlightPostTag: "</span>",
      minWordSizefor1Typo: 3,
      minWordSizefor2Typos: 7,
      allowTyposOnNumericTokens: false,
      minProximity: 1,
      ignorePlurals: true,
      advancedSyntax: true,
      attributeCriteriaComputedByMinProximity: true,
      removeWordsIfNoResults: "allOptional",
    },
  },
  appId: "REDACTED",
  apiKey: "REDACTED",
});
```

I’ve looked everywhere for a fix (the documentation website, Google, etc.) but couldn’t find anything about my problem. Could anyone help me out?

Bumping this thread.

This did the trick for me: the “Record Extractor” page of the DocSearch by Algolia documentation.
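
For anyone landing here later, here is a minimal sketch of the approach described on that page, as far as I understand it: besides `helpers`, the `recordExtractor` also receives a Cheerio instance `$`, so unwanted elements can be removed from the parsed page before `helpers.docsearch` builds the records. The `nav, aside` selector is only an assumption about how the sidebars are marked up (I have not checked rayfield.dev), and `recordProps` is shortened for illustration.

```js
// Sketch of a recordExtractor for the Crawler action in the question.
// ASSUMPTION: "nav, aside" is a placeholder selector for the left and right
// sidebars; replace it with whatever actually wraps them on your site.
const recordExtractor = ({ $, helpers }) => {
  // Drop the sidebars from the parsed DOM so their text is never indexed.
  $("nav, aside").remove();

  // Then extract records as before (recordProps shortened here).
  return helpers.docsearch({
    recordProps: {
      lvl0: { selectors: "", defaultValue: "Documentation" },
      lvl1: ["header h1", "article h1", "main h1", "h1", "head > title"],
      lvl2: ["article h2", "main h2", "h2"],
      content: ["article p, article li", "main p, main li", "p, li"],
    },
    aggregateContent: true,
    recordVersion: "v3",
  });
};
```

This function would take the place of the `recordExtractor` in the action above. Testing a single page URL in the Crawler editor before a full recrawl should show whether the sidebar text is gone from the extracted records.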