Hello,
I’m having a problem with the Algolia Crawler. My documentation website has a left and a right sidebar. The problem is that Algolia indexes their contents and shows them in the search results.
Another issue is that each result links to the article id (the anchor) instead of just linking to the main page.
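To illustrate the second issue, here is a small sketch of the behavior I'm after. The URL and the `stripAnchor` helper are made up for the example and are not part of my crawler config; I'm assuming the unwanted part is the `#...` fragment at the end of the result link.

```javascript
// Hypothetical result URL; the path and anchor are invented for illustration.
const resultUrl = "https://rayfield.dev/en/getting-started#article-id";

// Return the page-level URL by dropping the fragment (the part after "#").
function stripAnchor(url) {
  const parsed = new URL(url);
  parsed.hash = "";
  return parsed.toString();
}

// What the search result currently links to vs. what I would like it to link to.
console.log("current:", resultUrl);
console.log("wanted: ", stripAnchor(resultUrl));
```

In other words, I'd like the result to open the page itself rather than jump to the element with that id.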
I’m using the Crawler at https://crawler.algolia.com/. Here is my code:
```js
new Crawler({
  rateLimit: 8,
  maxDepth: 10,
  startUrls: ["https://rayfield.dev"],
  renderJavaScript: false,
  sitemaps: ["https://rayfield.dev/sitemap-index.xml"],
  ignoreCanonicalTo: true,
  schedule: "at 5:05 PM on Thursday",
  actions: [
    {
      indexName: "rayfield",
      pathsToMatch: ["https://rayfield.dev/**/**"],
      recordExtractor: ({ helpers }) => {
        return helpers.docsearch({
          recordProps: {
            lvl1: ["header h1", "article h1", "main h1", "h1", "head > title"],
            content: ["article p, article li", "main p, main li", "p, li"],
            lvl0: {
              selectors: "",
              defaultValue: "Documentation",
            },
            lvl2: ["article h2", "main h2", "h2"],
            lvl3: ["article h3", "main h3", "h3"],
            lvl4: ["article h4", "main h4", "h4"],
            lvl5: ["article h5", "main h5", "h5"],
            lvl6: ["article h6", "main h6", "h6"],
          },
          aggregateContent: true,
          recordVersion: "v3",
        });
      },
    },
  ],
  initialIndexSettings: {
    rayfield: {
      attributesForFaceting: ["type", "lang"],
      attributesToRetrieve: [
        "hierarchy",
        "content",
        "anchor",
        "url",
        "url_without_anchor",
        "type",
      ],
      attributesToHighlight: ["hierarchy", "content"],
      attributesToSnippet: ["content:10"],
      camelCaseAttributes: ["hierarchy", "content"],
      searchableAttributes: [
        "unordered(hierarchy.lvl0)",
        "unordered(hierarchy.lvl1)",
        "unordered(hierarchy.lvl2)",
        "unordered(hierarchy.lvl3)",
        "unordered(hierarchy.lvl4)",
        "unordered(hierarchy.lvl5)",
        "unordered(hierarchy.lvl6)",
        "content",
      ],
      distinct: true,
      attributeForDistinct: "url",
      customRanking: [
        "desc(weight.pageRank)",
        "desc(weight.level)",
        "asc(weight.position)",
      ],
      ranking: [
        "words",
        "filters",
        "typo",
        "attribute",
        "proximity",
        "exact",
        "custom",
      ],
      highlightPreTag: '<span class="algolia-docsearch-suggestion--highlight">',
      highlightPostTag: "</span>",
      minWordSizefor1Typo: 3,
      minWordSizefor2Typos: 7,
      allowTyposOnNumericTokens: false,
      minProximity: 1,
      ignorePlurals: true,
      advancedSyntax: true,
      attributeCriteriaComputedByMinProximity: true,
      removeWordsIfNoResults: "allOptional",
    },
  },
  appId: "REDACTED",
  apiKey: "REDACTED",
});
```
I looked everywhere for a fix (the documentation website, Google, etc.) but couldn’t find anything about my problem. Could anyone help me out?