We write our documentation in DITA topics (one adoc file per concept, reference, or task). Each topic has meta tags for description and keywords, which by default are written as meta keywords/description into <head> when the topic is converted to HTML. However, the topics are ‘bundled’ for the web view, because one ‘.html’ file per topic would be too fine-grained and would force the user through an endless series of clicks.
During ‘bundling’, the keywords and description meta tags of all but the first topic are initially lost. The <h1> titles of those topics become <h2> titles, and below each <h2> we bring back the keywords as plain text, wrapped in an element with the .keywords class and display:none; the description text is handled the same way, in a .description class with display:none.
As an example, the structure of a topic-bundle may look like this (closing tags per line omitted):
<h1> Concept 1
<p> Preamble text
<h2 id="Reference1"> Reference 1
<p class="keywords"> Keyword1, Keyword2, Keyword3
<p class="description"> Description Text of Reference 1
<p> Reference 1 content
<h2 id="Task1"> Task 1
<p class="keywords"> Keyword4, Keyword5, Keyword6
<p class="description"> Description Text of the Task 1
<h2 id="Task2"> Task 2
<p class="keywords"> Keyword7, Keyword8, Keyword9
<p class="description"> Description Text of the Task 2
<h2 id="ConceptB"> Concept B
<p class="keywords"> Keyword10, Keyword11, Keyword12
<p class="description"> Description Text of Concept B
I am running the docsearch scraper locally and assume that the best method for indexing these extra sets of keywords is adding them to the search configuration:
"lvl0": ".doc h1", "lvl1": ".doc h2", "lvl2": ".doc .keywords", "lvl3": ".doc .description", "lvl4": ".doc h3", "lvl5": ".doc h4",
However, I am not sure how to ‘combine’ lvl1 with lvl2 and lvl3. As an example, let’s assume that Keyword4 also appears in the h2 title of Task 1. Keywords are often part of the title, but they also include synonyms. If the user searches for “Keyword4”, Algolia returns a lvl1 result in which lvl2 and lvl3 are ‘null’. So far so good: the user can guess from the title whether the result matches their intent.
But to improve the result, what we actually want to show when the user searches for Keyword4 (or a synonym) is the h2 heading (with anchor, lvl1) and the description (lvl3) of the topic, not the matched keyword.
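To make the desired result concrete, here is a minimal sketch of what I have in mind, written as a plain function over hit objects. It assumes that each hit carries `url`, `anchor` and `hierarchy.lvl0` to `hierarchy.lvl3`, that the keyword and description records inherit the anchor of their h2, and that lvl3 is only filled on the record scraped from .description. The function name `collapseByAnchor` is hypothetical and not part of any library.

```javascript
// Hypothetical helper: collapse hits to one entry per anchor, keeping the
// h2 title (lvl1) and the description (lvl3) and dropping the keywords (lvl2).
// Assumption: all records of one topic share the anchor of their h2.
function collapseByAnchor(hits) {
  const byAnchor = new Map();
  for (const hit of hits) {
    // Fall back to the URL for records without an anchor (e.g. the h1).
    const key = hit.anchor || hit.url;
    const entry = byAnchor.get(key) || {
      url: hit.url,
      anchor: hit.anchor,
      title: null,
      description: null,
    };
    const hierarchy = hit.hierarchy || {};
    // The first non-empty value wins, so the order of the hits does not matter.
    if (hierarchy.lvl1 && !entry.title) entry.title = hierarchy.lvl1;
    if (hierarchy.lvl3 && !entry.description) entry.description = hierarchy.lvl3;
    byAnchor.set(key, entry);
  }
  // A Map iterates in insertion order, so the original ranking is preserved.
  return Array.from(byAnchor.values());
}
```

With this shape, a match on the keyword record and a match on the title record of Task 1 would end up as a single entry that links to the Task1 anchor and shows the description.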
I would therefore be very thankful for a hint on how to tell the scraper to index the additional meta information per anchor as part of the search result, and how to display it accordingly in docsearch.js or instantsearch.js. One option might be to swap lvl1 and lvl3, i.e. place the description and keywords before the h2 anchor, but then a click on the search result would not point to the correct anchor. Alternatively, I could omit the description from the index entirely and pre-process the search results before they are displayed. I found transformData: is there a way I could use it to fetch the lvl3 description class that belongs to the lvl1 search result?
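For the second idea (leaving the description out of the index and pre-processing the hits), this is roughly what I imagine, as a sketch only. The lookup `descriptionsByAnchor` and the function `attachDescriptions` are hypothetical names; the lookup would have to be generated from the bundled HTML at build time, and its entries here are copied from the example structure above. I have not verified how the added `description` field would have to be rendered in the docsearch.js template.

```javascript
// Hypothetical lookup: anchor id -> description text, generated at build time.
const descriptionsByAnchor = new Map([
  ["Task1", "Description Text of the Task 1"],
  ["Task2", "Description Text of the Task 2"],
]);

// Hypothetical helper: keep one hit per anchor and attach the description
// found in the lookup (null when the anchor is unknown).
function attachDescriptions(hits, lookup) {
  const seen = new Set();
  const result = [];
  for (const hit of hits) {
    const key = hit.anchor || hit.url;
    if (seen.has(key)) continue; // drop further hits for the same anchor
    seen.add(key);
    const description = lookup.has(hit.anchor) ? lookup.get(hit.anchor) : null;
    // Copy the hit instead of mutating the object handed in by the library.
    result.push(Object.assign({}, hit, { description: description }));
  }
  return result;
}

// Possible wiring via the transformData option of docsearch.js v2
// (not executed here; the option values are placeholders for your own):
// docsearch({
//   apiKey: "...",
//   indexName: "...",
//   inputSelector: "...",
//   transformData: function (hits) {
//     return attachDescriptions(hits, descriptionsByAnchor);
//   },
// });
```

The click target stays correct in this variant, because the hit's own `url` and `anchor` are left untouched and only the extra field is added.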