Recently, I noticed that a particular term (`gflag`) did not appear in my search results for the technical documentation for YugaByte DB: nothing was returned. When I searched the docs with Google using `site:docs.yugabyte.com "gflag"`, I got about 40 results. Google found the term even though it was originally encoded in Markdown as code using backticks, either within single backticks for inline code (for example, `gflag`) or within a multi-line code block with three backticks before and after the code. Here’s an example of a “fenced code block”:
All of the content within this code block is treated as monospace.
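To make the two backtick forms concrete, here is a minimal Python sketch of how a Markdown renderer typically turns them into HTML. It is a deliberately simplified stand-in for a real Markdown parser (the function name `render_code` is hypothetical), and it assumes the common convention that inline backticks become `<code>` and fenced blocks become `<pre><code>`:

```python
import html
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def render_code(markdown: str) -> str:
    """Convert fenced blocks and inline backtick spans to HTML (simplified)."""
    # Fenced block: an opening fence line, the body, then a closing fence line.
    fenced = re.compile(
        rf"^{FENCE}[^\n]*\n(.*?)\n{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE
    )
    out = fenced.sub(
        lambda m: f"<pre><code>{html.escape(m.group(1))}</code></pre>", markdown
    )
    # Inline span: text between single backticks on one line.
    # Simplification: assumes fenced bodies contain no backticks of their own.
    inline = re.compile(r"`([^`\n]+)`")
    return inline.sub(lambda m: f"<code>{html.escape(m.group(1))}</code>", out)


print(render_code("Set the `gflag` value."))
print(render_code(f"{FENCE}\nAll of the content\n{FENCE}"))
```

In both cases the term ends up inside a `<code>` element, which is what matters for the indexing behavior discussed below.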
A search for `restrictHighlightAndSnippetArrays` in the Algolia API Reference finds only one result and fails to return the API Parameters page, which has that term inside of a `<code>` tag.
Based on my reading of issues and comments by others:
- Code tag (`<code>`) content is not indexed by default.
- @Sylvain.PACE has told users that Algolia does not recommend indexing code because “it will introduce a lot of noise.” For example, see: https://github.com/algolia/docsearch-configs/pull/491#issuecomment-404233169.
- Multi-line code blocks are most often generated using `<pre>` tags and may be indexed, unless the block also contains a `<code>` tag (I found this in one example).
- Document conventions for most technical documentation say that the “monospace” format is used to “indicate commands, URLs, code in examples, or text that appears on the screen.” (from Oracle)
- With the popularity of Markdown, the easiest way to generate “monospace” text is to use backticks. And, because inline backticks cause the text to be wrapped in `<code>` tags, all too often important terms and phrases are passed over by the Algolia indexing and ignored.
- As technical writers know, important functions, parameters, options, etc. are included in text, headings, tables, and lists.
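
The behavior described in this list can be reproduced with a small Python sketch. This is not DocSearch's actual scraper, only a hypothetical text extractor (the names `TextExtractor` and `indexable_text` are mine) that assumes an indexer which drops everything inside `<code>` elements:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect page text, ignoring anything nested inside the skipped tags."""

    def __init__(self, skip_tags=("code",)):
        super().__init__()
        self.skip_tags = set(skip_tags)
        self.depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


def indexable_text(html_text, skip_tags=("code",)):
    """Return the whitespace-normalized text an indexer like this would keep."""
    parser = TextExtractor(skip_tags)
    parser.feed(html_text)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


page = (
    "<p>Set the <code>gflag</code> value before starting.</p>"
    "<pre>plain pre block</pre>"
    "<pre><code>fenced block</code></pre>"
)
print(indexable_text(page))                # term inside <code> is dropped
print(indexable_text(page, skip_tags=()))  # nothing skipped, term is kept
```

With `<code>` skipped, the inline term and the `<pre><code>` block both disappear from the extracted text while the plain `<pre>` block survives, which matches what I observed in search results.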
I noticed that the Algolia documentation has introduced a code snippet convention that doesn’t generate `<code>` tags, so these terms are indexed unless intentionally excluded. But, as the example from the Algolia API Reference above shows, it can be difficult for users to find a term when it is wrapped in `<code>` tags and ignored.
I disagree that code (and inline functions, parameters, etc.) introduces a lot of “noise,” especially in software documentation. The failure to index important terms explains why users are often frustrated when searching DocSearch-indexed documentation. Google doesn’t ignore these terms by default, and it deals with a lot more noise than typical Algolia users do.
The docs are weak at explaining what happens to code and other “monospace” terms created using Markdown backticks. Can anyone offer some good advice about how to enable and better manage search within code blocks or inline code?