WordPress concatenation without spaces & icon names showing

The WP plugin concatenates the text of separate elements into one long string, with no spaces
between items, in the content attribute of the wp_posts_page index, e.g.:
ContactpersonTom Heademailtom123@lab.co.uklanguage http://lab.co.ukhome2
whereas for full-text search we need this to be indexed:
Contact person Tom Head email tom123@lab.co.uk language http://lab.co.ukhome2
(This example uses the IconBox element, but other element types are affected too.)

How can I get the needed spaces in there? Should I raise an issue on GitHub?

The other problem: icon names are being indexed in the content attribute.
chevron_right is from this HTML
with the commonly used font-family: ‘Material Icons’, which renders that text as a > icon.
Has anyone come across this, and can you suggest a fix to remove these icon names?

Thanks for sharing your issue,

I’m unable to guess what your issue could be. Which concatenated fields are you referring to?
Have you done custom styling?

Thanks for the prompt reply Ray.
The controls in my example are from this common Zephyr theme: http://zephyr.us-themes.com/elements/iconbox/
with shortcodes like this:

[us_iconbox icon="material|person" size="22px" iconpos="left" alignment="left"]Tom Head[/us_iconbox][us_iconbox icon="material|email" size="22px" iconpos="left" title="tom123@lab.co.uk" title_tag="div" alignment="left"][/us_iconbox][us_iconbox icon="material|language" size="22px" iconpos="left" title="http://lab.co.uk" title_tag="p" link="url:http%3A%2F%2Flab.co.uk||target:%20_blank|" alignment="left"][/us_iconbox]

which generates HTML with separate divs (see bottom contact details on the page here https://neuromarketingtips.eu/neuromarketing-resources/neuromarketing-companies/lab )

… the text content of those controls and some others is concatenated without spaces when saved to the index. Would it be an idea to manually add some spacing element between each of them (and if so, which one would you recommend)?
There is no page-level custom CSS and none on the individual controls, it’s all from the theme.

As for the icon names being indexed, I guess the best solution would be to ignore them when indexing, but how?
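
One way to ignore the icon names might be to drop the icon elements from the HTML before the tags are stripped. The snippet below is only a sketch: remove_material_icon_names is a hypothetical helper, not something the plugin provides, and it assumes the icons are rendered in the usual Material Icons way, as an i element with a class containing material-icons. The theme's actual markup may differ, so the pattern would need adjusting.

```php
<?php
// Hypothetical helper, not part of the plugin.
// Assumption: icons are rendered as <i class="... material-icons ...">icon_name</i>.
function remove_material_icon_names( string $html ): string {
    // Match the whole <i> element whose class attribute contains "material-icons".
    $pattern = '/<i\b[^>]*\bclass\s*=\s*"[^"]*\bmaterial-icons\b[^"]*"[^>]*>.*?<\/i>/is';

    // Replace it with a space so neighbouring words stay separated.
    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace( $pattern, ' ', $html ) ?? $html;
}

$html = '<div><i class="material-icons">chevron_right</i><span>Home</span></div>';
echo remove_material_icon_names( $html ), "\n"; // <div> <span>Home</span></div>
```

This would have to run on the raw HTML, before any tag stripping, because once the tags are gone the icon name can no longer be told apart from ordinary text.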

This is the function that is used in order to remove “noise” from the content.

I suppose that is the part messing with the spaces.

Maybe you could try manually adding some spaces, for example by adding &nbsp; between the elements?

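The noise removal was not the culprit; it was the PHP function strip_tags, which removes tags without putting anything in their place, so text from adjacent elements runs together. This worked:
(in class Algolia_Utils)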

	// Replace the original line `return strip_tags( $content );` with the code below.

	// Replace each HTML tag with a space so text from adjacent elements stays separated.
	$content = preg_replace( '/<[^>]*>/', ' ', $content );

	// Normalize control characters.
	$content = str_replace( "\r", '', $content );  // remove carriage returns
	$content = str_replace( "\n", ' ', $content ); // replace newlines with a space
	$content = str_replace( "\t", ' ', $content ); // replace tabs with a space

	// Collapse runs of spaces into one and trim both ends.
	$content = trim( preg_replace( '/ {2,}/', ' ', $content ) );

	return $content;
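
To illustrate the difference, here is a small standalone comparison. The markup is made up for the example (the real theme output is more deeply nested), and it uses a single \s+ collapse instead of the separate replacements above.

```php
<?php
// Illustrative markup only; not the theme's real output.
$html = '<div>Contact person</div><div>Tom Head</div><div>email</div>';

// strip_tags() deletes the tags and inserts nothing in their place.
$joined = strip_tags( $html );

// Replacing each tag with a space, then collapsing whitespace, keeps word boundaries.
$spaced = trim( preg_replace( '/\s+/', ' ', preg_replace( '/<[^>]*>/', ' ', $html ) ) );

echo $joined, "\n"; // Contact personTom Heademail
echo $spaced, "\n"; // Contact person Tom Head email
```

One trade-off to keep in mind: the simple tag regex does not treat HTML comments or a > inside an attribute value specially, so it may behave differently from strip_tags on unusual markup.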

Would you like me to open a pull request on the repo, or would you prefer to make this code more elegant and do it yourself?

Hi there,

Thanks for investigating this.

I think we could maybe incorporate this into the codebase.
If you feel like opening a PR, I’d be happy to review it.

I’d like to avoid impacting existing users as much as possible, though. We need to make sure the change isn’t too impactful.
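
One possible way to limit the impact would be to keep the current behaviour as the default and make the spacing opt-in. This is only a sketch under assumptions: strip_tags_for_index and its $space_between_tags flag are hypothetical names, not existing plugin code. It also keeps strip_tags itself and just pads each tag with a space first, rather than using the regex from the proposed fix.

```php
<?php
// Hypothetical function; the name and the $space_between_tags flag are assumptions.
function strip_tags_for_index( string $content, bool $space_between_tags = false ): string {
    if ( ! $space_between_tags ) {
        // Default: unchanged behaviour for existing users.
        return strip_tags( $content );
    }

    // Opt-in: put a space before every tag opening, then strip the tags.
    $content = strip_tags( str_replace( '<', ' <', $content ) );

    // Collapse any run of whitespace into a single space and trim both ends.
    return trim( preg_replace( '/\s+/', ' ', $content ) ?? $content );
}

echo strip_tags_for_index( '<p>Tom Head</p><p>email</p>' ), "\n";       // Tom Heademail
echo strip_tags_for_index( '<p>Tom Head</p><p>email</p>', true ), "\n"; // Tom Head email
```

In the plugin itself the switch could perhaps be exposed through a WordPress filter with apply_filters instead of a parameter; the filter name would be a new one to agree on in the PR.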