Tutorial: Indexing PDF Or Other File Contents For Searching

I made a step-by-step tutorial that uses Tika to split a PDF into paragraphs, parses the resulting HTML with Nokogiri, and indexes the paragraphs in Algolia:



Thanks for posting @omar! There’s a lot of potential for tackling different content types with Tika. If anyone has experience using it, please chime in.