How to prepare hierarchical data structure in mysql for algolia?

Good morning ~

I’ve tried several approaches to storing hierarchical data on the mysql/laravel end in preparation for Algolia.

The first was:

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', NULL, NULL, NULL, NULL);

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES (NULL, 'Sections', NULL, NULL, NULL);

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES (NULL, NULL, '03 01 00 Maintenance of Concrete', NULL, NULL);

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES (NULL, NULL, NULL, '03 01 30 Maintenance of Cast-in-Place Concrete', NULL);

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES (NULL, NULL, NULL, NULL, '03 01 30.51 Cleaning of Cast-in-Place Concrete');

As you can see above, the data cascades to the right in each row. When looping and populating html selects I’m trying to make a dropdown list dependent on the previous category. (another project)

        @foreach($masterFormats as $masterFormat)
                @if ($masterFormat->lvl_1 !=  NULL)
                    <option value="{{ $masterFormat->lvl_1 }}">{{ $masterFormat->lvl_1 }}</option>
                @continue

            @endif
        @endforeach

The above strips NULL values, but does not carry the upstream relationship lvl.

At any rate, the base problem is how to store this data? Users on this app will select the categories their products belong to which will save in a record, then using scout will send the searchable array to algolia.

The end goal is the hierarchical vue instantsearch widget.

INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', NULL, NULL, NULL, NULL);
INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', 'Sections', NULL, NULL, NULL);
INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', 'Sections', '03 01 00 Maintenance of Concrete', NULL, NULL);
INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', 'Sections', '03 01 00 Maintenance of Concrete', '03 01 10 Maintenance of Concrete Forming and Accessories', NULL);
INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', 'Sections', '03 01 00 Maintenance of Concrete', '03 01 30 Maintenance of Cast-in-Place Concrete', NULL);
INSERT INTO master_formats (lvl_1, lvl_2, lvl_3, lvl_4, lvl_5)
VALUES ('03 Concrete', 'Sections', '03 01 00 Maintenance of Concrete', '03 01 30 Maintenance of Cast-in-Place Concrete', '03 01 30.51 Cleaning of Cast-in-Place Concrete');

Using a loop structure, this populates the html select with '03 Concrete' as many times as there are records.

Does algolia require on each record knowledge of the previous lvl? e.g. lv_5 should come with the relationship of lvl_4 and that of lvl_3 and so on…? (like the last line of code above)

20	02 Existing Conditions     Sections - 02	NULL	                                                NULL	
21	NULL	                   Sections - 02	NULL	                                                NULL                        
22	NULL	                   Sections - 02	02 05 00 Common Work Results for Existing Conditions	02 05 19 Geosynthetics for Existing Conditions	

Have been playing around some more here. On 21 there is no lvl_3 value. So the html select cannot populat 22 at 02 05 00 Common Work Results for Existing Conditions

On Algolia’s end is one upstream level enough for using the hierarchical menu widget? Or does it need full lvl values like so:

20	02 Existing Conditions     Sections - 02	NULL	                                                NULL	
21	02 Existing Conditions	   Sections - 02	02 05 00 Common Work Results for Existing Conditions	NULL                        
22	02 Existing Conditions	   Sections - 02	02 05 00 Common Work Results for Existing Conditions	02 05 19 Geosynthetics for Existing Conditions

Hi!

I’m not sure to fully grasp your question.
If your question is: do all records need 5 levels of hierarchy, the answer is no.

You can find more information on Hierarchical Facets on our documentation (here for example).

I hope this helps,
Emmanuel

Hi @emmanuel.krebs ~

I’m not sure to fully grasp your question.
If your question is: do all records need 5 levels of hierarchy, the answer is no.

There are 2 parts:
I think I have it figured out on mysql side:

20	02 Existing Conditions     Sections - 02	NULL	                                                NULL	
21	NULL	                   Sections - 02	02 05 00 Common Work Results for Existing Conditions	NULL                        
22	NULL	                   NULL	            02 05 00 Common Work Results for Existing Conditions	02 05 19 Geosynthetics for Existing Conditions

In order for the html selects to populate dynamically dependent they need at least one level up. I tested including each level but it was not necessary for the html selects to populate. ( the goal here is to minimize data entry.)

Part 2:
I meant, if a record is on the 5th level, does it need all other 4 upper levels? Headings, I can understand, would not need downstream levels.

"lvl2": ["Books > Science Fiction > Time Travel",

That shows Time Travel has Science Fiction and Books on the record. Or should this be 3 records?

Meaning an algolia record with id = 1, Books, algolia record with id = 2, Science Fiction and algolia record id = 3, Time Travel

Algolia’s docs talk about “getting” the data. But preparing on a UI on a laravel site requires some other tedious work. I’m trying to bridge that gap. Maybe we can do a joint post when this is all done?

I gave this a shot for searchable array.

    {
        $array = $this->toArray();

        $manufacturerProduct = $this->manufacturerProduct();

        $array['manufacturerProducts'] = [
            'lvl0' => $manufacturerProduct->masterformat_lvl_1,
            'lvl1' => $manufacturerProduct->masterformat_lvl_2,
            'lvl2' => $manufacturerProduct->masterformat_lvl_3,
            'lvl3' => $manufacturerProduct->masterformat_lvl_4,
        ];

        foreach ($manufacturerProduct->childrens() as $children) {
            $array['manufacturerProducts']['lvl0']['lvl01'] ['lvl02']['lvl03']= "{$manufacturerProduct->masterformat_lvl_1} > {$children->masterformat_lvl_2} > {$children->masterformat_lvl_3} > {$children->masterformat_lvl_4} >";
        }

        return $array;
    }

The above was psuedo code from another algolia forum post, but the loop won’t work, I believe, because manufacturer_product contains all the lvl's. Meaning there are no related models, in this case MasterFormat, that table is used to populate the html selects which then allow a user to select categories which their products belong. (not sure if that’s the most efficient way)

So, I gave this a shot:

    public function toSearchableArray(): array
    {
//

        $array = $this->toArray();

        $slice = Arr::only($array, ['id', 'company_id', 'brand', 'product_name', 'product_description', 'created_at', 'updated_at']);

        $slice['master_format'] = [
            'lvl0',
            'lvl1',
            'lvl2',
            'lvl3',
        ];

        $slice['master_format']['lvl0'] = "{$this->masterformat_lvl_0}";
        $slice['master_format']['lvl1'] = "{$this->masterformat_lvl_1}";
        $slice['master_format']['lvl2'] = "{$this->masterformat_lvl_2}";
        $slice['master_format']['lvl3'] = "{$this->masterformat_lvl_3}";

        return $slice;
    }

Algolia record result:

objectID "10"

product_name "A34"

id 10

company_id 21

brand "Simpson Strongtie"

product_description "Testtt"

created_at "2021-02-05T03:00:36.000000Z"

updated_at "2021-02-05T03:00:36.000000Z"

thumbUrl { thumbUrl: "/logos/BidBird_Logo.svg" }

master_format

{

    lvl0:  "01 General Requirements"

    lvl1:  "01 10 00 Summary"

    lvl2:  "01 11 00 Summary of Work"

    lvl3:  "01 14 13 Access to Site"

}

Will the above data structure work with hierarchical menu? — so close i think

Hi @BidBird,

Yes I believe you are getting there, but let’s first take a step back to validate what we’re trying to solve and how to best present it.

If I understood your issue correctly, the desired end result is an InstantSearch UI to display search results, such that on the side you have filtering menus, including a hierarchical menu. Is that correct?

I am going to make some additional assumptions, so please correct me if I am wrong: The UI you are building is for a documentation use-case, and this hierarchical menu would display a sort of navigation tree for documents to help users quickly find what interests them e.g. I start with a list of broad topics like “Concrete” or “Marble” which I can click on to display the next level of granularity within the documents like “Preparing Concrete”, “Maintenance of Concrete”, etc. Is my assumption correct?

We want the customers to be able to navigate down to fairly precise locations using this hierarchical menu. This means that the search engine should return records that are quite precise. From this we understand that this means we need to split our document into multiple records. We have a guide on what to take into account: Indexing Long Documents | How to | Sending and Managing Data | Guide | Algolia Documentation

Let’s now move backward from the goal towards the implementation. We know we want a hierarchical menu, so let’s have a look at the records design requirements for this widget: ais-hierarchical-menu | Vue InstantSearch | API parameters | API Reference | Algolia Documentation

It seems that what is required for this widget to work is to have fields the describe the hierarchy of the item in a rather verbose way. We now know that our format must be something like:

{
  objectID: "someObjectID"
  chapter: "General Requirement",
  section: "General Requirement > Summary"
  paragraph: "General Requirement > Summary > Important Note"
  content: "...",
  navigation: "01.01.14" // ignore this for now
  // and other relevant fields for display
}

This would be a good way of shaping the record. There are a few things I want to highlight here:

  • we do not care about the name of the hierarchical fields - they can be named according to what is most suitable for your use case. Here they are named chapter, section, paragraph, but they could be named “section.lvl0, section.lvl1, etc.”, or whatever suits your internal representation of the data.
  • what matters is that the value of a higher level field is included in the lower level field with a > separating them - like chapter: General Requirements that appears in section > General Requirements > Summary.

So we now have a good understanding of what we want:

  • small records that describe a small part of the document
  • that contains the full hierarchy and NOT just one level of the hierarchy.

Now we enter in your business domain, so I can only give suggestions and directions to look into rather than recommendations.

First, let’s have a look at the code you wrote about $slice. This won’t work with what we’ve outlined earlier on, but it should be fairly easy to make it work!

for instance if you change the way you define the master_format levels to this:

$slice['master_format']['lvl0'] = "{$this->masterformat_lvl_0}";
$slice['master_format']['lvl1'] = "{$slice['master_format']['lvl0']} > {$this->masterformat_lvl_1}";
$slice['master_format']['lvl2'] = "{$slice['master_format']['lvl1']} > {$this->masterformat_lvl_2}";
$slice['master_format']['lvl3'] = "{$slice['master_format']['lvl2']} > {$this->masterformat_lvl_3}";

You would get the following record format:

{
   ...
   master_format: {
      lvl0: "01 General Requirements",
      lvl1: "01 General Requirements > 01 10 00 Summary",
      lvl2: "01 General Requirements > 01 10 00 Summary > 01 11 00 Summary of Work",
      lvl3: "01 General Requirements > 01 10 00 Summary > 01 11 00 Summary of Work > 01 14 13 Access to Site",
   }
}

A small note: I would advise you to have the chapter/section numbers out of the hierarchical menu, and have only the most specific appear in the your record (like the navigation: “01.01.14” field I have described earlier), as this would potentially allow you to sort records so that they are in order of navigation.

Let me know if that helped you resolve your issue!
Best,

Hi @jonathan.montane great! This helps get that hierarchical data working right.

On the sorting, are you saying Algolia cannot sort something like 01 General Requirements ?

Because, it would ridiculous to have a dropdown for 01 then General Requirements … especially when they are on the same hierarchy.

The test below seems to sort things ok…

Screen Shot 2021-02-10 at 9.56.23 AM

That post shows all the sections I need to add to the mysql insert statement to populate the selects.

My use case will be to show products that fit in these categories. So a contractor can select one and populate a shopping cart type functionality.

Thank you ~

Hi @BidBird !

What my colleague @jonathan.montane meant is that as for clarity, it would be better to keep numbers out of your title so that your menu is easier to read for your end-user. What he added however with his example navigation was that you could use an extra field containing your level numbers, that would be hidden for your users but that would allow you to sort your data, without having to use numbers in the title. Does that make more sense?

Have a good day!

Hi @chloe.liban !,

I see. This hierarchical setup is an industry standard, so I’m not at liberty to change it.

Will Algolia will sort these strings based on the first numbers? It appears to be the case, but just want to be sure.

Working out Extracting hierarchical data with line wraps · Issue #582 · pdfminer/pdfminer.six · GitHub This project is just insane!

Basically, I’ve worked out a script (just a few more tweaks needed) that will read all the hierarchical categories from a pdf via python, strip the info based on character size lvl_1, lvl_2 etc, then populate a mysql insert to use in a dynamically dependent dropdown via Vue, have manufacturers save a product in Laravel mysql, which then pipes to Algolia NoSQL and displays for end users!

If you can see a way to reduce steps I’d like to know, but I haven’t found a way to reduce any of that.

Thank you ~