Algolia suffix search with Underscores

I am familiar with Algolia’s help document (https://www.algolia.com/doc/tutorials/indexing/advanced/how-can-i-make-queries-within-the-middle-of-a-word/) and how to leverage another attribute for suffixes. As stated in the document, suffixes work great for things like part numbers.

Below is an example of a single record in Algolia:

{
  "id": 19,
  "code": "AA6340-02-0.250",
  "codeSuffixes": [
    "A6340-02-0.250",
    "6340-02-0.250",
    "340-02-0.250",
    "40-02-0.250",
    "0-02-0.250"
  ],
  "partType": "spacer",
  "partFamily": "spacer",
  "description": "3 16in aluminum rohs round spacer 1 4in long 0.088 0.098 clearance range unplated",
  "objectID": "66300070"
}
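For reference, a suffix list like the one in codeSuffixes could be generated along these lines. This is only a minimal sketch: buildSuffixes is a hypothetical helper name, and stopping at the first dash is an assumption chosen to reproduce the five suffixes in the record above.

```javascript
// Hypothetical helper: list the suffixes of a part number, dropping one
// leading character at a time. Assumption: stop before the first dash so
// that only the material/base-part segment is shortened.
function buildSuffixes(code) {
  const dashIndex = code.indexOf("-");
  // If there is no dash, fall back to shortening the whole code.
  const limit = dashIndex === -1 ? code.length : dashIndex;
  const suffixes = [];
  for (let start = 1; start < limit; start++) {
    suffixes.push(code.slice(start));
  }
  return suffixes;
}

console.log(buildSuffixes("AA6340-02-0.250"));
```

The result can be stored in the codeSuffixes attribute when the record is built.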

Typically our web users would perform a power search like:

Search > "__6340-0_-0.250"

The result returned by Algolia should be the record above, id 19. However, that is not the case; I get an empty result set. What do I have to provide to Algolia to accomplish that?

Just to get a bit more info, you want to use _ as a “placeholder for any character”?

@hareon - Thanks for getting back to me on this. The short answer is yes: a placeholder for any character.

Allow me to elaborate a bit more on the situation. The number listed above is a part number/SKU in our catalog. This particular number can be broken down into different sections:

Characters 0-1 can be: “AA”, “AL”, “BR”, “NY”, “SS”, “ST”
Characters 2-5 can be: “6340”, “6341”, …, “6358”
Character 6 is always a ‘-’
Characters 7-8 can be: “00”, “02”, “04”, “06”, “08”, “10”
Character 9 is always a ‘-’
Characters 10-14 (possibly 15) can be: “0.187”, “0.250”, “0.312”, …, “10.000”

As you can see, there are a good many possible combinations that a user might want to use. In our world, with our database, users are very familiar with using “_” as a wildcard placeholder.
I do have regular expressions written for our different part number/SKU ordering formats. In this particular example the regular expression would be:

/(?<Material>[A-Z]{2})(?<PartBasePart>[0-9]{4})-(?<ClearanceHole>[0-9]{2})-(?<Measurement>[0-9]*\.[0-9]{1,3})-(?<Finish>[0-9A-Z]{2,4})/
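As an illustration of how that expression splits a part number into its sections, here is a minimal JavaScript sketch. parsePartNumber is a hypothetical helper name. Note that the pattern as written requires a trailing finish section, so it matches a full code such as the “BR6340-02-0.125-00” form but not “AA6340-02-0.250”.

```javascript
// The part-number pattern from the post, using JavaScript named groups.
const PART_NUMBER =
  /(?<Material>[A-Z]{2})(?<PartBasePart>[0-9]{4})-(?<ClearanceHole>[0-9]{2})-(?<Measurement>[0-9]*\.[0-9]{1,3})-(?<Finish>[0-9A-Z]{2,4})/;

// Hypothetical helper: return the named sections, or null if the code
// does not follow this ordering format.
function parsePartNumber(code) {
  const match = PART_NUMBER.exec(code);
  return match ? { ...match.groups } : null;
}

console.log(parsePartNumber("BR6340-02-0.125-00"));
console.log(parsePartNumber("AA6340-02-0.250")); // no finish section, so no match
```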

Sorry. Unfortunately, Algolia cannot handle the search pattern that you describe as of today.

@eunice.lee - Thanks for the input. Even if I could list all possible combinations under codeSuffixes for each sku section, is it still not possible?

The engine should be able to return the result set with the full query without the wildcard placeholder “_”.

For instance, instead of running a search like “__6340-0_-0.250”, you can perform a search like “AL6340-02-0.250” to match on code “AA6340-02-0.250”, as long as you list all possible combinations (i.e. AL6340-02-0.250, BR6340-02-0.250, NY6340-02-0.250, AA6341-02-0.250 etc.) in the code suffixes of the record with code “AA6340-02-0.250”.
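To give a rough idea of what listing all combinations involves, here is a minimal sketch. expandVariants is a hypothetical helper, and the lists of allowed values per section are assumed to come from your own catalog data. Keep in mind that the number of variants grows multiplicatively with each section.

```javascript
// Hypothetical helper: build every code that can be formed from the allowed
// values of each section (material, base part, clearance hole, measurement).
function expandVariants(sections) {
  const variants = [];
  for (const material of sections.material) {
    for (const basePart of sections.basePart) {
      for (const clearanceHole of sections.clearanceHole) {
        for (const measurement of sections.measurement) {
          variants.push(`${material}${basePart}-${clearanceHole}-${measurement}`);
        }
      }
    }
  }
  return variants;
}

// Small illustrative input; the real lists would be much longer.
console.log(expandVariants({
  material: ["AA", "AL"],
  basePart: ["6340", "6341"],
  clearanceHole: ["02"],
  measurement: ["0.250"],
}));
```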

Alternatively, if passing the wildcard indicator is a must, you would need to change the way you pass the “_” and the query to Algolia. The way you currently index the records can stay the same. (Note: this is not the preferred way, as it introduces some complexity due to the way we handle special characters such as _ and - with numerical tokens.)

If an underscore _ occurs within:
Characters 0-1 | Characters 7-8 | Characters 10-14 => never pass a partial wildcard like “_0” or “0_”. You should always pass a single _.
For instance, if you want to run a search with “__6340-0_-0.250”, you may pass “__6340-_-0.250” instead.

Characters 2-5 => you should try to avoid passing a wildcard. If necessary, always pass the last few characters without a wildcard, such as “__6340-0_-0.250”, “__340-0_-0.250” or “__40-0_-0.250” (but not “__6_40-0_-0.250”).
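A minimal sketch of how a query could be rewritten on the client before it is sent, following the rules above. normalizeWildcardQuery is a hypothetical helper, not part of any Algolia client. It assumes the dash-separated layout described earlier, leaves the two material characters untouched, and rejects a wildcard that follows a literal character in the base part.

```javascript
// Hypothetical helper: rewrite a "_" wildcard query before searching.
function normalizeWildcardQuery(query) {
  const [head, ...rest] = query.split("-");

  // Characters 2-5 (base part): wildcards may only lead, never follow a digit.
  const basePart = head.slice(2);
  if (/[^_]_/.test(basePart)) {
    throw new Error(`Wildcard inside base part is not supported: ${basePart}`);
  }

  // Dash-delimited sections (clearance hole, measurement): a partial
  // wildcard such as "0_" collapses to a single "_".
  const normalized = rest.map((section) => (section.includes("_") ? "_" : section));

  return [head, ...normalized].join("-");
}

console.log(normalizeWildcardQuery("__6340-0_-0.250"));
```

With the example from the rules, “__6340-0_-0.250” becomes “__6340-_-0.250”, while a query like “__6_40-0_-0.250” is rejected.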

@eunice.lee,

Per your recommendations, I’m attempting to rework some of the underlying record attributes. Please review the sample.json file I emailed to you and Chris. It would appear that I’m missing something, because no matter what I do the records are not searchable by partial attribute values.

In the attached image you can also see that I’m attempting to search for all attributes that I want ranking for.

@eunice.lee - I’ve been able to get closer to my desired output. What I was able to do was add some query parameters that restricted my search results to what I’m expecting.

index.search("S 6341 08 1.12", {
  hitsPerPage: 10,
  page: 0,
  // Only match against the per-section code attributes
  restrictSearchableAttributes: [
    "material.code",
    "clearanceHole.code",
    "basePartId.code",
    "measurement.code"
  ],
  analytics: false,
  attributesToRetrieve: ["*"],
  distinct: 5,
  facets: []
});

Also, in our world the first two characters are always followed by a combination of numbers, and possibly letters. By modifying my attributes I was able to get even closer to my desired results. Below is a single example of what I did. Note the new attribute, materialBasePartId.

[
  {
    "code": "BR6340-02-0.125",
    "description": "3/16\" Brass Round Spacer 1/8\" long 0.088-0.098 clearance range",
    "image": "https://lyntron-images.s3.amazonaws.com/products/spacer/iso/round_brass_spacer_medium.png",
    "url": "3-16in-brass-round-spacer-1-8in-long-0.088-0.098-clearance-range-unplated/pn/BR6340-02-0.125-00",
    "material": {
      "id": "2",
      "code": "BR",
      "description": "Brass",
      "status_id": "1",
      "is_web": "1",
      "filter_uri": "brass",
      "rank": "10"
    },
    "basePartId": {
      "id": "1",
      "code": "6340",
      "is_web": "1",
      "status_id": "1"
    },
    "materialBasePartId": "BR6340",
    "clearanceHole": {
      "id": "3",
      "code": "02",
      "description": "0.088\"-0.098\"",
      "tolerance_low": "0.000",
      "tolerance_high": "0.010",
      "filter_uri": "0.088-0.098",
      "rank": "30",
      "is_web": "1",
      "unit_of_measurement_id": "1",
      "tolerance_id": "3",
      "screw_id": "8"
    },
    "measurement": {
      "id": "291",
      "code": "0.125",
      "fractional": "1/8",
      "decimal": "0.125",
      "unit_of_measurement_id": "1",
      "millimeter": null,
      "is_web": "1",
      "rank": "50",
      "description": "1/8\"",
      "filter_uri": "1-8in"
    },
    "screw": {
      "id": "8",
      "code": "2",
      "description": "#2 UN",
      "status_id": "1",
      "is_web": "1",
      "filter_uri": "2-un",
      "rank": "15",
      "unit_of_measurement_id": "1"
    },
    "outsideDiameter": {
      "id": "4",
      "description": "3/16\"",
      "rank": "20",
      "filter_uri": "3-16",
      "is_web": "1",
      "status_id": "1",
      "unit_of_measurement_id": "1",
      "unit_of_measurement_suffix_id": "1",
      "decimal": "0.187",
      "code": null
    },
    "partType": {
      "id": "1",
      "part_family_id": "1",
      "code": "SPR",
      "description": "Spacer",
      "status_id": "1",
      "is_family": "1",
      "is_web": "1",
      "filter_uri": "spacer",
      "regex_format": "/(?<Material>[A-Z]{2})(?<PartBasePart>[0-9]{4})-(?<ClearanceHole>[0-9]{2})-(?<Measurement>[0-9]*\\.[0-9]{1,3})-(?<Finish>[0-9A-Z]{2,4})/",
      "rank": null,
      "is_list_viewable": "0",
      "customize_me_layout_id": "4",
      "is_customizable": "1",
      "isDescriptionSingular": "0"
    },
    "profile": {
      "id": "2",
      "code": "RD",
      "description": "Round",
      "is_web": "1",
      "filter_uri": "round",
      "rank": "10"
    },
    "unitOfMeasurement": {
      "id": "1",
      "code": "IN",
      "description": "Standard",
      "is_web": "1",
      "filter_uri": "inch",
      "rank": "5"
    }
  }
]

However, the next problem is how to deal with dashes. In our part number (SKU) we separate the sections of the code with dashes. I find that everything works as desired when I remove the dashes from my query, but I can’t tell users to omit the dash when they type it in. The dash is a critical aspect of the part number, but it won’t work with Algolia and the query. I found and followed this article, but I’m receiving an error when attempting to apply it.
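Since the search behaves as desired once the dashes are gone, one possible stopgap is to let users keep typing dashes and strip them on the client just before the query is sent. This is only a sketch: dashesToSpaces is a hypothetical helper, and replacing each dash with a space is an assumption based on the space-separated query that worked above.

```javascript
// Hypothetical helper: turn the dashes a user typed into spaces so the
// query matches the form that returned the expected results.
function dashesToSpaces(query) {
  return query
    .replace(/-/g, " ")   // each dash becomes a space
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

console.log(dashesToSpaces("BR6340-02-0.125"));
// The cleaned string would then be passed as the first argument of index.search.
```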