It’s not a matter of having results or not, or of breaking the search. It would be perfectly OK if someone played with the URL and got no results.
But first of all, ANYBODY can forge a URL and give it to anybody. It may not be your user who changes the URL, but a nasty spammer forging bad URLs on their own website that point to YOUR website, and getting them crawled by Google’s bots, for example. So bad URLs get indexed by Google
(you can search for OPSS030 in Google, for example; you’ll find a lot of results leading to a lot of different websites, including the one I’m working on). And because the GET args are injected as is into the page (well, thankfully the HTML is stripped out), it could lead to bad things:
- misleading ads for random websites on YOUR website. I used ThisIsNotSomethingWeWantToSee in my example, but spammers usually use nastier things like “If you want PORN go to xxx.website.com”, and we don’t want that to appear on our website, ever
- then it’s indexed by Google (we use prerendering, so Google DOES index the content of our fully client-rendered page)
- someone types “PORN” into Google, and our website is in the result list
My main concern right now is that our prerendering tool is full of bad URLs for our search. They are mostly harmless because they don’t use any of the GET arguments that are injected into the page; they just cost us more money because we have LOTS of them. I’m going to fix that by preventing prerendering and caching when the URL doesn’t use correct GET arg NAMES. BUT if spammers inject data using correct arg names and incorrect values, I’m screwed.
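To make the arg-name check concrete, here is a minimal sketch of the kind of gate I have in mind, written in Python for brevity. The allowlist and the function name `should_prerender` are made up for illustration; our real parameter names and prerendering setup are different.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical allowlist: the real GET argument names are not shown in this post.
ALLOWED_ARG_NAMES = {"q", "category", "lang", "page"}


def should_prerender(url: str) -> bool:
    """Return True only if every GET argument NAME in the URL is on the allowlist."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?spam=" is still seen as an argument.
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    return all(name in ALLOWED_ARG_NAMES for name in names)
```

This only rejects unknown names. A URL with a valid name and a spammy value still passes, which is exactly the remaining problem.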
To summarize and answer your question: we already use OR, so non-existent facet values are ignored in the search.
Here is a real live URL we are using (use desktop; the UI is a bit different on mobile):
-> will show a list of online courses about “Design”
-> will show a list of online courses about Design OR Pedagogy
-> will show a list of online courses about Design OR Nasty%20Strings (which is not a valid category…)
Look at the ui:
The correct strings for categories can change over time and vary with the selected language, so I can’t easily strip out Nasty%20Strings unless I query Algolia beforehand to get all the valid values.
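For reference, this is roughly what the stripping would look like if I did have the list of valid values at hand. It is only a sketch: the function name, the `category` argument name, and the assumption that each category is passed as a repeated GET argument are all hypothetical, and `valid_categories` would have to be fetched from Algolia (per language) before the page is rendered.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_unknown_categories(url, valid_categories, arg_name="category"):
    """Drop every `arg_name` value that is not in `valid_categories`.

    `valid_categories` is assumed to be a set of currently valid facet
    values, obtained elsewhere; other arguments are left untouched.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)  # parse_qsl has already decoded %20 and friends
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != arg_name or value in valid_categories
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

The cost is the extra lookup for valid values before every render (or a cache of them that has to be refreshed when categories change), which is what I would like to avoid.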