Simplifying Complexity with Boolean Options

by aaron.weber on December 6, 2011

This is part 3 of 3 in a series on building better searches
Part 1: Why Simple Answers Require Complicated Questions
Part 2: Enter Boolean Operators


We’ve introduced some easy concepts for gathering business intelligence through web searches. Now we’ll go a step further and learn how a combination of Boolean operators and grouping can help you weed out non-relevant web posts. In the example from Part Two, we were looking for web mentions of Diet Coke or Diet Pepsi.

Let’s say there’s a popular blog personality who goes by the handle of Diet Coke Fiend, and name aside, that blogger rarely mentions the product. We’d rather not spend our time sifting through all those non-relevant tweets and blog posts, so our query becomes:

“Diet Coke” AND NOT “Diet Coke Fiend” OR “Diet Pepsi”

This is one way to exclude Diet Coke Fiend’s posts, but the exclusion only applies to the ‘Diet Coke’ half of the query. If we were translating the query above into plain language, we’d say “I want to see pages that contain ‘diet coke’, but do not contain ‘diet coke fiend,’ or you can show me pages that say ‘diet pepsi.’ I’m good with either.” Unfortunately, there is no exclusion in this query when it comes to ‘Diet Pepsi,’ so ‘Diet Coke Fiend’ will still show up if he has blogged about Diet Pepsi.

But what if our pal Diet Coke Fiend talks about Diet Pepsi all the time and we don’t want to see those posts either? We could still use the same query as before, right? Unfortunately not, because while OR lets us search the same content for different things, it also acts like a bit of a wall, stopping one thought and beginning a new one.

Computer software will read our string literally, meaning that it sees us asking for posts that contain Diet Coke (but not Diet Coke Fiend) or Diet Pepsi (where Diet Coke Fiend can show up too). We have to be explicit, so in order to get rid of our friend when he mentions Diet Pepsi, our query becomes:

“diet coke” AND NOT “diet coke fiend” OR “diet pepsi” AND NOT “diet coke fiend”
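To see the difference between the two queries, here is a rough Python sketch that mimics them with plain substring checks. It is only an illustration: the function names and sample posts are made up, and it assumes the search tool groups AND before OR the way Python does, which not every tool is guaranteed to do.

```python
def first_query(text: str) -> bool:
    """Mimics: "Diet Coke" AND NOT "Diet Coke Fiend" OR "Diet Pepsi"."""
    t = text.lower()
    coke = "diet coke" in t
    fiend = "diet coke fiend" in t
    pepsi = "diet pepsi" in t
    # `and` binds tighter than `or`, so this reads as
    # (coke and not fiend) or pepsi
    return coke and not fiend or pepsi


def explicit_query(text: str) -> bool:
    """Mimics the longer query that repeats the exclusion on both sides of OR."""
    t = text.lower()
    coke = "diet coke" in t
    fiend = "diet coke fiend" in t
    pepsi = "diet pepsi" in t
    return (coke and not fiend) or (pepsi and not fiend)


# Hypothetical sample posts, invented for this example
posts = [
    "I love Diet Coke with lunch",
    "Diet Coke Fiend here, new post about my cat",
    "Diet Coke Fiend reviews Diet Pepsi",
]

first_results = [first_query(p) for p in posts]
explicit_results = [explicit_query(p) for p in posts]

print(first_results)     # [True, False, True]
print(explicit_results)  # [True, False, False]
```

The third post is the troublemaker: the first query lets it through because the "Diet Pepsi" side of the OR has no exclusion attached, while the explicit query filters it out.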

Got it?

What if we wanted to look for cherry-flavored varieties as well? As you can see, a search string can get complicated very quickly, even though what we want to find can be simply stated. As any frustrated infomercial actor would exclaim, “There has to be a better way!”

Thankfully, there is. Let’s add another concept to our queries to get more specific: we can apply operators to multiple terms and phrases via grouping.

Grouping is the logic equivalent of multi-tasking. We can now very simply put terms together in more and more complicated ways to achieve better results. Not only is it more efficient, it also cuts down on errors by allowing us to simplify the phrasing of a question while keeping the complex meaning.

If we threw ‘cherry’ into our query above, we’d be looking at something like this:

“diet coke” NOT “diet coke fiend” OR “diet coke” AND cherry NOT “diet coke fiend” OR “diet pepsi” NOT “diet coke fiend” OR “diet pepsi” AND cherry NOT “diet coke fiend”

See how we have to explicitly spell out each variant? And see how we’ve got more and more opportunity to make a mistake? Lame.

Let’s see what that looks like if we put grouping to use. If we want to find pages that talk about Diet Cherry Coke or Diet Cherry Pepsi:

(“diet coke” OR “diet pepsi”) AND cherry NOT “diet coke fiend”

We can even simplify that further because the word ‘diet’ has to be there, no matter if it’s Pepsi or Coke! Like so:

(“diet (coke OR pepsi)”) AND cherry NOT “diet coke fiend”

What if you want to find web pages that mention Diet Coke, Diet Pepsi or their cherry varieties? Hello again, sub-grouping!

(“diet (coke OR pepsi)” AND cherry OR “diet (coke OR pepsi)”) NOT “diet coke fiend”

Looks pretty crazy, I know, but what we’ve done is tell the software that we want Diet Coke or Diet Pepsi results (which may or may not include the word ‘cherry’), but not results that contain ‘diet coke fiend’. And we did it with far, far less hassle than if we’d had to explicitly type out each variant. Ah, the magic of efficiency!
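If you would rather not take the equivalence on faith, a small truth-table check can compare the long spelled-out cherry query with the grouped one. This is a sketch under a few assumptions: each phrase is reduced to a simple true/false "does the page contain it" flag, a bare NOT is read as AND NOT, and AND is grouped before OR. The function names are invented for the example.

```python
from itertools import product


def expanded(coke: bool, pepsi: bool, cherry: bool, fiend: bool) -> bool:
    # The spelled-out query, one clause per variant
    return (
        (coke and not fiend)
        or (coke and cherry and not fiend)
        or (pepsi and not fiend)
        or (pepsi and cherry and not fiend)
    )


def grouped(coke: bool, pepsi: bool, cherry: bool, fiend: bool) -> bool:
    # Stands in for the phrase "diet (coke OR pepsi)"
    diet_cola = coke or pepsi
    return (diet_cola and cherry or diet_cola) and not fiend


# Try every combination of the four flags and collect any disagreement
mismatches = [
    combo
    for combo in product([False, True], repeat=4)
    if expanded(*combo) != grouped(*combo)
]

print(mismatches)  # []
```

An empty list means the two queries agree on all 16 combinations. The check also makes one thing visible: because ‘cherry’ is optional here, the cherry clause does not change which pages match.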

There is in fact a wealth of additional options and parameters we can make use of in building our searches. Simple operators and grouping are just the tip of the iceberg when we’ve got word proximity, boosting, field searching, wildcards and more to play with, but hopefully this post has given you some insight into how to go about making better queries in the search for the best data. Even if you’re using a search that can’t utilize these features, it’s supremely helpful to know how to frame a question in your head beforehand when time is not always on your side.

Now we’re in a better position to build smarter searches from the get-go, or even go back and edit our searches if new information comes up. What happens if Diet Coke Fiend’s blog now has a co-author going by the name “Diet Pepsi Challenged”? Now we know how to add in that exclusion without having to create a brand-new query! “Mountain Dude” started writing? Just add it to the NOT string!
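Here is one way that kind of maintenance could look if you keep your search terms in a small script. It is a hypothetical helper, not a feature of any particular search tool: `build_query` is a made-up name, and it assumes your tool accepts a parenthesized group after NOT, which is worth confirming before relying on it.

```python
def build_query(include: list[str], exclude: list[str]) -> str:
    """Build a grouped Boolean search string from two lists of phrases."""
    # Quote each phrase and OR the wanted ones together inside one group
    include_group = " OR ".join(f'"{phrase}"' for phrase in include)
    query = f"({include_group})"
    if exclude:
        # One grouped NOT covers every excluded phrase
        exclude_group = " OR ".join(f'"{phrase}"' for phrase in exclude)
        query += f" NOT ({exclude_group})"
    return query


exclusions = ["diet coke fiend"]
exclusions.append("diet pepsi challenged")  # the new co-author
exclusions.append("mountain dude")          # and another one

query = build_query(["diet coke", "diet pepsi"], exclusions)
print(query)
# ("diet coke" OR "diet pepsi") NOT ("diet coke fiend" OR "diet pepsi challenged" OR "mountain dude")
```

Adding a new exclusion is then a one-line change to the list, and the rest of the query never has to be retyped.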

Time, thou art saved.

As you can see, Boolean operators and grouping are wonderful ways to cut down on waste and eliminate chances for error. More importantly, they’re absolutely essential in weeding out irrelevant material when asking complex questions. At the end of the day, our data-driven decisions have to be powered by the best data, and better-built searches are how we get it.

Trial and error in building queries and search strings can lead to discovery, but it can also waste valuable time and effort that would be better spent analyzing the best results.

Images: foodaddictionconfessions, Aminia’s Voice
