Text searches

Use of wildcards in keyword searches

Wildcards can be used to maximize the effectiveness of your searches.  They are simply symbols used in a search term to represent one or more characters.  

Different search engines use wildcards differently.  

In the Lens, the two most common ones are:

1. The “*” (asterisk) symbol is used to represent multiple characters. It is typically placed at the end of a stem word, which is referred to as “truncation.” This is most useful when you want to search for variable endings of a stem word (treat* or develop*), concepts with similar spelling, plurals, or misspelled and variant spellings (center and centre).

Compare searching:

Title: treatment cancer oligonucleotide

to

Title: treat* cancer* oligo*

treat*: matches treatment(s), treating, treats, etc.

cancer*: matches cancerous, cancers, etc.

oligo*: matches oligomer(s), oligonucleotide, oligos, etc.

2. The second wildcard symbol is “?” (question mark). It represents a single character within the word. It is most useful when there are variable spellings for a word and you want to search for all variants at once.

Example: Title: Ne?t or Labo?r
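As a rough illustration of how the two wildcards behave, the sketch below uses Python's standard fnmatch module, where "*" stands for any run of characters and "?" stands for exactly one character. This is not the Lens's own matching code; the helper name wildcard_matches and the word list are made up for the example.

```python
from fnmatch import fnmatchcase

def wildcard_matches(pattern: str, words: list[str]) -> list[str]:
    """Return the words matched by a wildcard pattern, ignoring case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

# Hypothetical word list, chosen only to show the matching behaviour.
words = ["treatment", "treating", "treats", "retreat",
         "next", "neat", "net", "labour", "labor"]

print(wildcard_matches("treat*", words))  # ['treatment', 'treating', 'treats']
print(wildcard_matches("Ne?t", words))    # ['next', 'neat']
# With "?" read as exactly one character, 'labor' is not matched here.
print(wildcard_matches("Labo?r", words))  # ['labour']
```

In this sketch "treat*" does not match "retreat", because the pattern must match from the start of the word, and "Ne?t" does not match "net", because "?" must consume one character.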

Wildcards cannot be used as part of phrase queries; they work only on single terms such as ‘develop*’. Here are some basic query rules for wildcards to remember:

  1. Wildcard symbols are ignored in phrases.
  2. Wildcards can NOT be used as the first character of the search term (for example, ‘*evelop’).
  3. Wildcard terms are not stemmed.  

In the Lens, however, search terms are stemmed by default, which means that the search index takes into consideration the root form of a word. Because the rule states that wildcard query terms are NOT stemmed, an unstemmed wildcard term is compared against a stemmed index, and this makes the use of wildcards unintuitive in the Lens.
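The three query rules listed above can be summarised in a small sketch. The function name classify_term and the returned labels are hypothetical, chosen for illustration; the code only restates the rules and does not reflect how the Lens parses queries internally.

```python
def classify_term(term: str) -> str:
    """Describe how a single search term is treated under the three rules."""
    is_phrase = len(term) >= 2 and term.startswith('"') and term.endswith('"')
    if is_phrase:
        # Rule 1: wildcard symbols inside a phrase are ignored.
        return "phrase: wildcard symbols ignored"
    if term[:1] in ("*", "?"):
        # Rule 2: a wildcard cannot be the first character of the term.
        return "invalid: leading wildcard"
    if "*" in term or "?" in term:
        # Rule 3: wildcard terms are not stemmed.
        return "wildcard term: not stemmed"
    return "plain term: stemmed by default"

for term in ('"develop*"', "*evelop", "develop*", "developing"):
    print(term, "->", classify_term(term))
```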

The examples below show how the results of wildcard queries vary depending on whether stemming in the Lens search index is ON or OFF.

Wildcard query: Title:developing*

  Stemming ON: 11 results. This term has a wildcard, so under the rule it is not stemmed and will only match words that start with developing and have not been stemmed to develop. It WILL NOT match ‘developing’ in the index, as that would have been stemmed to develop, but it will match the typo developingbrick.

  Stemming OFF: 75,817 results. This search term will match any word starting with developing.

Wildcard query: Title:develop*

  Stemming ON: 164,960 results. develop* has a wildcard and is not stemmed, but it is roughly equivalent to searching for all terms that stem to develop, as it will match any text stemmed to or starting with develop, including the typo developd.

  Stemming OFF: 164,960 results. develop* has a wildcard and is not stemmed, but it will match any text starting with develop.

Wildcard query: Title:"develop*"

  Stemming ON: 163,944 results. The wildcard modifier is ignored in a phrase, so this search will not bring more results than Title:develop, which already captures the different endings of the root word.

  Stemming OFF: 1,217 results. The wildcard modifier is ignored in a phrase, and as stemming is off, this search will only do an exact match on the word develop.
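The pattern in these examples can be reproduced with a toy model. The sketch below is an illustration under stated assumptions, not the Lens implementation: toy_stem is a hypothetical stand-in for a real stemmer, search is a made-up helper, and the six-word list replaces the real index, so the counts only mirror the shape of the results above, not the actual numbers.

```python
from fnmatch import fnmatchcase

def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: strips a few common endings."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def search(query: str, words: list[str], stemming: bool) -> list[str]:
    """Return the original words whose index entry matches the query."""
    index = [toy_stem(w) if stemming else w for w in words]
    is_phrase = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    if not is_phrase and ("*" in query or "?" in query):
        # Wildcard term: never stemmed, compared directly with index entries.
        return [w for w, entry in zip(words, index) if fnmatchcase(entry, query)]
    # Phrase: wildcard symbols are dropped. Plain term: used as typed.
    term = query.strip('"').replace("*", "").replace("?", "") if is_phrase else query
    if stemming:
        term = toy_stem(term)
    return [w for w, entry in zip(words, index) if entry == term]

# Assumed sample data, including the two typos mentioned in the examples.
words = ["develop", "developing", "developed", "develops",
         "developingbrick", "developd"]
for query in ("developing*", "develop*", '"develop*"'):
    on = len(search(query, words, stemming=True))
    off = len(search(query, words, stemming=False))
    print(f"{query}: stemming ON = {on}, stemming OFF = {off}")
# developing*: stemming ON = 1, stemming OFF = 2
# develop*: stemming ON = 6, stemming OFF = 6
# "develop*": stemming ON = 4, stemming OFF = 1
```

With stemming on, developing* finds only the typo developingbrick, because the indexed form of developing is develop. The quoted query finds far fewer words once stemming is switched off, because only the exact word develop remains a match.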