This KnowledgeBase item gives a very
good explanation of how the Fuzzy Search option works in Notes, and
how you can adjust it to get the level of "fuzziness" you
want.
How Does the Fuzzy Search Option Work?
Document Number:
1088269
Problem
In the Notes client, the Fuzzy Search option
is available on the Search Bar of a full-text-indexed database. What
type of results does the Fuzzy Search option generate?
Content
The power of Fuzzy Search is that it finds results
that are not an exact match to the query term. Fuzzy Search logic
can be thought of as an "Expanded Or" search that allows users
to find as many of the query terms as possible, but not necessarily all of them.
Fuzzy Search logic performs searches based on the similarity of character
strings, not on meaning. The Fuzzy Search logic used by the
Notes Client is designed specifically for text searching in which the logic
can recognize incomplete hits in a document's text.
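The "Expanded Or" idea can be illustrated with a minimal Python sketch. This is not the Notes implementation: the function name `expanded_or_rank` is hypothetical, and it matches whole words exactly rather than by character similarity. It only shows how a document can qualify by containing some, but not all, of the query terms.

```python
def expanded_or_rank(query, documents):
    """Return documents containing at least one query term, best match first.

    Hypothetical helper for illustration; exact word matching is a
    simplification of the character-similarity matching described above.
    """
    terms = query.lower().split()
    scored = []
    for doc in documents:
        words = set(doc.lower().split())
        hits = sum(1 for term in terms if term in words)
        if hits > 0:
            scored.append((hits, doc))
    # Documents matching more terms come first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]
```

A document that contains only one of two query terms is still returned, just ranked below a document that contains both.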
This can be very important
when documents contain errors or variations in the words or terms. For
example, documents that are converted through optical character recognition
(OCR) may have many unrecognized characters within the words. These
types of errors cannot be completely compensated for by wildcards and
word stemming because there is no way to predict where the errors may occur.
If the error occurs in the stem word, the wildcard character and
word stemming methods are ineffective. Fuzzy Search logic determines
that, if the hit term has at least some of the characters of the query
term, it may be a valid hit.
Fuzzy Search finds matches using the base
word described in the Notes Client Help under the topic "Word Variants"
(as shown in the Supporting Information section below). The base word
starts from the left side of the query term. Its size is determined by
the Matchinglevel parameter, but it must be at least three letters long.
The Matchinglevel parameter determines what percentage of
the word needs to be matched. The
default value for this parameter is 75%.
Using the example "Rossberg", 75%
of the word to match would be Rossbe. If a user types in Rossburg
with a "u", the base word is Rossbu, so documents containing Rossberg
with an "e" are not returned because Rossberg does not match that base word.
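The base-word arithmetic can be sketched in Python. This is an illustration only: the names `base_word` and `is_candidate` are hypothetical, and rounding the length down is an assumption, since the article does not state the rounding rule Notes uses.

```python
MIN_BASE_LEN = 3  # minimum base-word length stated in the article


def base_word(query, matching_level=75):
    """Return the leftmost matching_level percent of query.

    Returns None when the result would be shorter than three letters.
    Rounding down is an assumption, not documented Notes behavior.
    """
    length = (len(query) * matching_level) // 100
    if length < MIN_BASE_LEN:
        return None
    return query[:length]


def is_candidate(query, word, matching_level=75):
    """True if word begins with the base word derived from query."""
    base = base_word(query, matching_level)
    return base is not None and word.lower().startswith(base.lower())
```

With the default of 75, `base_word("Rossberg")` gives `"Rossbe"` and `base_word("Rossburg")` gives `"Rossbu"`, so `is_candidate("Rossburg", "Rossberg")` is false under these assumptions.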
To change the Matchinglevel parameter, type
the following in the Search bar:
matchinglevel XX searchword
where XX is a number between 5 and 95, and
"searchword" is the word to match. Typing zero for this
number yields zero matches for the base word, and 100 yields only exact
matches. Matchinglevel and the number must be entered to the left
of the word to match.
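As a small sketch, the query text could be assembled like this in Python. The helper name `build_fuzzy_query` is hypothetical; it only formats the string described above and enforces the 5 to 95 range, and it does not call any Notes API.

```python
def build_fuzzy_query(search_word, matching_level):
    """Format a Search bar query of the form 'matchinglevel XX searchword'.

    Hypothetical helper: validates the 5-95 range given in the article
    and places the keyword and number to the left of the search word.
    """
    if not 5 <= matching_level <= 95:
        raise ValueError("matching_level must be between 5 and 95")
    if not search_word.strip():
        raise ValueError("search_word must not be empty")
    return f"matchinglevel {matching_level} {search_word.strip()}"
```

For example, `build_fuzzy_query("Rossburg", 60)` produces the text `matchinglevel 60 Rossburg`.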
In the case of double letters, a word is still
returned if one of the doubled letters is missing from the query. Using
the Rossberg example, searching on Rosberg with one "s" will return documents
containing Rossberg with two "s" characters.
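One way to picture the double-letter behavior is to collapse repeated letters before comparing. The Python sketch below is an assumption about how such tolerance could work, not a description of the Notes internals, and the function names are hypothetical. Note that it collapses any run of repeated letters, not only pairs.

```python
from itertools import groupby


def collapse_doubles(word):
    """Lowercase word and reduce each run of a repeated letter to one letter."""
    return "".join(letter for letter, _ in groupby(word.lower()))


def same_ignoring_doubles(query, word):
    """True if query and word differ only in repeated letters."""
    return collapse_doubles(query) == collapse_doubles(word)
```

Under this sketch, Rosberg and Rossberg compare as equal, while Rossburg and Rossberg do not.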
NOTE: Fuzzy Search is not designed
to work on small words. Using the example of searching for the names
James and Janet, if Janes is the search word and matchinglevel is set to
40%, the resulting base word is "Ja". Fuzzy Search will
not work with a base word of this size.
Supporting Information:
Examples of the different types of searches performed
by Fuzzy Search:
Various expressions of phrases
Search for: new technology
Returns: new CMOS technology
Search for: user requirement
Returns: user group requirement OR
user has a requirement
Incorrect Writing
Search for: Califorrnia (misspelled)
Returns: California (spelled correctly)
Search for: Palalto (misspelled)
Returns: Palo Alto (spelled correctly)
Example of Inflection Searching
Search for: communication
Returns: communicate OR communicating
OR communi-cation
Search for: Study
Returns: studies OR studied OR studio
Note: For the query term Study, studio
is a valid Fuzzy Search result, but it may not be a result that
the user expects.
Alternative expression
Search for: database
Returns: data-base OR data base
Search for: run-time
Returns: runtime OR run time
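The alternative-expression examples above can be approximated by removing hyphens and spaces before comparing. This Python sketch is only an illustration of that idea, with a hypothetical function name. The article does not say how Notes handles these cases internally.

```python
import re


def normalize_compound(text):
    """Lowercase text and strip whitespace and hyphens."""
    return re.sub(r"[\s\-]+", "", text.lower())
```

With this normalization, "database", "data-base", and "data base" all reduce to the same string, as do "run-time", "runtime", and "run time".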
From the Notes Client Help (R5):
Use Word Variants
This option finds words with the base word
+ certain suffixes. For example, a search for "swim" will also
find "swims," "swimming," "swimmer," and
even "swimmed." It will not find the variation "swam,"
however, because the base word has changed, or "swimmet," or
"swimsed" because the suffixes are not acceptable with
that word.
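The Word Variants rule quoted above can be sketched in Python as "base word plus an accepted suffix". The suffix list and the allowance for a doubled final consonant are assumptions chosen to reproduce the "swim" examples. They are not the actual rules used by Notes, and `is_word_variant` is a hypothetical name.

```python
# Hypothetical suffix list chosen to reproduce the "swim" examples.
ACCEPTED_SUFFIXES = {"s", "ing", "er", "ed"}


def is_word_variant(base, candidate):
    """True if candidate is base plus an accepted suffix.

    A doubled final consonant before the suffix is allowed (assumption),
    so "swim" + "m" + "ing" counts as a variant.
    """
    base, candidate = base.lower(), candidate.lower()
    if not base or not candidate.startswith(base):
        return False
    rest = candidate[len(base):]
    if rest == "" or rest in ACCEPTED_SUFFIXES:
        return True
    # Allow one repeat of the base word's last letter before the suffix.
    return rest[0] == base[-1] and rest[1:] in ACCEPTED_SUFFIXES
```

Under these assumptions, "swims", "swimming", "swimmer", and "swimmed" are accepted for the base word "swim", while "swam", "swimmet", and "swimsed" are rejected.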