How to Search with dtSearch Web

Overview

dtSearch supports two types of search requests. A natural language search is any sequence of text, like a sentence or a question. After a natural language search, dtSearch sorts retrieved documents by their relevance to your search request.

A boolean search request consists of a group of words or phrases linked by connectors, such as AND and OR, that indicate the relationship between them. Examples:

apple and pear          Both words must be present
apple or pear           Either word can be present
apple w/5 pear          Apple must occur within 5 words of pear
apple not w/5 pear      Apple must not occur within 5 words of pear
apple and not pear      Apple must be present and pear must not be present
name contains smith     The field name must contain smith
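To picture what a proximity connector such as w/5 checks, here is a minimal Python sketch. It is an illustration only, not how dtSearch is implemented: the function name within and the simple lowercase, whitespace-based word splitting are assumptions, and it does not address how noise words are counted.

```python
def within(text, first, second, distance):
    """Return True if `first` occurs within `distance` words of `second`.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    # Any pair of occurrences close enough together satisfies the request.
    return any(abs(i - j) <= distance
               for i in first_positions
               for j in second_positions)


doc = "an apple fell from the tree beside the old pear orchard"
print(within(doc, "apple", "pear", 5))   # like: apple w/5 pear
print(within(doc, "apple", "pear", 10))  # like: apple w/10 pear
```

In this sample text, apple and pear are 8 words apart, so the first check fails and the second succeeds.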

If you use more than one connector, you should use parentheses to indicate precisely what you want to search for. For example, apple and pear or orange could mean (apple and pear) or orange, or it could mean apple and (pear or orange).
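The two readings can give different results for the same document. The short Python sketch below evaluates both groupings against a sample document that contains only the word orange. The sample text and variable names are made up for illustration.

```python
# Words of a hypothetical document that mentions orange but not apple or pear.
doc_words = set("fresh orange slices".split())

apple = "apple" in doc_words
pear = "pear" in doc_words
orange = "orange" in doc_words

grouping_a = (apple and pear) or orange   # (apple and pear) or orange
grouping_b = apple and (pear or orange)   # apple and (pear or orange)

print(grouping_a)  # True: the document is retrieved
print(grouping_b)  # False: the document is not retrieved
```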

Noise words, such as "if" and "the", are ignored in searches.

Search terms may include the following special characters:

?     Matches any single character. Example: appl? matches apply or apple.
*     Matches any number of characters. Example: appl* matches application.
~     Stemming. Example: apply~ matches apply, applies, applied.

Words and Phrases

You do not need to use any special punctuation or commands to search for a phrase. Simply enter the phrase the way it ordinarily appears. You can use a phrase anywhere in a search request. Example:

apple w/5 fruit salad

If a phrase contains a noise word, dtSearch will skip over the noise word when searching for it. For example, a search for statue of liberty would retrieve any document containing the word statue, any intervening word, and the word liberty.
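One way to picture this behavior is to treat each noise word in the phrase as a placeholder that accepts any single word. The Python sketch below does that. The NOISE_WORDS list and the phrase_matches function are hypothetical and are not taken from dtSearch.

```python
# Hypothetical, very short noise word list for illustration only.
NOISE_WORDS = {"of", "the", "if"}


def phrase_matches(text, phrase):
    """Return True if `phrase` occurs in `text`, letting any word stand in
    for a noise word in the phrase."""
    words = text.lower().split()
    terms = phrase.lower().split()
    for start in range(len(words) - len(terms) + 1):
        window = words[start:start + len(terms)]
        if all(term in NOISE_WORDS or term == word
               for term, word in zip(terms, window)):
            return True
    return False


print(phrase_matches("a statue for liberty stands here", "statue of liberty"))
print(phrase_matches("the statue stands for liberty", "statue of liberty"))
```

The first call matches because exactly one word sits between statue and liberty. The second does not, because two words intervene.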

Punctuation inside a search word is treated as a space. Thus, can't would be treated as a phrase consisting of two words: can and t. Likewise, 1843(c)(8)(ii) would become 1843 c 8 ii (four words).
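The effect of this rule can be sketched in a few lines of Python. The function name split_search_word is an assumption, and the sketch treats every character other than an ASCII letter or digit as punctuation, so it does not account for the special search characters ?, * and ~.

```python
import re


def split_search_word(word):
    """Replace each non-alphanumeric character with a space, then split."""
    return re.sub(r"[^0-9A-Za-z]", " ", word).split()


print(split_search_word("can't"))           # ['can', 't']
print(split_search_word("1843(c)(8)(ii)"))  # ['1843', 'c', '8', 'ii']
```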

Wildcards (* and ?)

A search word can contain the wildcard characters * and ?. A ? in a word matches any single character, and a * matches any number of characters. The wildcard characters can be in any position in a word. For example:

appl* would match apple, application, etc.

*cipl* would match principle, participle, etc.

appl? would match apply and apple but not apples.

ap*ed would match applied, approved, etc.

Use of the * wildcard character near the beginning of a word will slow searches somewhat.
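If you are familiar with regular expressions, the wildcard rules above can be pictured as a translation of ? to "any one character" and * to "any run of characters". The Python sketch below shows that translation. The function names are hypothetical, and case-insensitive matching is an assumption made for the example.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a search word containing * and ? into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any number of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption for this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


for pattern, word in [("appl*", "application"), ("*cipl*", "principle"),
                      ("appl?", "apply"), ("appl?", "apples"),
                      ("ap*ed", "approved")]:
    print(pattern, word, matches(pattern, word))
```

Only the appl? and apples pair prints False, because ? stands for exactly one character.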

Natural Language Searching

A natural language search request is any combination of words, phrases, or sentences. After a natural language search, dtSearch sorts retrieved documents by their relevance to your search request. Weighting of retrieved documents takes into account: the number of documents each word in your search request appears in (the more documents a word appears in, the less useful it is in distinguishing relevant from irrelevant documents); the number of times each word in the request appears in the documents; and the density of hits in each document. Noise words and search connectors like NOT and OR are ignored.
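The Python sketch below shows how the three factors described above could be combined into a single score. The formula is invented for illustration and is not dtSearch's actual ranking method. The function name relevance, the logarithmic weight, and the sample documents are all assumptions.

```python
import math


def relevance(request_words, documents):
    """Return one score per document; higher means more relevant.

    Illustrative formula only, combining:
      - a weight that shrinks as a word appears in more documents,
      - the number of hits for the word in the document,
      - the density of those hits (hits divided by document length).
    """
    tokenized = [doc.lower().split() for doc in documents]
    scores = []
    for words in tokenized:
        score = 0.0
        for term in request_words:
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0:
                continue
            weight = math.log(1 + len(tokenized) / doc_freq)
            hits = words.count(term)
            density = hits / len(words) if words else 0.0
            score += weight * hits * (1 + density)
        scores.append(score)
    return scores


docs = ["apple pie recipe", "apple apple pear tart", "car repair manual"]
scores = relevance(["apple", "pear"], docs)

# List documents from most to least relevant.
for score, doc in sorted(zip(scores, docs), reverse=True):
    print(round(score, 3), doc)
```

With these sample documents, the second document ranks first: it contains both request words, and pear, which appears in only one document, carries the larger weight.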

Stemming

Stemming extends a search to cover grammatical variations on a word. For example, a search for fish would also find fishing. A search for applied would also find applying, applies, and apply. There are two ways to add stemming to your searches:

  1. Check the Stemming box in the search form to enable stemming for all of the words in your search request. Stemming does not slow searches noticeably and is almost always helpful in making sure you find what you want.
  2. If you want to add stemming selectively, add a ~ at the end of words that you want stemmed in a search. Example: apply~
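If you build search requests in a script, selective stemming amounts to appending ~ to the chosen words. The Python helper below is a hypothetical convenience for doing that. It is not part of dtSearch and only edits the request text.

```python
def add_stemming(request, words_to_stem):
    """Append ~ to each word of `request` that appears in `words_to_stem`."""
    targets = {w.lower() for w in words_to_stem}
    return " ".join(word + "~" if word.lower() in targets else word
                    for word in request.split())


print(add_stemming("apply w/5 license", ["apply"]))  # apply~ w/5 license
```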