Interactive Thesaurus Navigation Procedure

The user enters a query, then navigates the thesaurus to select terms that may improve it. Information about each term (e.g. its status, number of document postings, and number of related terms) is always shown to assist the user's choice. If the query combines two or more separate topics (facets), each one may be explored individually. Each phase of the interaction is presented on an Oracle Form.

The first phase matches the initial query against one or more thesaurus terms. Since a real query rarely matches any term exactly, a best-match algorithm is used, and candidate terms are presented for selection in order of weight. As in probabilistic document retrieval, the weight of a term is based on the words it has in common with the incoming query and on the (inverse) frequency of those words in the thesaurus database.

See an example form showing terms matching an initial query.
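
The weighting described above can be sketched as follows. This is only an illustration under stated assumptions: the term list is invented, the tokenizer is a plain whitespace split, and the logarithmic inverse-frequency formula is one common choice rather than the formula the system is known to use.

```python
import math
from collections import Counter

# Hypothetical miniature thesaurus; the real term list is not given in the text.
TERMS = [
    "information retrieval",
    "document retrieval",
    "retrieval systems",
    "thesaurus construction",
    "document delivery",
]

def tokenize(text):
    """Lower-case whitespace split (an assumption; no stemming or stop list)."""
    return text.lower().split()

def word_weights(terms):
    """Inverse-frequency weight per word, from the number of terms containing it."""
    n = len(terms)
    df = Counter(w for t in terms for w in set(tokenize(t)))
    # Assumed formula: rarer words in the thesaurus receive higher weights.
    return {w: math.log((n + 1) / f) for w, f in df.items()}

def best_match(query, terms):
    """Return (term, score) pairs sharing a word with the query, best first."""
    weights = word_weights(terms)
    query_words = set(tokenize(query))
    scored = []
    for t in terms:
        shared = query_words & set(tokenize(t))
        if shared:
            scored.append((t, sum(weights[w] for w in shared)))
    # Highest score first; ties broken alphabetically for a stable display.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

ranked = best_match("retrieval of documents and document delivery", TERMS)
for term, score in ranked:
    print(f"{score:5.2f}  {term}")
```

In this toy data "retrieval" occurs in three of the five terms, so it contributes less to a score than the rarer "delivery", and terms sharing no word with the query are not listed at all.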

Some matching terms are lead-ins: if one of these is chosen, a list of its corresponding preferred terms is presented in a pop-up window for selection.

See an example form showing preferred terms for a lead-in.
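
A lead-in can be modelled as a mapping from a non-preferred term to one or more preferred terms. The sketch below assumes a simple in-memory dictionary; the name `USE` and the sample entries are hypothetical and do not come from the actual thesaurus.

```python
# Hypothetical lead-in table: non-preferred term -> preferred terms.
USE = {
    "cars": ["automobiles"],
    "IR": ["information retrieval", "infrared radiation"],
}

def resolve(term, use=USE):
    """Return the terms to offer for selection when `term` is chosen.

    A lead-in yields its preferred terms; any other term is returned as-is.
    """
    return use.get(term, [term])

print(resolve("IR"))           # candidates that a pop-up would list
print(resolve("automobiles"))  # not a lead-in, so offered unchanged
```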

Once chosen, any term can be used for expansion, i.e. all terms linked to it are displayed for selection, grouped according to relationship type. Terms already chosen are marked with an asterisk, and cannot be reselected.

See an example form showing term expansion.
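
Expansion amounts to collecting the links of the chosen term, grouping them by relationship type, and flagging terms that are already chosen. The following sketch assumes links are stored as (term, relationship, linked term) triples and uses the conventional BT/NT/RT labels; both the storage format and the sample data are assumptions.

```python
from collections import defaultdict

# Hypothetical links: (term, relationship type, linked term).
LINKS = [
    ("information retrieval", "BT", "information science"),
    ("information retrieval", "NT", "document retrieval"),
    ("information retrieval", "NT", "query expansion"),
    ("information retrieval", "RT", "indexing"),
]

def expand(term, links, chosen):
    """Group the terms linked to `term` by relationship type.

    Terms already in `chosen` are kept in the display but not selectable.
    """
    groups = defaultdict(list)
    for source, rel, target in links:
        if source == term:
            groups[rel].append(
                {"term": target, "selectable": target not in chosen}
            )
    return dict(groups)

def render(groups):
    """One heading per relationship type; an asterisk marks chosen terms."""
    lines = []
    for rel in sorted(groups):
        lines.append(rel)
        for entry in groups[rel]:
            mark = " " if entry["selectable"] else "*"
            lines.append(f"  {mark} {entry['term']}")
    return lines

chosen = {"information retrieval", "document retrieval"}
groups = expand("information retrieval", LINKS, chosen)
print("\n".join(render(groups)))
```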

As terms are chosen, they are placed in a "pool" which can be viewed by the user, and from which other terms for expansion can be selected. (They may also be deleted at this stage.) They are grouped by facet number, then ordered and indented so that the "path" which was followed to arrive at any particular term can easily be seen. Terms which have already been expanded are marked with an 'E'.

See an example form showing the current pool of terms.
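
One way to represent the pool is to record, for each chosen term, its facet number, the term whose expansion led to it, and whether it has itself been expanded. The sketch below is an assumption about the data structure, not a description of the actual Oracle tables; `PoolEntry` and `render_pool` are hypothetical names, and the parent links are assumed to form a tree.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PoolEntry:
    term: str
    facet: int
    parent: Optional[str] = None  # None for a term matched directly to the query
    expanded: bool = False

def render_pool(pool):
    """List the pool by facet, indenting each term under the term it came from."""
    lines = []

    def walk(entry, depth):
        flag = "E" if entry.expanded else " "
        lines.append(f"{flag} {'  ' * depth}{entry.term}")
        # Children are the entries reached by expanding this term.
        for child in pool:
            if child.facet == entry.facet and child.parent == entry.term:
                walk(child, depth + 1)

    for facet in sorted({e.facet for e in pool}):
        lines.append(f"Facet {facet}")
        for entry in pool:
            if entry.facet == facet and entry.parent is None:
                walk(entry, 0)
    return lines

POOL = [
    PoolEntry("information retrieval", 1, None, True),
    PoolEntry("document retrieval", 1, "information retrieval", True),
    PoolEntry("full text searching", 1, "document retrieval"),
    PoolEntry("thesauri", 2),
]
pool_lines = render_pool(POOL)
print("\n".join(pool_lines))
```

Reading the indentation upwards from any term gives the path by which it was reached, and the leading 'E' shows which terms have already been expanded.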

The navigation process continues until the user is satisfied that all relevant terms have been found. The whole collection of chosen terms is then exported to a file for submission as a query to the document database.