Evaluation Experiments
Controlled experiments were conducted with users (members of staff and
students within the School of Informatics at City University) who wished
to find relevant references in the INSPEC document database. Users were
asked to select thesaurus terms to enhance their query, then to provide
relevance judgements on sets of documents retrieved by a) the original
query, and b) various forms of enhanced query.
Data was collected automatically, in the form of
navigation logs, and also by short questionnaires administered to
users before, during and after their work with the thesaurus. Three
separate experiments were run, with about 50 participants in all.
- Experiment 1:
  - finalised the thesaurus navigation logging procedure, and the form
    of the reports to be generated,
  - identified user problems with thesaurus concepts and the navigation
    interface,
  - found that "enhanced" queries consisting only of
    controlled-language thesaurus terms would not on average outperform the
    original query,
  - established the need for user relevance judgements to be made on
    "pooled" document lists, to eliminate order effects.
- Experiment 2:
  - presented a more robust and convenient user interface, based on
    feedback from experiment 1,
  - provided additional information (abstracts) from retrieved
    documents for user relevance judgements,
  - investigated the performance of "hybrid" queries consisting of
    both original query terms and those extracted from the thesaurus,
  - logged all individual relevance judgements in the database for
    detailed analysis of the effects of query enhancement (see relevance statistics).
- Experiment 3 focused on the handling of multi-faceted queries.
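
The "hybrid" query and "pooled" document list ideas mentioned above can be illustrated with a minimal sketch. This is not the code used in the experiments: the function names, the case-insensitive de-duplication, the random shuffle used to hide which query retrieved each document, and the sample terms and document identifiers are all assumptions made for illustration.

```python
import random


def hybrid_query(original_terms, thesaurus_terms):
    """Combine original and thesaurus terms, dropping case-insensitive duplicates.

    Hypothetical helper: the source does not say how hybrid queries were built.
    """
    seen = set()
    combined = []
    for term in list(original_terms) + list(thesaurus_terms):
        key = term.lower()
        if key not in seen:
            seen.add(key)
            combined.append(term)
    return combined


def pool_results(result_lists, seed=None):
    """Merge the result lists of several queries into one pooled list.

    Each document appears once, and the pool is shuffled so that the user
    cannot tell which query retrieved it or at what rank (an assumed way
    of removing order effects). Returns the shuffled list and a mapping
    from document id to the set of queries that retrieved it.
    """
    sources = {}
    for query_name, doc_ids in result_lists.items():
        for doc_id in doc_ids:
            sources.setdefault(doc_id, set()).add(query_name)
    pooled = sorted(sources)
    random.Random(seed).shuffle(pooled)
    return pooled, sources


def precision(doc_ids, relevant):
    """Fraction of a query's retrieved documents that the user judged relevant."""
    if not doc_ids:
        return 0.0
    return sum(1 for d in doc_ids if d in relevant) / len(doc_ids)


if __name__ == "__main__":
    # Made-up query terms and document ids, for illustration only.
    terms = hybrid_query(["neural", "networks"], ["Networks", "Backpropagation"])
    results = {"original": ["d1", "d2", "d3"], "hybrid": ["d2", "d4"]}
    pooled, sources = pool_results(results, seed=0)
    judged_relevant = {"d2", "d4"}  # judgements made once, on the pooled list
    for name, docs in results.items():
        print(name, precision(docs, judged_relevant))
```

Because each document is judged once on the pooled list, the same judgement can be credited to every query that retrieved that document, which allows the original and enhanced queries to be compared on equal terms.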