Relevance Statistics

A comparison was made between the performance of a) the original query and b) various forms of enhanced query, based on user relevance judgements. In a probabilistic system all retrieved documents are ranked, and the best form of a query is one which places most relevant documents near the top of the ranked list, e.g. in the first 20 positions. The following table shows, for users participating in the second evaluation experiment, the number of "hits" (relevant documents retrieved) for four query types at various document levels, i.e. cutoff positions in the ranked list.
              1            2           3         4
Document   Original    Thesaurus    Hybrid   Free Text
 Level      Query     Terms Only     Query      Only
 -----     --------   ----------   --------  ---------
     5         46          35          51         32
    10         85          68          91         66
    20        157         146         152        119
    40        175         167         184        152
    75        201         185         228        180
   150        215         207         267        213
   300        249         219         288        245
   500        260         235         305        270
 
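The hybrid query scores better than the other two enhanced queries (thesaurus terms only and free text only) at every document level. It also scores better than the original query at every level except 20, where it is slightly worse (152 hits against 157).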
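
As an illustration of how figures of this kind can be derived and checked, the following is a minimal sketch in Python. It assumes that a "hit" is a document judged relevant by the user and that a document level is a cutoff position in the ranked list; the function names and the toy ranking are hypothetical and are not taken from the system described here. The HITS dictionary simply repeats the table.

```python
# Hit counts copied from the table: level -> (original, thesaurus, hybrid, free text).
HITS = {
    5: (46, 35, 51, 32),
    10: (85, 68, 91, 66),
    20: (157, 146, 152, 119),
    40: (175, 167, 184, 152),
    75: (201, 185, 228, 180),
    150: (215, 207, 267, 213),
    300: (249, 219, 288, 245),
    500: (260, 235, 305, 270),
}


def hits_at_levels(ranked_doc_ids, relevant_ids, levels):
    """Count relevant documents within the top k positions, for each cutoff k.

    Assumption: a "hit" is a ranked document that the user judged relevant.
    """
    relevant = set(relevant_ids)
    return {
        k: sum(1 for doc in ranked_doc_ids[:k] if doc in relevant)
        for k in levels
    }


def levels_where_hybrid_trails_original(table):
    """Return the document levels at which the hybrid query has fewer hits."""
    return [
        level
        for level, (original, _thesaurus, hybrid, _free_text) in sorted(table.items())
        if hybrid < original
    ]


# Hypothetical toy ranking and judgements, for illustration only.
toy_ranking = ["d7", "d2", "d9", "d4", "d1", "d8", "d3"]
toy_relevant = {"d2", "d4", "d3"}
toy_hits = hits_at_levels(toy_ranking, toy_relevant, levels=(2, 5, 7))

trailing_levels = levels_where_hybrid_trails_original(HITS)
```

Applied to the table, the second function returns only document level 20, the one level at which the original query has more hits than the hybrid.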

More detailed comparisons between the hybrid and the original query indicated that:

For 50% of users the enhanced query was more successful, for 45% it was less successful, and for 5% the performance was the same. When using the thesaurus, the successful users had been more selective in their choice of terms and tended to comment that their enhanced query had become "more specific".

Though the proportion of relevant documents found by the original and hybrid queries was roughly similar, the documents themselves were rarely the same. Query enhancement had a drastic re-ordering effect on the original ranked lists, finding or "promoting" new relevant documents while "demoting" others which had previously been near the top. It could not of course actually lose any relevant documents, but might push them below the point where they would be seen by a user interacting with the system under normal circumstances.
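
The re-ordering effect can be made concrete with a small sketch. It assumes that a user sees only the first `cutoff` positions of a ranked list (20 is used as the default, following the example given earlier), and it treats a relevant document as "promoted" if it is inside the cutoff only for the enhanced query and "demoted" if it is inside the cutoff only for the original query. The function name and the sample rankings are hypothetical.

```python
def promoted_and_demoted(original_ranking, enhanced_ranking, relevant_ids, cutoff=20):
    """Compare which relevant documents fall inside the cutoff for each query.

    Returns (promoted, demoted) as sorted lists of document ids:
      promoted: relevant, inside the cutoff for the enhanced query only
      demoted:  relevant, inside the cutoff for the original query only
    """
    relevant = set(relevant_ids)
    seen_original = set(original_ranking[:cutoff]) & relevant
    seen_enhanced = set(enhanced_ranking[:cutoff]) & relevant
    promoted = sorted(seen_enhanced - seen_original)
    demoted = sorted(seen_original - seen_enhanced)
    return promoted, demoted


# Hypothetical rankings of the same six documents, for illustration only.
original = ["a", "b", "c", "d", "e", "f"]
enhanced = ["d", "a", "f", "b", "c", "e"]
relevant = {"b", "d", "f"}

promoted, demoted = promoted_and_demoted(original, enhanced, relevant, cutoff=3)
```

In this toy case both rankings contain all three relevant documents, so nothing is lost, yet with a cutoff of 3 the enhanced query brings "d" and "f" into view and pushes "b" out of it.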