Abstract: Large Test Collection Experiments on an Operational, Interactive System: Okapi at TREC

The Okapi system has been used in a series of experiments on the TREC collections, investigating probabilistic models, relevance feedback and query expansion, and interaction issues. New probabilistic models have been developed, yielding simple weighting functions that take account of document length, within-document term frequency, and within-query term frequency; all three have been shown to be beneficial. Relevance feedback and query expansion are highly beneficial when based on large quantities of relevance data (as in the routing task). Interaction issues are much more difficult to evaluate in the TREC framework, and no benefits have yet been demonstrated from feedback based on small numbers of "relevant" items identified by intermediary searchers.
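
The abstract does not state the weighting functions themselves. As an illustration only, the sketch below shows one commonly cited form of term weight associated with Okapi (usually called BM25), which combines the three factors named above. The function name, argument names, and the default values of k1, b, and k3 are assumptions chosen for this example and are not taken from the abstract.

```python
import math


def bm25_term_weight(tf, qtf, doc_len, avg_doc_len, n_docs, doc_freq,
                     k1=1.2, b=0.75, k3=8.0):
    """Weight of one query term for one document (illustrative sketch).

    tf       : within-document term frequency
    qtf      : within-query term frequency
    doc_len  : length of this document (e.g. in tokens)
    k1, b, k3: tuning constants; the defaults here are assumed values
    """
    # Collection-frequency component: rarer terms receive a higher weight.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))

    # Document-length normalisation: longer-than-average documents
    # need more occurrences of the term to reach the same weight.
    K = k1 * ((1.0 - b) + b * doc_len / avg_doc_len)

    # Saturating within-document frequency component.
    doc_part = (k1 + 1.0) * tf / (K + tf)

    # Saturating within-query frequency component.
    query_part = (k3 + 1.0) * qtf / (k3 + qtf)

    return idf * doc_part * query_part


# Example: the same term in an average-length and a long document.
w_avg = bm25_term_weight(tf=3, qtf=1, doc_len=500, avg_doc_len=500,
                         n_docs=1000, doc_freq=10)
w_long = bm25_term_weight(tf=3, qtf=1, doc_len=2000, avg_doc_len=500,
                          n_docs=1000, doc_freq=10)
```

Under these assumptions, a document's score for a query would be the sum of such weights over the query terms. In the example, `w_long` is smaller than `w_avg` because the same three occurrences are spread over a document four times the average length.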