
What is Probabilistic Retrieval?

In probabilistic retrieval, each retrieved document is assigned a numeric weight that estimates the probability that it is relevant to the query. Documents are presented to the user in descending order of weight, so those considered the best matches appear at the top of the hit list.

The two most important factors in the weight calculation are:

When retrieving from databases of long documents, two other factors are considered:

Documents retrieved by the original query are presented to the user, who is asked to supply relevance feedback about them. Given a set of relevant documents, the original query terms are re-weighted (and additional terms extracted) in order to identify other similar documents. This is an effective method of query expansion.
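The re-weighting step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the CILKS or OKAPI implementation: it assumes the Robertson/Sparck Jones relevance weight with 0.5 smoothing, and it assumes new terms are ranked by the number of relevant documents containing them multiplied by that weight. The names rsj_weight, expand_query and doc_terms, and the toy collection, are hypothetical.

```python
import math


def rsj_weight(N, n, R, r):
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing (assumed formula).

    N: documents in the collection, n: documents containing the term,
    R: documents judged relevant, r: relevant documents containing the term.
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5))
                    / ((n - r + 0.5) * (R - r + 0.5)))


def expand_query(doc_terms, relevant_ids, query_terms, extra=1):
    """Re-weight the query terms and add the `extra` best new terms.

    doc_terms maps a document id to the set of terms it contains
    (a hypothetical in-memory stand-in for a real index).
    """
    N, R = len(doc_terms), len(relevant_ids)

    # stats[term] = [n, r]: document count and relevant-document count.
    stats = {}
    for terms in doc_terms.values():
        for term in terms:
            stats.setdefault(term, [0, 0])[0] += 1
    for doc_id in relevant_ids:
        for term in doc_terms[doc_id]:
            stats[term][1] += 1

    weights = {t: rsj_weight(N, n, R, r) for t, (n, r) in stats.items()}

    # Candidate expansion terms occur in at least one relevant document
    # and are not already in the query; rank them by r * weight.
    candidates = [t for t, (n, r) in stats.items()
                  if r > 0 and t not in query_terms]
    candidates.sort(key=lambda t: (-stats[t][1] * weights[t], t))
    new_terms = candidates[:extra]

    selected = [t for t in query_terms if t in weights] + new_terms
    return {t: weights[t] for t in selected}, new_terms


# Toy collection: the user judged d1 and d2 relevant to the query "retrieval".
docs = {
    "d1": {"okapi", "retrieval", "probabilistic"},
    "d2": {"okapi", "weighting", "retrieval"},
    "d3": {"cooking", "recipes"},
    "d4": {"retrieval", "boolean"},
    "d5": {"gardening", "recipes"},
    "d6": {"football"},
}
weights, added = expand_query(docs, {"d1", "d2"}, {"retrieval"})
print(added, weights)
```

In this toy collection the term "okapi" occurs in both relevant documents and in no others, so it is the term selected for expansion, and it receives a higher weight than the original query term "retrieval", which also occurs in a non-relevant document.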

The probabilistic retrieval system used for the CILKS project was OKAPI.