Use of the Oracle RDBMS in the CILKS Project
A relational DBMS was used for the project because it provided a flexible
storage structure for complex data, and a powerful high-level query
language (SQL), enabling the characteristics of the data to be thoroughly
explored. A data model was devised to ensure that all thesaurus
relationships could be properly represented and exploited. The following
tables were held in the database:
- Basic entities:
  - Term(term_no, text, status, qualifier, no_of_postings_in_documents,
    number_of_related_terms)
  - Word(word, frequency, stem, suffix)
  - Class_descriptor(class_code, descriptor)
- Thesaurus term relationships:
  - Equivalence(preferred_term_no, lead_in_term_no)
  - Hierarchical(broader_term_no, narrower_term_no)
  - Associative(term_no1, term_no2)
- Term-word and term-class relationships:
  - Component(term_no, word)
  - Class(term_no, class_code)
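
The table list above can be sketched as executable DDL. This is only an illustration under stated assumptions: SQLite (via Python's sqlite3 module) stands in for Oracle, the column types and key constraints are guesses, column names are written with underscores throughout, and the helper preferred_text and the two sample terms are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all column types are guesses.
SCHEMA = """
CREATE TABLE term (
    term_no                     INTEGER PRIMARY KEY,
    text                        TEXT NOT NULL,
    status                      TEXT,
    qualifier                   TEXT,
    no_of_postings_in_documents INTEGER,
    number_of_related_terms     INTEGER
);
CREATE TABLE word (word TEXT PRIMARY KEY, frequency INTEGER,
                   stem TEXT, suffix TEXT);
CREATE TABLE class_descriptor (class_code TEXT PRIMARY KEY, descriptor TEXT);

-- Thesaurus term relationships: each row links two rows of term.
CREATE TABLE equivalence  (preferred_term_no INTEGER REFERENCES term(term_no),
                           lead_in_term_no   INTEGER REFERENCES term(term_no));
CREATE TABLE hierarchical (broader_term_no   INTEGER REFERENCES term(term_no),
                           narrower_term_no  INTEGER REFERENCES term(term_no));
CREATE TABLE associative  (term_no1 INTEGER REFERENCES term(term_no),
                           term_no2 INTEGER REFERENCES term(term_no));

-- Term-word and term-class relationships.
CREATE TABLE component (term_no INTEGER REFERENCES term(term_no),
                        word    TEXT    REFERENCES word(word));
CREATE TABLE class     (term_no    INTEGER REFERENCES term(term_no),
                        class_code TEXT    REFERENCES class_descriptor(class_code));
"""


def preferred_text(conn, lead_in_text):
    """Resolve a lead-in term to its preferred term via equivalence, or None."""
    row = conn.execute(
        """SELECT p.text
             FROM term l
             JOIN equivalence e ON e.lead_in_term_no = l.term_no
             JOIN term p        ON p.term_no = e.preferred_term_no
            WHERE l.text = ?""",
        (lead_in_text,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Invented sample rows, not taken from the project's thesaurus.
conn.executemany("INSERT INTO term (term_no, text) VALUES (?, ?)",
                 [(1, "automobiles"), (2, "cars")])
conn.execute("INSERT INTO equivalence VALUES (1, 2)")
print(preferred_text(conn, "cars"))
```

Keeping each relationship type in its own two-column table means every thesaurus link can be followed with an ordinary join against Term.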
Within this environment, a breadth-first search through the thesaurus
network can be performed by repeated joins on the relationship tables,
each round of joins extending the search by one link.
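
The repeated-join search can be sketched as follows. This is a minimal illustration, not the project's code: SQLite stands in for Oracle, the frontier work table and the function breadth_first are hypothetical, the sample links are invented, and it is an assumption that hierarchical and associative links are both followed in either direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchical (broader_term_no INTEGER, narrower_term_no INTEGER);
CREATE TABLE associative  (term_no1 INTEGER, term_no2 INTEGER);
-- Hypothetical work table holding the terms reached at the current level.
CREATE TABLE frontier (term_no INTEGER PRIMARY KEY);
""")

# One round of joins: every term one link away from the current frontier.
NEIGHBOURS_SQL = """
SELECT h.narrower_term_no FROM frontier f
  JOIN hierarchical h ON h.broader_term_no = f.term_no
UNION
SELECT h.broader_term_no FROM frontier f
  JOIN hierarchical h ON h.narrower_term_no = f.term_no
UNION
SELECT a.term_no2 FROM frontier f
  JOIN associative a ON a.term_no1 = f.term_no
UNION
SELECT a.term_no1 FROM frontier f
  JOIN associative a ON a.term_no2 = f.term_no
"""


def breadth_first(conn, start_term_no, max_depth):
    """Return {term_no: depth} for terms within max_depth links of the start."""
    depth_of = {start_term_no: 0}
    frontier = [start_term_no]
    for depth in range(1, max_depth + 1):
        if not frontier:
            break
        # Load the current level into the work table, then join once.
        conn.execute("DELETE FROM frontier")
        conn.executemany("INSERT INTO frontier VALUES (?)",
                         [(t,) for t in frontier])
        rows = conn.execute(NEIGHBOURS_SQL).fetchall()
        frontier = []
        for (term_no,) in rows:
            if term_no not in depth_of:  # skip terms already reached
                depth_of[term_no] = depth
                frontier.append(term_no)
    return depth_of


# Invented sample links: 1 is broader than 2 and 3, 2 is broader than 4,
# 3 is associated with 5, and 5 with 6.
conn.executemany("INSERT INTO hierarchical VALUES (?, ?)",
                 [(1, 2), (1, 3), (2, 4)])
conn.executemany("INSERT INTO associative VALUES (?, ?)", [(3, 5), (5, 6)])
print(breadth_first(conn, 1, 2))
```

Each pass through the loop issues the same join query against a new frontier, so the depth limit directly bounds the number of join rounds.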
The interactive navigation function was
implemented using the Oracle Forms3 4GL application environment.
Details of all moves through the thesaurus, including terms seen and terms
selected, were saved in further database tables, and SQL query and report
scripts were used to generate navigation logs.
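
A navigation log of this kind might look like the sketch below. The actual log tables are not described here, so the navigation_log table, its columns, the report query, and the sample rows are all hypothetical, and SQLite again stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical log table: one row per term displayed during a move.
conn.execute("""
CREATE TABLE navigation_log (
    session_no INTEGER,
    move_no    INTEGER,
    term_no    INTEGER,
    selected   INTEGER   -- 1 if the user selected the term, else 0
)""")

# Invented sample rows for two sessions.
conn.executemany("INSERT INTO navigation_log VALUES (?, ?, ?, ?)", [
    (1, 1, 10, 0), (1, 1, 11, 1), (1, 2, 12, 1),
    (2, 1, 10, 0),
])

# Report: moves made, terms seen and terms selected in each session.
REPORT_SQL = """
SELECT session_no,
       COUNT(DISTINCT move_no) AS moves,
       COUNT(*)                AS terms_seen,
       SUM(selected)           AS terms_selected
  FROM navigation_log
 GROUP BY session_no
 ORDER BY session_no
"""


def navigation_report(conn):
    """Return one (session_no, moves, terms_seen, terms_selected) row per session."""
    return conn.execute(REPORT_SQL).fetchall()


for row in navigation_report(conn):
    print(row)
```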
Likewise, user relevance judgements on documents retrieved by the original
and enhanced queries were logged in database tables for detailed analysis
using SQL query and report scripts. See relevance
statistics.
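
As an illustration of the kind of analysis such tables allow, the sketch below counts judged and relevant documents for original and enhanced queries. The relevance_log table, its columns, and the sample judgements are hypothetical, and SQLite stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per user judgement on a retrieved document.
conn.execute("""
CREATE TABLE relevance_log (
    query_no    INTEGER,
    query_type  TEXT,     -- 'original' or 'enhanced'
    document_no INTEGER,
    relevant    INTEGER   -- 1 if judged relevant, else 0
)""")

# Invented sample judgements for a single query.
conn.executemany("INSERT INTO relevance_log VALUES (?, ?, ?, ?)", [
    (1, "original", 100, 1), (1, "original", 101, 0),
    (1, "enhanced", 100, 1), (1, "enhanced", 102, 1), (1, "enhanced", 103, 0),
])

# Report: documents judged and documents judged relevant, per query type.
SUMMARY_SQL = """
SELECT query_type,
       COUNT(*)      AS judged,
       SUM(relevant) AS judged_relevant
  FROM relevance_log
 GROUP BY query_type
 ORDER BY query_type
"""


def relevance_summary(conn):
    """Return (query_type, judged, judged_relevant) rows, ordered by type."""
    return conn.execute(SUMMARY_SQL).fetchall()


for row in relevance_summary(conn):
    print(row)
```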