SOCKER Kernel Evaluation: an Analysis of Searching Behaviour,
User Perceptions and System Performance
Introduction
This project involved the Centre acting as a test site for an EU-funded
project by carrying out user operational testing for the SOCKER kernel,
which is based on the Z39.50 protocol. The test focused on the ability
of the system to search different servers or targets with a range of
databases from two remote clients in a networked environment.
The user interfaces of the two clients were compared: a graphical
user interface for the Information Services Workstation (ISW), and
a command-based interface for the Network Service Point (NSP). In
ISW, servers are grouped into various sets described as 'profiles'.
Users are offered a choice of profiles, within which individual
servers may be selected for searching. In NSP, users are presented
directly with a single list of servers to choose from. The experiment
required users to carry out three set information retrieval tasks on
one of the two clients to test specific aspects of user behaviour and
system functionality.
The aims, objectives and methodology of the experiment are briefly
described below.
Aims and objectives
The aim of the evaluation was to test the implementation of a
system allowing for multi-target searching from a common client
interface over a communication network. Specific evaluation objectives
were as follows:
- To compare the ease of use of the different client interfaces
for query formulation and searching in a distributed environment,
- To test the system's retrieval effectiveness for searching
multiple databases, and
- To determine the user satisfaction and perception of the system.
Methodology
Three search tasks were set to test the different aspects of
the system functionality. User behaviour in formulating queries
was monitored quantitatively, as far as possible from transaction
logs, complemented by questionnaire data and comments from test
administrators. More qualitative data and analysis were also provided
by direct user observations made by the administrator at the
City University site.
Search tasks tested a combination of features including:
- searching for an item unique to a small number of targets,
across many targets;
- finding duplicate records from various targets to locate
different editions of a work;
- searching for a multi-faceted topic in several targets.
Data was collected and analysed from three sources:
- Questionnaires: these sought information about users' experience
and familiarity with different online systems, how the various
tasks were undertaken and users' reactions to the system interfaces
and facilities.
- Transaction logs: these gave details of servers selected
and accessed, connection times and search formulation details.
The latter included terms used, and term modification and search
fields employed.
- Observation: this method includes notes made by site co-ordinators
and the research assistant at City University who administered
the test. Much of this material is qualitative, but the findings
made by observation were complemented by the quantitative data
derived from the logs. Because the data set collected at City
University was more complete, it was possible to undertake more
detailed analysis on this sub-set.
Conclusions and recommendations
The operational pilot testing clearly demonstrated the successful
implementation of the SOCKER Z39.50 kernel. However, in assessing
the viability and quality of such a service for future development,
other interrelated factors needed to be taken into consideration
at different levels. These included:
- The mapping of record formats and searchable fields across
multiple databases,
- Interface features to support effective query formulation
and searching,
- The division of labour between the client and the server in
the network environment.
In addition the pilot test also raised questions regarding the
experimental design and methodology of the evaluation.
Experimental design and methodology
There were a number of constraints in the test administration
and data collected which need to be overcome in future trials.
Firstly, greater effort should be made to recruit end users as
test subjects as well as library professional staff, since presumably
GUI clients are primarily aimed at end users. Secondly, data collection
requires not only more automated methods but also complementary
methods for eliciting information from users, through online
questionnaires and transaction logging.
User performance and search tasks
The three search tasks set for the evaluation experiments were
intended to represent standard bibliographic enquiries and the
given order was based on an increasing degree of difficulty. Contrary
to expectations, Task 1, the specific item search, turned out to
be the most difficult, with a substantial number of users failing
to complete it. Conversely, Task 3, the subject search, was the
easiest, with the majority of users succeeding. It is possible
that the considerable effort required for Task 1 served to familiarise
users with the system, to the benefit of subsequent tasks. However,
it appears that this environment is more amenable to a general
browsing task than to a specific item search, unlike searching
individual library catalogues.
Test subjects experienced problems in searching multiple servers
and databases for two main reasons:
- They were unable to make an informed choice on the most appropriate
server/database to search because of the lack of information on
the sources available as well as the random presentation or listing
of the profiles, servers and databases.
- They could not specify query terms which readily matched appropriate
searchable fields in the bibliographic record. This was particularly
problematic for end users.
It would appear that the logical grouping of available servers/databases
is a paramount requirement. Even though defaults should probably
be set to reduce the effort required by users in making choices,
searchers need to have some flexibility and be able to easily
intervene and tailor their own selection if necessary.
Although search intermediaries and end users may be aware that
there is no common approach to searching different
databases due to variations in indexing policies, the inconsistencies
are greatly magnified in attempting to search multiple sources
from a client with a common interface. The poor mapping of searchable
fields in spite of the standard MARC format remains a major obstacle
for effective searching.
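
The field-mapping problem can be illustrated with a minimal sketch.
It assumes a hypothetical table recording which search fields each
target supports; the server names and their supported sets are invented
for illustration and are not taken from the test data. The numbers are
the Bib-1 Use attribute values commonly used with Z39.50. The function
name plan_search is likewise an assumption, not part of the SOCKER
kernel.

```python
# Bib-1 Use attribute values commonly used with Z39.50
# (listed here for illustration only).
BIB1_USE = {"author": 1003, "title": 4, "subject": 21, "isbn": 7, "any": 1016}

# Hypothetical targets and the Use attributes each one supports.
TARGET_SUPPORT = {
    "server_a": {4, 7, 21, 1003, 1016},
    "server_b": {4, 1003, 1016},
    "server_c": {1016},
}


def plan_search(field, targets, support=TARGET_SUPPORT):
    """Sort targets by how well they can serve a search on `field`.

    'exact'       : the target supports the requested field.
    'fallback'    : only an 'any' (keyword) search is possible.
    'unsupported' : the target is unknown or supports neither.
    """
    wanted = BIB1_USE[field]
    fallback = BIB1_USE["any"]
    plan = {"exact": [], "fallback": [], "unsupported": []}
    for target in targets:
        attrs = support.get(target, set())
        if wanted in attrs:
            plan["exact"].append(target)
        elif fallback in attrs:
            plan["fallback"].append(target)
        else:
            plan["unsupported"].append(target)
    return plan
```

A client could use such a plan to warn the searcher before a query is
sent that some of the selected targets cannot search the chosen field,
rather than leaving the mismatch to surface as an unexplained failure.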
Interaction and the client interfaces
The searching behaviour of the test subjects and the ease with which
they were able to carry out the tasks were influenced by the different
interface environments of the two clients. Although the test subjects
of the NSP and ISW clients were primarily experienced with the
interaction styles of the respective command and windows based
interfaces, some contrasting responses were observed or expressed.
- The number and presentation of options for selecting multiple
servers/databases and the choices for formulating queries were more
complex for ISW than for NSP searchers, and this was not necessarily
beneficial.
- Feedback information and concurrent displays made it much easier
for ISW searchers to keep track of the system's searching activity
and contributed to greater user satisfaction.
- Search response times for the ISW local client were more acceptable
to searchers than those of the remote NSP client.
The increased number of options which can be presented to a user
in a GUI environment can easily increase the cognitive load on
the user and impede user/system interaction. More attention needs
to be paid to striking a balance between user and system control
in the search dialogue and process, and to determining to what extent
the system's operations are made apparent to the user.
System performance
The general impression is that the system does provide a usable
search facility,
but that there is considerable redundancy which might affect its
viability if adopted on a large scale. Some of the things which
appear to be happening are:
- multiple associations in force for the same server simultaneously,
- associations remaining in force when no longer required,
- the same database being searched on different servers,
- searches on inappropriate record fields.
The questionnaire data confirms that users in general had little
understanding of the implications of their decisions regarding
choice of server, for example, and were not aware when it gave
rise to duplication of effort. Some of these problems could be
solved by giving users more information and more control, for
instance by allowing them to terminate unwanted associations explicitly.
However, this approach is somewhat at variance with the presumed
objective of the system being evaluated, to provide users with
a simple front-end and make sensible decisions in the background
on their behalf. So the better way forward may be to build more
intelligence into the client software, to eliminate some redundant
message interchanges and reduce network traffic accordingly.
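
As a rough illustration of what such client-side intelligence might
look like, the sketch below keeps at most one association per server,
closes associations that have been idle for too long, and avoids
searching the same database through more than one server. It is a
sketch under stated assumptions, not a description of the SOCKER
client: the class and function names are hypothetical, the connect
and close callables are supplied by the caller, and two servers are
assumed to offer the same database when the database names match.

```python
import time


class AssociationPool:
    """Keeps at most one open association per server and closes idle ones.

    `connect` and `close` are caller-supplied callables (hypothetical
    here); the pool itself performs no network I/O.
    """

    def __init__(self, connect, close, idle_timeout=60.0, clock=time.monotonic):
        self._connect = connect
        self._close = close
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._open = {}  # server -> (association, time last used)

    def get(self, server):
        """Reuse the open association for `server`, opening one only if needed."""
        entry = self._open.get(server)
        assoc = entry[0] if entry else self._connect(server)
        self._open[server] = (assoc, self._clock())
        return assoc

    def close_idle(self):
        """Close associations unused for longer than the idle timeout."""
        now = self._clock()
        closed = []
        for server, (assoc, last_used) in list(self._open.items()):
            if now - last_used > self._idle_timeout:
                self._close(assoc)
                del self._open[server]
                closed.append(server)
        return closed


def dedupe_targets(selected):
    """Keep the first server listed for each database name.

    Assumes that identical database names on different servers refer
    to the same database, which may not hold in practice.
    """
    seen = set()
    unique = []
    for server, database in selected:
        if database not in seen:
            seen.add(database)
            unique.append((server, database))
    return unique
```

The design choice here is to make these decisions in the background,
in keeping with the aim of a simple front-end, while still leaving
room for the searcher to override the selection of targets.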