School of Information Management & Systems
Previously School of Library & Information Studies
University of California at Berkeley
OASIS Research Program Overview
Principal Investigator:
Michael Buckland
See also
OASIS homepage and
OASIS publications.
In the new networked environment, multiple computers and multiple databases can be
used in conjunction. OASIS is a long-term research program exploring this "extended retrieval"
through analysis and prototyping.
OASIS actively contributes to, and is in large part funded by, Berkeley's
Digital Library Project,
which is funded by NSF, ARPA, and NASA. OASIS also receives funding from the
U.S. Department of Education under the Higher Education Act II and from DARPA.
- Front-end prototyping
- Strategic search commands
- Vocabulary issues: Entry vocabulary; using unfamiliar vocabularies
- Network searching
- Filing, ordering, sorting and collection development
- Analysis of retrieval
- The relationship between Recall & Precision
FRONTEND PROTOTYPING
A workstation can be used as a "frontend":
- To prototype enhancements to an online retrieval service, such as the MELVYL online catalog, through the pre-processing of queries and post-processing of retrieval results in the frontend.
- To generate simultaneous searches to multiple databases ("CHECK" command) and to combine diverse retrieval results.
STRATEGIC SEARCH COMMANDS
Early OASIS research concentrated on the problems of coping with excessive or inadequate retrieval results from Boolean searches in the MELVYL online catalog.
- The "FEWER" command helps reduce overly large retrieval results by
facilitating the use of Boolean AND commands to limit by date,
language, location, and format.
- The "SUMMARIZE" command (aka AGGREGATE, ZOOM) analyzes and presents an
aspect of a retrieved set (e.g. distribution of subject headings) to
facilitate the selective expansion of a search.
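
The two commands above can be illustrated with a minimal sketch. It is not the OASIS or MELVYL implementation: the record layout, field names, and function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical records; the field names are assumptions, not MELVYL fields.
RECORDS = [
    {"title": "Online catalogs", "year": 1991, "language": "eng",
     "subjects": ["Online catalogs", "Information retrieval"]},
    {"title": "Katalogisierung", "year": 1978, "language": "ger",
     "subjects": ["Cataloging"]},
    {"title": "Boolean searching", "year": 1993, "language": "eng",
     "subjects": ["Information retrieval"]},
]

def fewer(records, language=None, since=None):
    """Narrow a retrieved set by ANDing optional limits together."""
    kept = records
    if language is not None:
        kept = [r for r in kept if r["language"] == language]
    if since is not None:
        kept = [r for r in kept if r["year"] >= since]
    return kept

def summarize(records, field="subjects"):
    """Count how often each value of a multi-valued field occurs in the set."""
    return Counter(value for r in records for value in r.get(field, []))

english = fewer(RECORDS, language="eng")
heading_counts = summarize(english).most_common()
```

Here `fewer` shrinks the set by adding limits, while `summarize` shows the distribution of subject headings so that a searcher could pick one to expand on.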
VOCABULARY ISSUES
How should a searcher choose the right subject heading? Building on research
in the CHESHIRE project, a "vocabulary index" accepts natural language
terms and generates a ranked list of subject headings most likely to
be useful. A preliminary version for the INSPEC subject headings has
been developed as part of the Computer Science Technical Reports
project.
A Vocabulary Index has three uses:
- As a prompt when searching an unfamiliar vocabulary
- As computer-aided indexing
- To extend searches, using the titles and abstracts of found records as a basis for finding similar records in the same or another database.
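
As a rough illustration of the idea, the sketch below ranks subject headings by how often they co-occur with the words of a query in previously indexed records. This simple co-occurrence count is a stand-in chosen for brevity, not the method used in the CHESHIRE project, and the training records and headings are invented examples rather than real INSPEC data.

```python
from collections import Counter, defaultdict

# Hypothetical (text, assigned headings) pairs; not real INSPEC records.
TRAINING = [
    ("neural network learning", ["Neural nets"]),
    ("learning in expert systems", ["Expert systems", "Learning systems"]),
    ("network routing protocols", ["Computer networks"]),
    ("neural learning rules", ["Neural nets", "Learning systems"]),
]

def build_index(training):
    """Map each word to a count of the headings assigned alongside it."""
    index = defaultdict(Counter)
    for text, headings in training:
        for word in set(text.lower().split()):
            index[word].update(headings)
    return index

def suggest(index, query, limit=3):
    """Return headings ranked by summed co-occurrence with the query words."""
    scores = Counter()
    for word in set(query.lower().split()):
        scores.update(index.get(word, {}))
    return scores.most_common(limit)

index = build_index(TRAINING)
suggestions = suggest(index, "neural learning")
```

The same ranked list could serve as a prompt for a searcher or as a suggestion to an indexer, depending on where the natural language text comes from.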
NETWORK SEARCHING
The "CHECK" command sends simultaneous, identical search queries to
multiple databases and reports the number of hits in each as a basis
for deciding which database(s) to search.
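
A minimal sketch of this behaviour follows, with small in-memory lists standing in for remote databases. The database names, their contents, and the `check` function are hypothetical, and no real network protocol is involved.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote databases: name -> list of titles.
DATABASES = {
    "catalog": ["digital libraries", "digital archives", "online retrieval"],
    "reports": ["retrieval evaluation", "digital signal processing"],
    "abstracts": ["protein folding"],
}

def count_hits(titles, term):
    """Count the titles containing the search term (case-insensitive)."""
    return sum(1 for t in titles if term.lower() in t.lower())

def check(term, databases=DATABASES):
    """Send the same query to every database at once and report hit counts."""
    with ThreadPoolExecutor(max_workers=len(databases)) as pool:
        futures = {name: pool.submit(count_hits, titles, term)
                   for name, titles in databases.items()}
        counts = {name: f.result() for name, f in futures.items()}
    # Highest hit count first, to suggest where to search in full.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

report = check("digital")
```

Only the counts come back, so the searcher can decide which database(s) merit a full search before retrieving any records.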
ORDERING IN INFORMATION RETRIEVAL
Alphabetical ordering, dominant in card catalogs, has been carried over
into online catalogs with deleterious results. Alternatives, such as
document ranking, subset ranking, and adaptive filtering, can yield
striking improvements when searching large online catalogs. See
"Filing, filtering and the first few found" INFORMATION TECHNOLOGY
AND LIBRARIES 12 (Sept 1993): 311-319.
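
The contrast between alphabetical filing and document ranking can be sketched as follows. The titles and the scoring rule (a count of matching query words) are assumptions chosen for brevity and are not taken from the cited article.

```python
# Hypothetical retrieved titles.
TITLES = [
    "Abstracts of annual reports",
    "Filing rules for library catalogs",
    "Ranking and filtering in online library catalogs",
    "Zoology field guide",
]

def score(title, query):
    """Count how many query words appear in the title."""
    words = set(title.lower().split())
    return sum(1 for term in query.lower().split() if term in words)

def alphabetical(titles):
    """Card-catalog style: order by title, ignoring the query."""
    return sorted(titles)

def ranked(titles, query):
    """Order by descending score, breaking ties alphabetically."""
    return sorted(titles, key=lambda t: (-score(t, query), t))

by_title = alphabetical(TITLES)
by_rank = ranked(TITLES, "online catalogs filtering")
```

With alphabetical order the first item shown is unrelated to the query, whereas the ranked order puts the best-matching title first, which matters when a searcher looks only at the first few items of a large set.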
THE COMPONENTS OF RETRIEVAL AND FILTERING SYSTEMS
What are the functional components of retrieval and filtering systems?
Analysis reveals that all selection systems are composed of sets of
representations of data objects and functions operating on them to
transform members of the set or to change the way they are ordered.
See "On the construction of selection systems" LIBRARY HI TECH
48 (1994): 15-28.
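
One way to picture this analysis is as a pipeline of functions over a set of record representations, where each function either transforms the members or reorders them. The sketch below is an illustrative reading only: the helper names and the sample records are hypothetical and do not come from the cited article.

```python
from functools import reduce

def transform(fn):
    """Build a stage that rewrites every representation in the set."""
    return lambda items: [fn(item) for item in items]

def order(key):
    """Build a stage that changes only the order of the set."""
    return lambda items: sorted(items, key=key)

def pipeline(*stages):
    """Compose stages left to right into one selection system."""
    return lambda items: reduce(lambda acc, stage: stage(acc), stages, items)

# Hypothetical representations of two data objects.
records = [{"title": "B", "year": 1990}, {"title": "A", "year": 1994}]

system = pipeline(
    transform(lambda r: {**r, "age": 1995 - r["year"]}),  # derive a new attribute
    order(lambda r: r["age"]),                            # newest first
)
result = system(records)
```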
THE INVERSE RELATIONSHIP BETWEEN PRECISION AND RECALL
For thirty years an empirical trade-off has been observed between Recall
(completeness of retrieval of relevant items) and Precision (avoidance
of retrieval of non-relevant items). For an explanation of why this
happens and how two-stage retrieval techniques can improve both
precision and recall, see "The relationship between recall and
precision" JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION
SCIENCE 45 (1994): 12-19.
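
The two measures can be made concrete with a small worked example. The document identifiers and the second-stage filter below are contrived for illustration; they show the definitions and the general shape of a two-stage search, not the analysis in the cited paper.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant items that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def precision(retrieved, relevant):
    """Fraction of the retrieved items that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

# Contrived document ids: items 1-4 are the relevant ones.
RELEVANT = {1, 2, 3, 4}

narrow = {1, 2, 9}                 # cautious query: misses relevant items
broad = {1, 2, 3, 4, 5, 6, 7, 8}   # generous query: finds all, with noise

# Hypothetical second stage: filter the broad set with an extra criterion.
second_stage = {doc for doc in broad if doc < 6}

broad_scores = (recall(broad, RELEVANT), precision(broad, RELEVANT))
two_stage_scores = (recall(second_stage, RELEVANT),
                    precision(second_stage, RELEVANT))
```

In this toy case the narrow query has recall 0.5, the broad query reaches recall 1.0 at precision 0.5, and filtering the broad set raises precision to 0.8 without losing recall.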