SIMS 202 Information Organization and Retrieval  

Assignment 7 


Assigned 10/14. Due 10/26
Readings:   Modern IR, Ch. 1, 4.1 - 4.3

The goals of this assignment are to:

  • Give you experience with sophisticated query languages.
  • Have you consider how search results differ based on how a query is formulated.
  • Have you consider how search results differ based on which collection is searched.
  • Give you more experience learning a complex interface quickly.
You may even get to research something that you are interested in!

This Lexis-Nexis interface is a conversion from their TTY interface into a graphical one. I believe there is also a web-based interface, but I think it is a useful exercise to try this one out. The questions below are intended to get you started and give you some guidance. Mainly you should view this as an opportunity to learn by doing and exploring. Feel free to talk with other students about it, but do your own work. It should be fun!

Logistics

First, you need to get a Lexis-Nexis account and booklet from Roberta. (Once you have an account, you can use the software in the lab, or copy the software from the SIMS Resources directory or the web. Ask Roberta for details.) Note: the last part of this assignment has to be done from a machine connected to the UCB intranet.

From the lab machines, you can invoke the Lexis-Nexis program from Start Menu -> Programs -> Research & Analysis -> Lexis-Nexis 7.12. Sign on using the account number that Roberta gives you. While you're doing the exercises below you might get signed off. Don't worry if this happens; just sign on again.

The following directions may work only under Windows NT, as I have not tested them on any other platform. Adjust if necessary. I am not describing every step in detail here; one goal of this assignment is to practice getting familiar quickly with complex, unfamiliar software. If things get wedged, just try again.

Getting Started

The Lexis-Nexis booklet can help you get started (but don't believe everything it says). Alternatively, the online help pages discuss much of the same information and are very useful.

Activities and Questions

For many of the questions below, reading the help pages under basic searching is useful.

    (1) What is the syntax for stemming of query terms? In other words, how can you compactly indicate assignments OR assigns OR assignment ?

Read through the help pages for Boolean searches. Read about Connectors, including changing connector sequences and combining connectors. (Connector is another word for operator.)

    (2a) What connector makes it easy to retrieve the various ways a name might appear?

    (2b) What is the difference, for Boolean queries, between connectors and commands in Lexis-Nexis?

    (2c) How are phrases indicated in Boolean search for Lexis-Nexis? Is there a way to search on phrases of more than two words?

Lexis-Nexis has its own notion of precedence ordering. Read about Combining Connectors.

    (3a) What is the precedence ordering of the Lexis-Nexis operators (connectors)?

    (3b) In what order does the system interpret the following query?

      drug W/2 cure! W/5 disease OR malaria W/3 jungle OR rain AND forest

Read through the help pages for selecting sources.

Bring up a search dialog box (there are several ways to do this; one way is to click on the green B on the iconic menu bar). To get a list of sources you can click on the More Sources button from the Search window (yes, even if you don't have any sources selected yet).

Once you do this you can see a list of available sources. In the lefthand window navigate down into

Lexis-Nexis

    -> News
      -> By Industry & Topic
        -> Computing & Technology
Select the following folders in the righthand side and click on the "Add" button. You can't select them all at once, unfortunately. The interface seems rather picky about what you can do when.

    Computer/Communication News, Current (CMPCOM;CURNWS)
    Computer/Communication Archive News (CMPCOM;ARCNWS)
    Computer/Communication Stories (CMPCOM;ALLNWS)
    Computer News (CMPCOM;CMPTRS)
    Computer Stories (MARKET;CMPNWS)

(You can see the abbreviations for the source by making the sources window larger so you can see the information under Type, Library, and File more thoroughly.)

To get out of source selection mode click on "source info".

Under the Settings menu button, under the Searcher option, set the Save Searches option to "Save Automatically". This will allow you to more easily modify earlier queries.

Specify the following as your query in the Boolean query window:

    hal varian

You get an option to view the results as Full text or as KWIC (keyword in context). You can switch between these from the View menu as you look at the results. Adjust the size of the KWIC window using Variable KWIC. (Read about KWIC and its variations in the help menus.) The Browse menu item lets you view more of the document.
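If the idea of a KWIC view is new to you, the following minimal Python sketch may help. It is only an illustration of the general keyword-in-context technique, not a description of how Lexis-Nexis implements it. The function name kwic, the window parameter, and the sample sentence are all made up for this example.

```python
def kwic(text, keyword, window=3):
    """Return one snippet per occurrence of keyword, with up to
    `window` words of context on each side (hypothetical helper)."""
    words = text.split()
    snippets = []
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring surrounding punctuation.
        if word.strip(".,;:!?\"'()").lower() == keyword.lower():
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            snippets.append(" ".join(words[start:end]))
    return snippets


# Made-up sample text, used only to show the windowing.
sample = ("Economist Hal Varian spoke about pricing. "
          "Later, Varian discussed search engines at length.")

for snippet in kwic(sample, "varian", window=2):
    print(snippet)
```

Raising window shows more context around each hit, which is roughly the effect of enlarging the KWIC window in the interface.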

    (4a) How many documents are brought back?
    (4b) Upon what criteria do the documents seem to be ordered?
    (4c) Can the ordering of documents returned for Boolean queries be adjusted by options in the system?

To get back to the search dialog box click on the magnifying glass icon in the menubar.

Modify your Boolean search by selecting "Modify Search" from the Search menu, or by selecting the magnifying glass icon in the menubar. Modify the search with

    AND Microsoft

in the entry form that appears just below the menubars.

    (5) By how many documents does this reduce the search results?
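To see why adding an AND level can only shrink a result set, consider this small Python sketch. It models Boolean AND as set intersection over a toy collection. The three documents, the helper contains_all, and the variable names are invented for illustration, and the sketch makes no claim about how Lexis-Nexis itself interprets the query hal varian.

```python
# Toy collection: document id -> text (made up for this example).
docs = {
    1: "hal varian on information economics",
    2: "microsoft and hal varian on a panel",
    3: "microsoft product news",
}


def contains_all(words):
    """Ids of documents containing every word in `words`
    (whole-word, case-insensitive; hypothetical helper)."""
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(w.lower() in text.lower().split() for w in words)
    }


# Stand-in for whatever the first query retrieved.
first_level = contains_all(["hal", "varian"])

# Adding "AND Microsoft" intersects with the documents mentioning Microsoft.
second_level = first_level & contains_all(["microsoft"])

print(sorted(first_level))
print(sorted(second_level))
print(len(first_level) - len(second_level), "document(s) removed")
```

Because the second level is an intersection with the first, it can never contain a document the first level did not.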

Now select the freestyle search tab on the search dialog box. Issue the same query but in freestyle form:

    hal varian microsoft

You might want to view the results with SuperKWIC. This only works within freestyle queries and is supposed to pick the best paragraph within the document to show you.

    (6) How do these results differ from the first Boolean search?

    (7) Do a search in which you specify some kind of metadata (other than date) such as author. How did you specify it?

Now browse around the sources. Pick a subarea such as Public Records and find a source of interest within there.

For some sources, like the EASY;NEWS source in the Easy source directory, you don't have to specify a query. If you don't, the system shows you an overview of the contents of the source, which you can navigate by clicking on numbers.

    (8-10) For three queries/topics that are of interest to you:

      (a) State what your information need is.
      (b) State which source(s) you are searching on.
      (c) Specify the query in the Boolean format -- what is it?
      (d) How many documents did you get back?
      (e) Reformulate the query by using Modify Search (adding levels). If you got many documents back, try to reduce the number returned; otherwise try to increase it.
      Did this work? If not, why do you think it didn't?
      (f) Now specify the more complex query in Freestyle format. Do the sets of returned documents seem different in Boolean vs. Freestyle? Can you describe the difference qualitatively?

    As part of one of the Boolean queries above, do something with synonym search. How did you specify it? What happened?

    Look at the "segment" facility. Make use of it as part of one of the queries above. What did you do?

In general, get to know how to use as many search options as possible. Try to see how changing your query changes the kinds of results you get back.

Now check out the WWW interface to Lexis-Nexis. Note: this will only work if you access it from a machine within the UCB intranet.

    (11) Try to repeat one or more of the queries from above and then answer the following questions:
      (a) Name two ways the WWW interface is better than the GUI interface.
      (b) Name one way the GUI interface is better than the WWW interface.