HumanSaga - "Plotting the course of history"

HumanSaga Development Project

SIMS Masters Project
David Chott & Jonathan Henke

May 16, 2000


Introduction
Analysis of Existing Systems
Choice of Technology (Constraints)
Intellectual Property Agreement
Overview of the Development Process
     User Needs Analysis: Process, Conclusions
     Low Fidelity Prototype
     User Testing: Process, Conclusions
     First Interactive Prototype
     Heuristic Evaluation: Process, Conclusions
     Second Interactive Prototype

Database (Functionality)

Design of Categorization Structure
Object-Oriented Programming

System Interface and Functionality
     Exploration
     Integration of Searching and Browsing
     Related Timelines
     Inline Links
     Hierarchy Bar
     Previous timelines
     Merged Timelines
     Graphic Design
Future Development


Introduction

HumanSaga is a World Wide Web site backed by a database of historical events. Users explore history by creating and interacting with dynamic historical timelines.

HumanSaga improves on the simple timeline in several ways by treating a timeline as a set of separate, individual events. The advances stem primarily from the realization that events should be treated individually and assigned individual metadata as an aid to retrieval. This is a departure from the standard practice of dealing with timelines only as pre-coordinated packages that can be interacted with only as whole sets of events. This change in perspective breaks the traditional document retrieval model because, in this new view, retrieved "documents" are really composed of many related smaller documents (i.e., the events themselves). Users of HumanSaga, rather than seeking a handful of relevant documents, instead create meaningful groupings of related documents -- timelines of events. Although in some situations the user will be interested only in a single event document, the desired results in the majority of situations will be sets of events that are meaningful only when viewed together. This follows naturally from the linear and causal nature of history.

Consequently, HumanSaga adds substantial value over existing timeline systems, which make events available only in pre-packaged sets that may or may not meet users' needs. The metadata assigned to each event in HumanSaga allows events to be retrieved on the basis of when they happened, where they happened, what subject (literature, politics, etc.) they deal with, and any keywords appearing in the event description. In addition to this powerful functionality, the interface offers users numerous access points to clarify, expand, and narrow down their timelines without having to start a new search. The system was developed and refined through an iterative cycle of development and user testing.

Analysis of Existing Systems

Before designing and implementing the system, a wide survey of timelines on the Web was conducted. Paper-based and some non-Web computer-based timelines and systems were evaluated as well. The survey resulted in several general observations. Most importantly, all examples were static timelines stored in text or HTML files rather than in a database. Although some examples include navigation in time, it was never anything more sophisticated than browsing from static page to static page usually broken apart by decade. Hyperlinks might be present, and they would typically link to a narration of some kind about the person or thing made into a link. Only rarely would a timeline site provide a search facility, and even then it was only a keyword search across the static pages on the site.

From this research, it seemed putting events into a database would be something of a revolution in this area. All of the observed approaches used static lists of events as the smallest unit of retrieval. Being able to search and retrieve events using a system which treats each event as the smallest unit of retrieval would allow for much more interesting possibilities. When each event is treated as a single unit, metadata can be assigned to each event to provide for retrieval based on any combination of event-level criteria such as date, location, etc. That is the most fundamental advancement the HumanSaga system contributes to this space and this difference is what makes most of the system's innovative functionality possible. Timelines can now be built on the fly composed only of events matching the user's criteria -- permitting an almost unlimited number of ways to view events in the system.
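The idea of event-level retrieval can be illustrated with a minimal Python sketch. It is not the project's actual PHP/MySQL code; the `Event` fields, the `build_timeline` function, and the sample records are hypothetical names and data chosen only to show how a timeline might be assembled on the fly from any combination of criteria.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical event record; the real schema is not described at this level.
    year: int
    place: str
    subjects: set
    description: str


def build_timeline(events, start=None, end=None, place=None,
                   subject=None, keyword=None):
    """Return the events matching every supplied criterion, oldest first."""
    def matches(e):
        if start is not None and e.year < start:
            return False
        if end is not None and e.year > end:
            return False
        if place is not None and e.place != place:
            return False
        if subject is not None and subject not in e.subjects:
            return False
        if keyword is not None and keyword.lower() not in e.description.lower():
            return False
        return True

    return sorted((e for e in events if matches(e)), key=lambda e: e.year)


# Illustrative sample data only.
EVENTS = [
    Event(1962, "Sweden", {"Literature"},
          "John Steinbeck wins the Nobel Prize in Literature"),
    Event(1969, "United States", {"Exploration"},
          "Apollo 11 lands on the Moon"),
    Event(1901, "United Kingdom", {"Literature"},
          "The First Men in the Moon is published"),
]

literature = build_timeline(EVENTS, subject="Literature", start=1900, end=1970)
moon = build_timeline(EVENTS, keyword="moon")
```

Because every criterion is optional, the same function serves a pure keyword search, a pure date-range browse, or any mixture of the two.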

Choice of Technology (Constraints)

Implementation of the designed system required selection of specific software packages. In addition to selecting which platform to use, the choice of Web server, DBMS, and scripting language had to be made. The constraints influencing this selection process were several, but cost and difficulty of use were the primary factors. Cost considerations almost immediately excluded using Microsoft Windows and other Microsoft products. Not surprisingly, Linux made the most sense for the operating system platform because it is free and runs a lot of powerful software. Apache was the natural choice for a Web server because it is also free and has proven to be both powerful and reliable. The remaining decisions about the database and programming language were not as immediately obvious. Rather than conducting a lengthy research and comparison process to select this software, an expert was consulted for advice. The MySQL database server matched with the PHP scripting language was recommended as being sufficiently powerful to meet the project requirements while also being free. Follow-up research showed that MySQL would indeed meet the demands of the database-reliant HumanSaga application. PHP is an exceedingly popular scripting language to use with MySQL, and the pair have been thoroughly tested together in various production environments.

Neither MySQL nor PHP was supported at SIMS when this project was initiated, so an old Pentium 166 desktop computer was set up as a Linux server running all the necessary software and connected to the campus network from the International House dormitory. With the exception of a complete reinstall following the crash of a faulty hard drive (on Valentine's Day!), the server and software have performed flawlessly. We have been exceedingly pleased with the power and flexibility that these free, open source software packages have provided.

Intellectual Property Agreement

One of the challenges we faced was reaching an agreement about the ownership and intellectual property rights of the HumanSaga system. We entered the project with different goals. David hoped to continue developing the system after graduation and to implement a fully functional online version; Jonathan was not interested in pursuing the project after graduation.

After several discussions, we reached an agreement which should satisfy the needs of both of us. A copy of the contract may be available upon request.

Overview of the Development Process

User Needs Analysis

Process: We interviewed two educators (an elementary school teacher and a high school/middle school social studies teacher) and one student (a university undergraduate student). The interviews covered: computer skills/access at different grade levels; the nature and emphasis of history education at different grade levels; the amount of independent research required of history students.

Conclusions: The Needs Analysis process indicated that most students at most grade levels would have the computer access and skills necessary to use HumanSaga. However, it also suggested that, for several reasons, HumanSaga will not meet the primary needs of history students and educators. It instead refocused our attention on the use of HumanSaga as an auxiliary tool for students studying other domains. Based on our needs analysis findings, we developed three scenarios to represent typical user search tasks and processes.

Detailed Needs Analysis Summary and Conclusions (MS Word format)
Scenarios (PDF)

Low Fidelity Prototype

We created a paper ("low fidelity") prototype of our interface, with 17 different screen displays, including the front page, advanced search, multiple timelines (search results), saved timeline page, pop-up windows used in the selection of thesaurus terms, pop-up error messages, and pull-down menus.

Low-Fi Prototype (photos)
Low-Fi Prototype (sample page)

User Testing

Process: We tested our paper prototype with four sample users using the "think aloud" method. Each tester completed three tasks (modified from our scenarios), a short written post-test questionnaire, and a brief, open-ended post-test interview.

User Testing Script (PDF)
Pre-Test Questions (PDF)
Post-Test Questionnaire (PDF)

Results & Conclusions: We made a number of interesting observations while observing participants' attempts to complete the scenarios:

We also compiled the results from the post-test questionnaire. The results were as follows:

Question                                                                      mean ± SD (disagree=1; agree=7)
(1a) I found the information that I was looking for. (scenario 1)             7.0 ± 0.0
(1b) I found the information that I was looking for. (scenario 2)             6.5 ± 0.9
(1c) I found the information that I was looking for. (scenario 3)             6.5 ± 0.5
(1) I found the information that I was looking for. (all 3 scenarios)         6.67 ± 0.7
(2) I learned new things.                                                     6.5 ± 0.5
(3) It was immediately obvious how to use the system.                         5.25 ± 1.9
(4) Over time, I learned more about how to use the system.                    6.25 ± 0.8
(5) I had trouble figuring out how to achieve my search goals.                4.33 ± 1.7
(6) I was sometimes confused about why the system did what it did.            3.5 ± 1.5
(7) The timeline display was clear and understandable.                        6.25 ± 0.83
(8) I felt frustrated by the system.                                          3.25 ± 1.79
(9) The system was enjoyable to use.                                          5.75 ± 0.43
(10) If it were available on the web, I would use a system like HumanSaga.    6.5 ± 0.5

We were encouraged by participant responses regarding the informational content (Q 1 & 2) and the timeline display (Q7). Interestingly, though, even participants who failed to locate the correct information expressed a high degree of (false) confidence in their results!

Questions 3 and 4 were designed to get information about the learning curve; the results were inconclusive (and a bit puzzling). One participant ranked Q4 much higher than Q3 (Q3 = 2, Q4 = 7), indicating a slow learning curve; another was the opposite (Q3 = 7, Q4 = 5), indicating a quick learning curve which leveled off; the other two ranked them the same, indicating a steady level of learning.

The most troubling responses were to Q5, 6, and 8; users had difficulty achieving their goals, got conflicting feedback from the system, and (not surprisingly) experienced frustration as a result.

We believe that many of these problems can be overcome by improving labelling and providing more helpful explanations of the system's features. In addition, we plan some changes to the navigational and architectural elements of the interface to better match users' task needs and mental models.

First Interactive Prototype

In response to user testing, we made the following design changes to the user interface:

Not all functionality was fully implemented in this prototype:

Heuristic Evaluation

Process: We recruited five heuristic evaluators to review our prototype. All five were SIMS masters students; four had training and experience in heuristic evaluation. We provided the evaluators with online instructions and description of the system. In addition, we provided a list of Jakob Nielsen's 10 usability heuristics and references for more information.

View instructions for evaluators

After the evaluation was completed, we compiled all of the comments and evaluations, combined problems noted by multiple evaluators, and grouped them by different sections of the interface.

Conclusions: The five evaluators identified 34 heuristic violations or concerns; seven were rated 3 ("major usability problem") and one was rated 4 ("usability catastrophe"). All of the problems rated 3 or 4 have been addressed in the most recent prototype, along with most of the lower-rated concerns.

Only five problems were noted by more than one evaluator; two of those were noted by three evaluators.

Interestingly, several of the comments proposed alterations that would be mutually exclusive. (On the search results page, for example, one evaluator said to move the search criteria to the upper left corner, while another said to move them down nearer the top of the results.)

View compiled results

Second Interactive Prototype

This prototype includes real functionality which was only mocked-up in the first prototype:

In addition, this prototype incorporates changes in response to the Heuristic Evaluation results.

View second interactive prototype

Database (Functionality)

The database contains eighteen tables, eleven of which provide for system functionality while the remaining seven tables serve administrative purposes such as tracking changes, crediting sources, managing projects, etc. The system tables store the events and a set of highly structured metadata about each event. Additionally, these tables include a categorization hierarchy to which events are assigned. Metadata categories for notable people or things have additional descriptive information which is displayed to the user browsing that metadata category. For example, a person's birth date, birth place, nationality, etc. are included. Finally, one table links events in this database to the corresponding location where they took place in a separate database of geographical information. This separate database, the Thesaurus of Geographic Names, has been licensed from the Getty Institute.

View entity-relationship diagram

Design of Categorization Structure

We searched for existing topical categorization structures tailored to the subject domain of history; by comparing the strengths and weaknesses of different approaches, we hoped to be able to recommend one approach as most appropriate for the categorization of HumanSaga events. Based on our findings, we proposed a faceted categorization structure, with separate facets for geography, chronology, and subject. The division of subject categories was the trickiest. Based on similar categorization structures, we attempted to devise a high-level division of categories that was truly independent of time and place, and which represents topics common to all ages and cultures. At lower levels, some categories may be more closely related to times or places. Both Excite and Yahoo! provide fairly good examples.

The proposed high-level category divisions are shown on the home page of the current prototype, although most of them currently have no events classified under them.

"Categorization of Historical Events: comparison of existing techniques and proposal for a redesigned structure", prepared for IS 245. (PDF, 29 kB)

Object-Oriented Programming

The single most important lesson learned on the development side of this project was the usefulness of object-oriented programming (OOP). More than 75% of the time spent coding the scripts which link the database to the Web was spent using non-OOP techniques. This development was often tedious and slow because of all the planning and coordination required to work with the system components using complex sets of variables. Once OOP techniques were adopted, development productivity accelerated by at least a factor of ten. Although OOP was not used until three-fourths of the way through the project, much more than half of the system functionality is supported by object-oriented code; most of the older code was rewritten using OOP classes. It was far easier to conceptualize the implementation of system specifications and to carry them out using OOP because it encapsulates the details of how methods work, hiding them from the code which must interact with them. Modifying individual components was less involved, because the well-defined interface of method calls between components ensured that changes effected in one component did not interfere with other components. New functionality in the form of new methods was similarly easy to implement. The OOP process proved to be far less mentally taxing and far more fruitful for a given task than less sophisticated programming techniques.

System Interface and Functionality

Exploration: We designed the HumanSaga interface to facilitate exploration, using an open-ended model of information retrieval (inspired by Bates' "berry-picking" metaphor), rather than a closed-ended, goal-oriented model. Our basic idea was that each timeline display should suggest directions for further exploration. The display should provide multiple access points, and the user should always be able to move forward from the current display, rather than returning to a home page.

Example: (Try it yourself!) A user is looking for information about John Steinbeck. From the front page browse section, she selects "Authors" and then "John Steinbeck". She notices that Steinbeck won the Nobel Literature Prize in 1962; she clicks the link to get a timeline of the Nobel Prize. In this display, she sees that George Bernard Shaw also won the Nobel Prize (in 1925), and clicks his name to get an overview of his life. The blurb about Shaw notes that he was born in Dublin, Ireland; she clicks on Dublin to get its timeline. Noticing that several famous authors were born in Dublin around the same time, she goes to the search box at the top of the screen and looks for a timeline of all events in Europe between 1850 and 1915.

Integration of Searching and Browsing: One specific design goal to facilitate exploration was the seamless integration of searching and browsing. Either technique may be valuable at different stages of the exploration process, and we wanted users to be able to easily alternate between access methods without necessarily being aware of the difference.

The two displays are graphically very similar; both include "Previous Timelines" and a new search box on the top. While the search results page repeats the search criteria that resulted in the timeline, the browse results page lists the name of the metadata category retrieved and a brief abstract about the item. (For people, for example, it includes birth and death dates and nationality.) Currently, the search results page also includes "Related Timelines"; in the future, that element will also appear on browse pages, further unifying their designs.

Related Timelines: One of the key features in our interface is the portion titled "Browse Related Timelines". It was clear to us that a crucial way to facilitate exploratory searching was by dynamically suggesting timelines related to specific search results.

This can have two types of benefit. Since the search algorithm includes free-text searching of the event descriptions, the resulting timeline may not perfectly target the desired information (either omitting events or including extraneous ones); "Browse Related Timelines" should allow users to more accurately target their desired results. "Browse Related Timelines" will also suggest jumping-off points for further exploration, perhaps identifying topics which are slightly broader or slightly narrower or otherwise similar to the retrieved timeline.

Example: (Try it yourself!) Searching for the term "space" retrieves mostly events from the space race, although it also includes a few unrelated events that include the word "space" (such as Arthur C. Clarke's book Prelude to Space) and omits a few relevant events that do not (such as John Glenn's orbiting the Earth in 1962). The first entry under "Browse Related Timelines", however, is "U.S. Space Exploration", which should more accurately target the user's desired information.

Example: (Try it yourself!) A user does a search for "Rosa Parks". The system retrieves two events, but suggests that the user might also be interested in timelines of the Civil Rights Movement or Martin Luther King, Jr. (as well as their parent categories, "Political Movements" and "Other People").

Related timelines are generated completely automatically at search time. In the database, each event has been classified into one or more metadata categories based on the people, places, and subjects involved. When a set of events is retrieved, the system identifies the three most common metadata categories among the resulting events, as well as the parent of the most common category. The algorithm also makes allowances for cases when fewer than three categories are represented. The resulting metadata categories are then presented to the user under "Browse Related Timelines".
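The selection step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `related_timelines`, the representation of each event as a list of category names, and the `parent_of` lookup are hypothetical, and the sample categories are borrowed from the Rosa Parks example. The sketch follows the description literally, adding only the parent of the single most common category.

```python
from collections import Counter


def related_timelines(result_events, parent_of, limit=3):
    """Suggest up to `limit` most common categories plus the top one's parent.

    result_events: one list of category names per retrieved event (assumed shape).
    parent_of: mapping from a category to its preferred parent (assumed shape).
    """
    counts = Counter()
    for categories in result_events:
        # Count each category at most once per event, keeping first-seen order.
        for category in dict.fromkeys(categories):
            counts[category] += 1

    # most_common copes naturally with fewer than `limit` categories.
    suggestions = [category for category, _ in counts.most_common(limit)]

    if suggestions:
        parent = parent_of.get(suggestions[0])
        if parent is not None and parent not in suggestions:
            suggestions.append(parent)
    return suggestions


# Illustrative data loosely based on the "Rosa Parks" example.
parent_of = {
    "Civil Rights Movement": "Political Movements",
    "Martin Luther King, Jr.": "Other People",
}
results = [
    ["Civil Rights Movement", "Martin Luther King, Jr."],
    ["Civil Rights Movement"],
]
suggested = related_timelines(results, parent_of)
```

With this sample data the most common category is "Civil Rights Movement", so its parent "Political Movements" is appended after the two categories actually present.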

Inline links: An additional access point for further exploration is through the use of inline links. When an event description refers to a specific person, place, or thing represented as a metadata category in the database, that reference will be a link that retrieves all events and supplementary information about that item.
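One way such links could be produced is sketched below in Python. The source does not say whether links are marked up by hand or generated automatically, so this is purely an assumption for illustration: the `add_inline_links` function, the `category_ids` mapping, and the `browse?category=` URL pattern are all hypothetical.

```python
import re
from html import escape


def add_inline_links(description, category_ids):
    """Wrap each known category name found in the text in a browse link.

    category_ids maps a category name to a hypothetical numeric identifier.
    """
    if not category_ids:
        return escape(description)

    # Try longer names first so "George Bernard Shaw" wins over a shorter overlap.
    names = sorted(category_ids, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))

    pieces, position = [], 0
    for match in pattern.finditer(description):
        name = match.group(0)
        pieces.append(escape(description[position:match.start()]))
        pieces.append(
            f'<a href="browse?category={category_ids[name]}">{escape(name)}</a>'
        )
        position = match.end()
    pieces.append(escape(description[position:]))
    return "".join(pieces)


linked = add_inline_links(
    "George Bernard Shaw was born in Dublin.",
    {"George Bernard Shaw": 12, "Dublin": 7},
)
```

Scanning the description once with a single combined pattern avoids re-linking text that is already inside a generated anchor.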

Hierarchy bar: When browsing a specific metadata category, the display includes both a brief description of the item and a listing of its location in the hierarchy structure. William Shakespeare, for example, appears in "Literature > Authors > William Shakespeare". Although a single category may have multiple parents, the hierarchy listing is created using only the preferred parent. (Shakespeare, for example, could also be found under "People > Authors > William Shakespeare".)

Because each item in this hierarchy display is a hyperlink, it can be used as an additional exploration access point, allowing users to create new timelines by traversing the hierarchy itself.
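A hierarchy listing of this kind can be derived by walking from a category up through its preferred parents. The Python sketch below assumes a simple `preferred_parent` mapping; that structure and the `hierarchy_path` function are hypothetical stand-ins for whatever the database actually stores.

```python
def hierarchy_path(category, preferred_parent):
    """Return the chain of categories from the top level down to `category`."""
    path = [category]
    seen = {category}
    while path[-1] in preferred_parent:
        parent = preferred_parent[path[-1]]
        if parent in seen:
            break  # guard against accidental cycles in the data
        path.append(parent)
        seen.add(parent)
    return list(reversed(path))


# Only the preferred parent is recorded here; alternate parents
# (such as "People > Authors") are deliberately ignored.
preferred_parent = {
    "William Shakespeare": "Authors",
    "Authors": "Literature",
}
bar = " > ".join(hierarchy_path("William Shakespeare", preferred_parent))
```

Each element of the returned list would be rendered as its own hyperlink, which is what makes the bar usable as an exploration access point.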

The HumanSaga metadata categorization structure includes two separate hierarchies -- one for people and things and one for places (a faceted classification design). The place hierarchy is part of the Thesaurus of Geographic Names and includes an enormous number of places, many with detailed descriptive information.

Previous timelines: Another key navigational feature is the accessibility of the user's search history. The low-fi prototype included the search history on a separate page (labelled "History") (View sample low-fi page). Because it was not visible from the timeline pages, this feature was seldom used in low-fi testing. The label was also misleading; one participant asked, "'History'? Isn't it all history?"

The current prototype displays the six most recent timelines on every page. Both search results and browse results are listed. Browse results are accessible via a simple hyperlink; because of technical limitations, search results may be retrieved only by clicking a check-box and hitting the "Submit" button. Zero-results searches are not saved. Tests have indicated that the easy accessibility of the previous timelines greatly improves the ease of use, and is often used to retrieve and merge already-seen timelines. In the future, we plan to list the eight most recent timelines (rather than six) and have older timelines accessible from a separate page.

In the creation of the "Previous Timelines" display, short names are automatically assigned to each previous timeline. Names are limited to 14 characters, and are derived from the category name for browse results or search criteria (what, when, or where, in order) for search results. Different timelines which would be given the same short name are differentiated with numerical suffixes. Anecdotal evidence suggests that these short names are sufficiently descriptive for users to recognize their previous timelines.
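The naming and history rules described above can be sketched as follows in Python. Several details are assumptions made for illustration: the suffix format (a bare number counted inside the 14-character limit), the dictionary shape of a timeline, and the function names `short_name`, `base_label`, `remember`, and `visible`.

```python
MAX_NAME = 14   # character limit stated in the text
MAX_SHOWN = 6   # number of previous timelines currently displayed


def base_label(timeline):
    """Category name for browse results, else the first of what/when/where."""
    if timeline.get("category"):
        return timeline["category"]
    for key in ("what", "when", "where"):
        if timeline.get(key):
            return timeline[key]
    return "timeline"


def short_name(label, taken):
    """Truncate to MAX_NAME, adding a numeric suffix if the name is taken."""
    name = label[:MAX_NAME]
    number = 2
    while name in taken:
        suffix = str(number)
        name = label[:MAX_NAME - len(suffix)] + suffix
        number += 1
    return name


def remember(history, timeline, result_count):
    """Return a new history list; zero-result searches are not saved."""
    if result_count == 0:
        return history
    taken = {name for name, _ in history}
    return history + [(short_name(base_label(timeline), taken), timeline)]


def visible(history):
    """The most recent timelines, newest first."""
    return history[-MAX_SHOWN:][::-1]


history = []
history = remember(history, {"category": "Civil Rights Movement"}, 5)
history = remember(history, {"what": "space"}, 12)
history = remember(history, {"what": "space", "when": "1960-1970"}, 4)
history = remember(history, {"what": "no such thing"}, 0)
names = [name for name, _ in visible(history)]
```

Raising `MAX_SHOWN` to eight would cover the planned change, with the untruncated `history` list available for a separate page of older timelines.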

Merged timelines: A new feature supported in HumanSaga with no parallel (to our knowledge) in existing timelines is the ability to merge separate timelines. A merged timeline can easily be created by selecting two or more previous timelines and clicking "View Results". (Currently, however, only two timelines can be merged.) The resulting display intermingles the individual timelines, marking events with distinct bullets (with different shapes, colors, and numbers).

This information-rich display facilitates analysis of historical trends at a glance. One can easily see which events in one timeline preceded events in another. One can also see which events are listed in more than one of the merged timelines.
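A merge of this kind might be sketched in Python as below. The representation of an event as a `(year, description)` pair, the use of small integers as bullet markers, and the `merge_timelines` function are assumptions made for illustration, and the sample timelines are invented.

```python
def merge_timelines(timelines):
    """Interleave several timelines chronologically.

    timelines: mapping of timeline name -> list of (year, description) pairs.
    Returns (year, description, markers) rows, where markers lists the
    1-based numbers of the timelines that contain the event.
    """
    merged = {}
    for marker, events in enumerate(timelines.values(), start=1):
        for event in events:
            markers = merged.setdefault(event, [])
            if marker not in markers:
                markers.append(marker)
    rows = [(year, text, markers) for (year, text), markers in merged.items()]
    return sorted(rows, key=lambda row: (row[0], row[1]))


# Illustrative data: a keyword search merged with a browsed category.
timelines = {
    "moon": [
        (1901, "The First Men in the Moon"),
        (1969, "Apollo 11 mission"),
    ],
    "U.S. Space Exploration": [
        (1969, "Apollo 11 mission"),
    ],
}
merged_rows = merge_timelines(timelines)
```

An event carrying more than one marker is one that appears in several of the merged timelines, which corresponds to the multiple bullets shown in the display.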

Example: A timeline of science fiction literature (from browsing the "Literature" categories) is merged with a timeline of U.S. Space Exploration (under "Exploration & Discovery"). The results show a series of science fiction books about travel to the moon (including From the Earth to the Moon, 1865; The First Men in the Moon, 1901), which abruptly end in 1969 after the Apollo 11 mission. Later science fiction focuses on topics such as robotics (The Bicentennial Man), extraterrestrial biological contamination (The Andromeda Strain), and genetic engineering (Jurassic Park).

Graphic design: In our low-fi prototype, the "Related Timelines" feature was titled "More About..." (View sample low-fi page). Based on user testing, we renamed the feature and altered its location on the page. We designed the layout of the timeline page so that the new search form ("New Timeline") and "Browse Related Timelines" echo the Search vs. Browse layout of the front page, just as these two functions echo the search and browse functionality.

As discussed, we also moved the "Previous Timelines" onto both the search and browse results pages. Both this feature and a fresh search box are available from every page, including "Help", "About HumanSaga", and intermediate category browsing pages.

Both our first round of user testing and our heuristic evaluation indicated that some users had trouble getting oriented to the site and needed more context -- information about the contents of the database and what to expect when searching. A redesign of the front page attempted to address some of these concerns. We added a slogan under the main title on the front page ("Plotting the course of history") to emphasize the historical content of the database and suggest a map or chart-like metaphor for timelines. We also added a graphic on the left with a short descriptive blurb ("Explore history with dynamic, personalized timelines") and a graphic representing a timeline (a progression of years), including historical-looking images of a variety of people who might be represented in the database. (The images are of Julius Caesar, William Shakespeare, Frederick Douglass, Charles Darwin, Mark Twain, Edith Wharton, Lucille Ball, and Nelson Mandela.) This graphic itself is an access point, since an individual image can be clicked to retrieve a timeline of the person's life. (Currently only Shakespeare and Twain are functional.)

Future Development

TGN term selection for searching: Currently, users can only limit their searches geographically using the five continents and a list of major countries. The licensed TGN from the Getty, however, can facilitate searching by county, city, neighborhood, etc. In the future, it would be valuable for users to be able to search or browse this thesaurus of places to find more specific locations to use in searches. A prototype interface was designed for the low-fi prototype, but not yet implemented online. Term-selection would be implemented with a popup window displaying places matching the user's search term so they can select the desired thesaurus term. Once they select the desired term, the popup would close and the original page would be reloaded with the new term appearing in the "where" field of their search.

More detail - Less detail: Although the database holds the detail, or significance, of each event (on a scale from very important to mere minutiae), users currently have only limited access to this metadata. We plan to add a "more detail" and "less detail" link on the right side of the search results title bar. The less detail link would permit users to decrease the number of events in a timeline by filtering out the less significant events while more detail would back off on this filtering to display more events. These two links will move up and down a five level spectrum of significance.
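The planned filter could work roughly as in the Python sketch below. The numbering of the five levels (1 for the most important events, 5 for minutiae), the `significance` field name, and the function names are assumptions for illustration only; the feature is not yet implemented.

```python
LEVELS = 5  # five-level spectrum; 1 = very important, 5 = minutiae (assumed)


def filter_by_detail(events, detail_level):
    """Keep events at least as significant as the current detail level."""
    return [e for e in events if e["significance"] <= detail_level]


def more_detail(level):
    """Relax the filter by one step, stopping at the most detailed level."""
    return min(level + 1, LEVELS)


def less_detail(level):
    """Tighten the filter by one step, stopping at the least detailed level."""
    return max(level - 1, 1)


# Illustrative events with invented significance ratings.
events = [
    {"title": "Apollo 11 mission", "significance": 1},
    {"title": "Prelude to Space published", "significance": 3},
    {"title": "Minor launch delay", "significance": 5},
]

level = 3
default_view = filter_by_detail(events, level)
sparse_view = filter_by_detail(events, less_detail(less_detail(level)))
full_view = filter_by_detail(events, more_detail(more_detail(more_detail(level))))
```

Clamping the level at both ends means the two links can simply be disabled, or left harmless, once the user reaches either extreme.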

Other future development plans:

Thanks to everyone who has helped with our project:
Kristin Bliss, Julie Brown, Carlos Carvallo, Joe Chott, Rachna Dhamija, Dale Dougherty, Jennifer English, Ana Paula Gouvea Costa, Marti Hearst, Natasha Jacob, Harry Jacobs, Kathryn Kada, Jamie Laflen, Ray Larson, Kim Norlen, Chaitee Sengupta, Masako Sho, Kirsten Swearingen, Amity Zeh