Rachel M. Heck and Sarah M. Luebke
Department of Mathematics and Computer Science
Grinnell College
Grinnell, Iowa 50112
heckr@ac.grin.edu and
luebke@grinnell.edu
Every day, educators and students try to find new ways to enhance the learning process. In recent years, the World-Wide Web has become a major focus for learning enhancement. Unfortunately, the educational potential of the World-Wide Web is not being fully used. Although the Web is noted for being "interactive," the interaction is very basic in most cases. There is no way for a viewer looking at an arbitrary Web page to contribute to the information contained there, whether for their personal use later or for the use of others who view that same page.
The ability to take notes, or annotations, on any page on the Web could be very useful to the educational process. Reading a single sentence on a Web page, which may or may not have been brought up in class, could spawn an intellectual discussion and/or raise valuable questions all starting in an electronic environment.
In July of 1945, Vannevar Bush published a paper called "As We May Think" in The Atlantic Monthly [Bush 1945]. His paper described an interactive information sharing device that, at the time, was little more than a dream. In the last couple of decades, this dream has come to be realized in the form of the World-Wide Web. Unfortunately, the exchange of ideas on the Web is primarily one-way. The author is given many tools for displaying whatever information he or she may wish, while the viewer is only permitted to interact by reading the page, clicking on links, or making bookmarks. "Man profits by his inheritance of acquired knowledge," states Bush. A two-way system in which the viewer could take notes on a Web page, for later use or for the use of fellow readers, would allow for a greater exchange of ideas. As Ka-Ping Yee, designer of the CritLink Mediator, states:
An inaccuracy in a document that could have been corrected once by a reader must be noticed and re-corrected by each new reader; related supporting material to a document suggested by one reader must be re-found by each new reader; and so on, leading to incomplete information and wasted time. [Yee 1998]
A two-way information exchange would help solve this problem with the current structure of the Web. This type of exchange also leads to many benefits in education as it allows discussion to take place outside the classroom in addition to that which is conducted inside.
The solution to our problem is the instantiation of an annotation tool that can be used to make private, public, or shared annotations, or notes, on already existing Web pages. Annotation systems are meant to give electronic documents some of the same note-taking possibilities as paper documents. There are varying beliefs of what note-taking capabilities should be offered in an annotation system. Readers often take notes in the margin of a page or on a separate page. Another common form of annotation is underlining, highlighting, circling text or writing between lines. A different type of annotation is one meant to be shared with other readers, such as discussion-like comments or questions [Ovsiannikov 1998].
There have been many attempts to build systems whereby people could easily mark up documents on the World-Wide Web, but "there is nowadays no widespread annotation service" [Zohar 1999]. During the summer of 1999, we began development on an annotation system that is appropriate for use by educators and students as a supplement to traditional classroom learning. This paper describes HyperPass, our own system, including both interface and implementation decisions, user testing results, goals for the future, and concerns that we encountered during development.
We are not the only people to attempt an annotation system [Heck and Luebke 1999]. One can learn a lot from looking at various examples of annotation systems. For instance, we came across several meanings of "annotation" and different styles of implementing such a program. One issue that seemed important was placement of annotations. Many programs allow annotations at one place per page. This does not seem sufficient for our purposes. Furthermore, we wanted to avoid overwhelming the original document, which often happens with annotation systems that insert the annotation text directly in the original text. In this section of the paper we describe what our system looks like and how a client interacts with it.

When readers access pages on the World-Wide Web (while running our plugin), they will see "Add," "Search," and "Help" buttons along the top of the document (See Figure 1). If the reader clicks on the "Add" button, a pop-up window will appear to add a new annotation (See Figure 2). In addition, if the reader happens to click on "Add" while the Add window is already open, there will still be only one pop-up window. The window will simply pop to the front and load the "Add an Annotation" page (See Section 3.5). Since the program already has the reader's name and email address, the only information the client needs to supply is:
For the permissions, annotations can be public, private or semi-private. If the client chooses semi-private, he/she must also choose the group(s) that have access to the annotation. There is also an option to include HTML in the text of the annotation. For instance, the author may wish to add links or list elements. They can do this by checking the "Treat text as HTML" box.
If there are annotations on the requested page, small arrows will surround the text that was annotated (See Figure 1). The arrow will appear yellow if the annotation is old (i.e., added prior to the reader's last viewing of that page), red if the annotation is new since the reader's last visit to that page, or green if the annotation is old, but there is a new reply. If the reader clicks on the arrow, a new window appears displaying the annotation (See Section 3.5). The Annotation window has a top and bottom frame. In the top section, the original annotation is displayed with buttons to reply to the annotation or close the window (See Figure 3). If the reader chooses to reply to the annotation, the window will display a modified "Add Annotation" page with text fields for the title and text of the reply. When the client has submitted the reply, the window displays this new annotation.
In some cases, there is a third button displayed in the top frame of the Annotation window. This is the delete button. Only the author of the annotation and the system administrator have permission to delete an annotation. This is the only time this button appears. In addition, when the delete button is pressed, if there are any replies to the annotation, the system will not let the client delete it until all replies have been deleted.
In the bottom of the Annotation window is a collapsible reply tree (See Figure 3). An arrow pointing to the right means there are replies to that annotation, and an arrow pointing down means the replies are already shown. To view a reply, the reader selects the title of that reply in the tree. The text of the reply will be displayed in the top frame, replacing the original annotation.
Something that we found lacking in many annotation systems was a search facility. As an annotation system is used, documents will collect more and more annotations. A reader who is looking for particular information might not want to wade through the messages but rather search for a phrase. When a reader presses the "Search" button in our annotation system, a Search window appears with a space to enter text (See Section 3.5). They are also asked to designate the search as keyword, exact string, or author. The keyword search will find all annotations with any of the words in the string, while the exact string search only brings up those matches that contain the entire phrase. The author search brings up only annotations written by a particular author.
HyperPass has been designed as a plugin to Ravel, a proxy server with the ability to keep track of account information and security [Kensler 1999]. The first thing Ravel does is ask the client to log in with a username and password. This allows us to maintain some control over the usage of our system. Moreover, Ravel can keep certain user information, such as preferences, real name or email address. Ravel then intercepts requests for pages and runs our plugin, along with any other desired Ravel Plugins, before returning the page to the browser.
HyperPass is based almost solely on the manipulation of strings. Since the system is also meant to communicate with the Web, HyperPass is almost entirely written in Perl 5. Javascript has been used for some actions, such as opening secondary windows, but the bulk of the system is Perl. The following section is meant to give a brief overview of our implementation. It is not intended to be exhaustive in any way, but it should give the reader a peek into our overall structure.
Note: HyperPass is a nonprofit product and, when complete, the source code will be open for nonprofit use.
Each time a page is accessed, our main Plugin script is run. This script adds the "Add," "Search," and "Help" buttons to the top of the document along with the Javascript code needed to open the Annotations, Search, and Help windows. The script also gets any annotations for the page from the annotation server (See Section 3.2.1) and adds the colored arrows to designate an annotation. This is done by searching for the "context" of the annotated text in the document. The context is a few hundred characters surrounding the annotated text. Once the string is found, the arrows are inserted on either side of the annotated text, making sure to preserve original links. There is a potential problem if the original page is changed, since the context may not be matched, leaving the annotation homeless. Our goal is to incorporate an approximate string matching algorithm that will allow for minor changes. Currently, if the context of the annotated text is not found, the inserting script searches for the annotated text itself keeping in mind which instance of that text is the desired one. If the document changes significantly, the annotations will be kept in a separate place, and may still be accessible to those that care (See Section 5.3).
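The lookup described above can be sketched as follows. This is a minimal illustration in Javascript rather than the Perl that HyperPass actually uses, and it makes several assumptions that are not in our implementation notes: the record fields `before`, `text`, `after`, and `instance` are hypothetical names, the context is stored as separate left and right pieces, and the page is treated as a plain string (the real plugin must also take care to preserve links).

```javascript
// Hypothetical record shape: { before, text, after, instance }.
// `before`/`after` hold the stored context on either side of the
// annotated text; `instance` is the 1-based occurrence number that was
// recorded when the annotation was made.
function locateAnnotation(html, annotation) {
  const { before, text, after, instance } = annotation;

  // First choice: match the annotated text together with its context.
  const contextStart = html.indexOf(before + text + after);
  if (contextStart !== -1) {
    return contextStart + before.length;
  }

  // Fallback: the context no longer matches, so search for the
  // annotated text alone and take the recorded occurrence.
  let position = -1;
  for (let seen = 0; seen < instance; seen++) {
    position = html.indexOf(text, position + 1);
    if (position === -1) {
      return -1; // not found: the annotation is "homeless"
    }
  }
  return position;
}

// Wrap the located text in a pair of arrow markers (any two strings).
function insertArrows(html, annotation, openArrow, closeArrow) {
  const start = locateAnnotation(html, annotation);
  if (start === -1) {
    return html; // leave the page untouched
  }
  const end = start + annotation.text.length;
  return html.slice(0, start) + openArrow + html.slice(start, end) +
    closeArrow + html.slice(end);
}
```

Returning -1 for an unmatched annotation leaves the caller free to set it aside instead of guessing at a position.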
Before the arrows are added, the permissions on the annotation are checked against the user information. The arrows are added if the annotation is public, or the reader is the author or in one of the groups that has access to the annotation.
The time the annotation was made is also checked against the last time the reader accessed that page. If the annotation is old, (i.e., made prior to the last time the reader accessed that Web page), then a yellow arrow is inserted. If the annotation was made since the reader visited the page, it is new and the arrow is red. In the case of an old annotation with a new reply, the arrow is green.
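The two checks above, whether to show an arrow at all and which color to give it, might look like the sketch below. It is written in Javascript for illustration only; the field names (`permission`, `author`, `groups`, `created`, `replyTimes`) and the use of plain numbers for timestamps are assumptions, not the actual HyperPass data layout.

```javascript
// Decide whether the reader may see an annotation at all.
// `reader` is assumed to look like { username, groups: [...] }.
function canView(annotation, reader) {
  if (annotation.permission === "public") return true;
  if (annotation.author === reader.username) return true;
  // Semi-private: visible to members of any group listed on the annotation.
  const allowedGroups = annotation.groups || [];
  return allowedGroups.some((group) => reader.groups.includes(group));
}

// Choose the arrow color from the reader's last visit to the page.
// All times are numbers (for example, seconds since some epoch).
function arrowColor(annotation, lastVisit) {
  if (annotation.created > lastVisit) return "red"; // new annotation
  const hasNewReply = (annotation.replyTimes || []).some((t) => t > lastVisit);
  return hasNewReply ? "green" : "yellow"; // old with new reply, or just old
}
```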
The interface for adding an annotation makes it seem quite simple but the actual implementation involves many checks before any information can be gathered from the add annotation form. When the add button is pressed, a script is run to load an add page that is correct for a new annotation. This page knows who the user is and can get all of the appropriate user information as well as other pertinent information, such as the page being annotated.
Once the client has entered all of the necessary annotation information and clicks on the submit button, several things are checked. First, a script looks through the HTML source code to see how many times the annotated text appears in the HTML. If it does not appear, the script does not allow the annotation to be made. On the other hand, if the annotated text appears more than once (i.e., it is ambiguous), another script runs. This script removes all of the links in the HTML source and then puts links in the form of superscript numbers next to each instance of the annotated text. It also puts a short line at the top of the page that asks the client to pick the instance of the annotated text that they want to annotate. When the client picks an instance, another script is run that stores all of the necessary information for the annotation, including up to 100 characters on either side of the annotated text as the context. If the annotated text was not ambiguous, the system simply stores all the information for the annotation, including the context, right at the beginning.
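As a rough sketch of the submit-time checks, the Javascript below counts the occurrences of the annotated text and, when there is exactly one, captures up to 100 characters of context on each side. The function names and the `status` values are hypothetical, the page is treated as a plain string, and the step that rewrites the page with numbered links for the ambiguous case is left out.

```javascript
// Return the start position of every non-overlapping occurrence of `text`.
function findOccurrences(html, text) {
  const positions = [];
  if (text.length === 0) return positions;
  let at = html.indexOf(text);
  while (at !== -1) {
    positions.push(at);
    at = html.indexOf(text, at + text.length);
  }
  return positions;
}

// Capture up to `width` characters on either side of the annotated text.
function extractContext(html, start, length, width = 100) {
  return {
    before: html.slice(Math.max(0, start - width), start),
    after: html.slice(start + length, start + length + width),
  };
}

// Decide what to do with a submitted annotation.
function prepareAnnotation(html, text) {
  const positions = findOccurrences(html, text);
  if (positions.length === 0) {
    return { status: "rejected" }; // text is not on the page
  }
  if (positions.length > 1) {
    // The client must pick one of the numbered instances first.
    return { status: "ambiguous", count: positions.length };
  }
  const context = extractContext(html, positions[0], text.length);
  return { status: "stored", instance: 1, ...context };
}
```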
The annotation server is in charge of all of the annotation files. This system is networked to the main server using remote procedure calls. We are using a Perl module created to implement this [Kensler 1999]. The calls are made in the same way that a normal subroutine call would be made, but the request is sent over the network.
Annotation files for any one page are all stored in the same directory. This directory is created when the first annotation is made on a Web page. This directory name is stored in a file called index.txt. Each line of this file consists of a URL followed by a vertical bar and then the relative name of the directory. In this way, the directory for each page that has been annotated is stored for later retrieval.
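For illustration, reading an index in this "URL, vertical bar, directory" format could be sketched as below. This is Javascript rather than our Perl, and the sample URLs and directory names used with it are made up. Splitting on the last vertical bar is an assumption made here so that a URL containing a bar would still parse.

```javascript
// Parse the contents of index.txt into a Map from URL to directory name.
function parseIndex(contents) {
  const directories = new Map();
  for (const line of contents.split("\n")) {
    const bar = line.lastIndexOf("|");
    if (bar === -1) continue; // skip blank or malformed lines
    directories.set(line.slice(0, bar), line.slice(bar + 1).trim());
  }
  return directories;
}

// Example with hypothetical entries:
// parseIndex("http://a.example/x.html|dir1\nhttp://b.example/|dir2\n")
//   .get("http://b.example/") yields "dir2".
```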
Each basic, or non-reply, annotation on a page is stored in a file named #.txt, where # is simply a number that would make the file unique to the directory. We store the following information for each annotation made:
When a client views an annotation, several things happen. Javascript is used to open the Annotations window. If this window is already open, another window is not opened. This is a feature that we built into the system so as to cut down on the number of windows that can be open at one time. We did not want the client to get lost in the many windows that could potentially pop up (See Section 3.5).
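One way to get this single-window behavior, sketched here under assumptions, is to give each HyperPass window a fixed name. The standard `window.open` call loads the new URL into an existing window of the same name instead of opening another one. The function name, the URL in the usage note, and the feature string are hypothetical, and the window object is passed in as `win` only so that the sketch can be exercised outside a browser.

```javascript
// `win` is the browser's window object.
function showInNamedWindow(win, url, name) {
  // open() with a target name reuses the window of that name if it
  // already exists; otherwise a new window is created.
  const popup = win.open(url, name, "width=500,height=400");
  if (popup && typeof popup.focus === "function") {
    popup.focus(); // bring the window in front of the others
  }
  return popup;
}
```

In a page this might be called as `showInNamedWindow(window, "view?id=1", "Annotations")`, with a second call for a different annotation landing in the same window.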
After the window is open, a script creates a frame set for that window. There are two frames, each of which load another script. The top frame opens up the annotation file through the annotation server, gets the data from that file, and formats and displays that data. The top frame also knows who the person viewing the annotation is and other pertinent information in order to take care of replying and deleting in the proper manner.
The bottom frame loads the reply tree.
The reply tree is generated using a Javascript script originally created for a previously implemented annotation system [Luebke et al. 1998]. This script takes a line like the following:
MyTree = new tree({id:"tree-name",items:"['Item1','Item2',['Item2Child','AnotherChild'],'Item3']"});
This line is then turned into a collapsible tree using HTML's layers. The tree from above would appear in the following form:
Item1
Item2
    Item2Child
    AnotherChild
Item3
The Javascript tree creation line is created by another Perl package. This package recursively checks to see if there are replies to the current annotation. If there are, it sees if those replies can be viewed by the current client, and if the client can view the reply, it adds it to the tree creation line. It then checks to see if that reply has any replies before it goes on to the other sibling replies. In other words, it creates the tree using a recursive, left-to-right, depth-first algorithm.
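The recursion can be sketched as follows. This is a Javascript illustration, not the Perl package itself: the in-memory reply shape `{ title, replies }` and the `canView` callback are assumptions, and the sketch also assumes that replies beneath a hidden reply are skipped along with it.

```javascript
// Quote a title for the items string, escaping backslashes and quotes.
function quoteTitle(title) {
  return "'" + title.replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Build the items string depth-first, left to right. A reply's children
// appear as a nested list immediately after the reply's own title.
function buildItems(replies, canView) {
  const parts = [];
  for (const reply of replies) {
    if (!canView(reply)) continue; // hidden reply: skip it and its subtree
    parts.push(quoteTitle(reply.title));
    const children = buildItems(reply.replies, canView);
    if (children !== "[]") parts.push(children);
  }
  return "[" + parts.join(",") + "]";
}
```

Given a structure matching the earlier example, `buildItems` produces the same items string that is passed to the tree constructor.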
When a client clicks on the reply button of an annotation, a script runs that loads the appropriate add annotation Web page for a reply annotation. When the submit button is pressed, the annotation is stored. Reply annotations are stored in the same form as basic annotations. The fields of the annotation file that are not necessary for reply annotations, such as the annotated text, are left blank. This promotes uniformity and requires only one command to read and process an annotation file. Whether it is a basic annotation or a reply annotation will not affect the way that it functions.
A reply annotation file is named in the following manner. The extension is removed from the name of the annotation file that is being replied to. Next, _#.r is added to the end of the file name, where # is a number that will make the file name unique. In addition, a reply index is made in a file that gets its file address name from the annotation file being replied to with its extension changed to .ri. This reply index has one reply on each line. The lines are made up of the address of the reply annotation file, a vertical bar, and the title of the reply annotation. The reply index allows the reply tree to be created very easily in the recursive tree creation script (See Section 3.3.1).
The following is an example of what files would be contained in a directory for a typical Web page with 1 basic annotation that has 2 replies.
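The naming rules for reply files and reply indexes can be sketched in Javascript as below. The function names are hypothetical, and the choice to start numbering at 1 is an assumption; the rules themselves only require that the number make the file name unique.

```javascript
// Name of the n-th reply to the annotation stored in `parentFile`:
// drop the parent's extension, then append "_n.r".
function replyFileName(parentFile, n) {
  return parentFile.replace(/\.[^.]*$/, "") + "_" + n + ".r";
}

// Name of the reply index for `parentFile`: same name, extension ".ri".
function replyIndexName(parentFile) {
  return parentFile.replace(/\.[^.]*$/, ".ri");
}

// Parse a reply index: one "file|title" entry per line. The split is on
// the first bar, so a title may itself contain a bar.
function parseReplyIndex(contents) {
  return contents
    .split("\n")
    .filter((line) => line.includes("|"))
    .map((line) => {
      const bar = line.indexOf("|");
      return { file: line.slice(0, bar), title: line.slice(bar + 1) };
    });
}
```

Under these assumptions, a page with one basic annotation stored as 1.txt and two replies to it would have a directory holding 1.txt, 1.ri, 1_1.r, and 1_2.r. A reply to the first reply would add 1_1.ri and 1_1_1.r.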
As it says in Section 2.2, if the client viewing an annotation is the author of that annotation, a delete button is present in the top frame of the Annotations window. When this button is pressed, a script generates a confirmation page that includes the title and text of the annotation in order to make sure that the client really wants to delete the annotation. If they choose not to delete the annotation, that annotation and the entire reply tree are redisplayed. If they do press delete, a second script checks that the annotation does not have any replies. If it does, the annotation cannot be deleted until those below it are deleted. This is an attempt to stop broken reply threads. A reply to an annotation that does not exist is not worth much.
Another useful feature is that the System Administrator, using the username sysadmin, can delete any annotation, as long as it does not have any replies. If it does have replies, the System Administrator can simply delete all the annotations below it, since they have full rights to do so, and then delete the annotation itself.
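The deletion rules can be summarized in a short sketch. This is an illustration in Javascript with hypothetical field names (`author`, `title`, `replies`), not our Perl scripts. The second function shows the bottom-up order in which the sysadmin user would have to delete a whole branch so that no annotation is removed while it still has replies.

```javascript
// May `username` delete this annotation right now?
function canDelete(annotation, username) {
  const permitted = username === "sysadmin" || username === annotation.author;
  if (!permitted) return { ok: false, reason: "not permitted" };
  if (annotation.replies.length > 0) return { ok: false, reason: "has replies" };
  return { ok: true };
}

// Order in which a branch must be deleted: every reply before its parent.
function deletionOrder(annotation) {
  const order = [];
  for (const reply of annotation.replies) {
    order.push(...deletionOrder(reply));
  }
  order.push(annotation.title);
  return order;
}
```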
Allowing the System Administrator to delete all annotations is an important feature. This can help to stop obscene messages as well as allow clients who really want to have a message that they wrote be deleted despite the fact that there have been replies made to it. This would, of course, only happen if the client contacted the System Administrator with their concerns.
One problem with allowing the sysadmin user to delete all annotations is that they can also view all annotations, even private ones. This is not too much of a problem but ALL clients of HyperPass must be informed of the possibility that their "private" annotations could potentially be seen by the system administrator. It is also important that the clients realize that these private annotations will not be deleted or altered in any way unless an entire reply branch was eradicated. They will be treated as if they don't exist except in the most extreme situations.
The need for a search engine was clear, but it was not obvious what kind of search was needed. After some research, we decided to implement a keyword search, an exact string search and an author search. A keyword search allows clients to easily find annotations about certain topics. An exact string search is useful to quickly locate more specific information. Finally, an author search permits readers to find annotations written by a specific author.
When the reader clicks the "Search" button, a new window will open containing a form with a text field and radio buttons to choose "Keyword", "Exact String", or "Author" (See Section 3.5). After the reader enters a string and chooses the search type, a Perl script is run. This script first checks whether the search should be keyword, exact string, or author. If it is a keyword search, the string is broken into separate words. Then each annotation for that page is searched. If a match is found that is not a repeated match, the title and line matched are displayed, along with a link to view the complete annotation. This annotation is displayed in the Annotations window if one is open. Otherwise, the script opens the Annotations window in which to display the annotation (See Section 3.5).
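The three search modes can be sketched as below, again in Javascript for illustration. Several details here are assumptions, since the description above leaves them open: matching is case-insensitive, keywords match as substrings of a line, only the annotation text (not the title) is searched, and the record fields `id`, `title`, `author`, and `text` are hypothetical.

```javascript
// mode is "keyword", "exact", or "author".
function searchAnnotations(annotations, query, mode) {
  const needle = query.trim().toLowerCase();
  const words = needle.split(/\s+/).filter(Boolean);
  const results = [];

  for (const annotation of annotations) {
    if (mode === "author") {
      if (annotation.author.toLowerCase() === needle) {
        results.push({ id: annotation.id, title: annotation.title, line: null });
      }
      continue;
    }
    for (const line of annotation.text.split("\n")) {
      const haystack = line.toLowerCase();
      // A line is reported once even if several keywords match it,
      // which avoids repeated matches.
      const matched = mode === "exact"
        ? needle !== "" && haystack.includes(needle)
        : words.some((word) => haystack.includes(word));
      if (matched) {
        results.push({ id: annotation.id, title: annotation.title, line });
      }
    }
  }
  return results;
}
```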
While using HyperPass, there will be no more than four windows open at one time. First, there is the main browser window. This window usually displays the original Web page that the client views. It is also the target of any links that are in an annotation. The second window is the Annotations window. This window is where annotations are displayed and added. The third window is the Search window. As its name makes clear, this is where the client searches the annotations. The final window is the Help window. This window pops up when the "Help" button is pressed. It simply describes how to use different parts of the annotation system.
We are the first to admit that our interface has biases. Some decisions were made for ease of implementation. Others came from personal design biases. While we have tried to build a user-friendly, intuitive interface, the software development industry has repeatedly shown that programmers need user input on design issues. This section is dedicated to our attempt at gathering user comments and suggestions.
We asked college students with a variety of computer experience to use the annotation system. They were first given a short introduction to using the program, then directed to a sample annotated page to get started. Once they became comfortable, they were allowed to browse the Web while experimenting with the annotation system. We asked each subject to fill out a survey when they were finished. We asked questions such as did they like the look of the interface, was it easy to use, was it intuitive and what would they do differently if they designed their own system.
We received many useful comments from the testing. Some of the suggestions were quickly and easily implemented. For example, a few subjects viewed annotations with links (<a href> tags) in the text. When they selected the link, the page opened in the top frame of the Annotation window they were viewing. They commented that it would be preferable if the link opened in the main browser window. Another complaint was that when clicking on a button that brought up a window, that window did not come to the top (i.e., in front of all other windows). As a result, the user sometimes did not know a window had been opened. Both of these suggestions were easy to implement, and we made both changes immediately.
We also received a great deal of feedback that dealt with the interface as a whole. Some subjects expressed an interest in having more control over the "look and feel" of the system. More precisely, they wanted the ability to change the color, font, size, etc. of the text, windows, and buttons. Another person suggested that the annotations be inserted directly in the document instead of viewing them in a pop-up window. These comments will likely be implemented as options, or preferences, that can be supplied by individual users (See Section 5.1).
HyperPass is not complete at this time. Though the system works, there is still a lot to be done. Due to the amount of time that we had to initially implement the system and do some beginning testing, we were unable to implement the following features.
We wish to allow clients to be able to customize HyperPass. This would allow only specified annotations to appear as well as allow the client to specify certain annotations that should not be displayed. For example, the client could decide to view only annotations made by John Doe or even view all annotations except for those by John Doe. Or the client could view just the annotations made for the group "Workers" or only his private annotations.
Another thing that will be set in the preferences is the default security setting for new annotations. As is, the default annotation security setting is public.
We will also allow users to display annotations on the original document as opposed to a separate window mainly for purposes of printing the document with all of its annotations in a single shot. We have already developed support for this but the actual ability to turn it on or off has not been implemented.
Currently our search function is very basic. We intend to add greater functionality as the program continues to be developed. For example, the search could be expanded to take boolean expressions (and, or, not).
Similar to the search capability, we would like to add a sorting function. This would allow readers to view a list of annotations according to a particular criterion (e.g., sort annotations by date, author or, possibly, subject).
We have taken a few precautions when storing and inserting annotations so that page changes will not mess up existing annotations (See Section 3.1). Yet, we have not gone as far as we wish. The use of an approximate string matching algorithm when we insert annotations will be our next goal in this realm. We would also like to develop a system that will keep track of annotations that really could not be inserted into the document, if the document has completely changed from top to bottom, so that these invisible annotations can be dealt with in an appropriate manner.
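As one possible direction, and not an algorithm we have chosen or implemented, approximate matching could be sketched with an edit-distance comparison over a sliding window. The Javascript below is deliberately naive: it compares the target against every window of the same length, so it tolerates substituted characters well but handles inserted or deleted characters only roughly, and it is too slow for long pages. The function names and the `maxDistance` threshold are hypothetical.

```javascript
// Classic Levenshtein edit distance, computed one row at a time.
function editDistance(a, b) {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost  // substitution or match
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Return the start of the window closest to `target`, or -1 if no
// window is within `maxDistance` edits. Ties go to the earliest window.
function fuzzyFind(html, target, maxDistance) {
  let best = { position: -1, distance: maxDistance + 1 };
  for (let start = 0; start + target.length <= html.length; start++) {
    const window = html.slice(start, start + target.length);
    const distance = editDistance(window, target);
    if (distance < best.distance) {
      best = { position: start, distance };
    }
  }
  return best.position;
}
```

A result of -1 would mark the annotation as one that could not be inserted, so it could be kept aside and dealt with separately.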
If the technology ever becomes available, we wish to allow users to highlight the text they want to annotate and then press the annotate button. This technique would make the program even easier to use, but we do not want to confine it to certain browsers. Internet Explorer (IE) supports grabbing highlighted text at this time, but it is only an IE command and would therefore not be supported by Netscape. One of our main goals is to make our annotation system available to most people, so it should be compatible with IE and Netscape, the two most popular browsers, at the very least.
We feel this annotation system offers great possibilities for giving readers more freedom on the World-Wide Web. Annotations could be useful to keep private notes, share information, pose questions, or spark discussions. Despite these benefits, there are potentially many concerns, both technical and social, that arise. We will present some of those in this section of the paper.
One feature of the Web is the flexibility it provides. One of the server-side features is the ability to have multiple URLs leading to the same page (e.g., an optional default.htm or index.html). This led to problems in the annotation system, since it requires checking whether two pages' contents are the same instead of just their URLs. Checking the content for each page would take a considerable amount of time and bandwidth.
In GVU's 10th WWW User Survey, 60% of more than 3000 participants cited speed as one of the biggest problems when using the Web [GVU 98, http://www.gvu.gatech.edu/user_surveys/survey-1998-10/graphs/use/q11.htm]. Unfortunately, adding content to each page, as the annotation system does, increases the time it takes to display a page. One possible solution is allowing the client to turn the loading of annotations on and off easily (See Section 5.1). Another possibility for the future is permitting the caching of annotations, along with pages.
Since the annotations are associated with text from the document, we run into trouble if the document changes (e.g., the author of the page corrects the spelling of a word or updates a date). The annotation system should be smart enough to recognize that the text is close enough to the original and, therefore, still insert the annotation. This may require an approximate string matching algorithm and various checks (See Section 5.3).
Many of the concerns surrounding annotation systems are social, rather than technical, such as factoring in human nature, the law, and moral obligations.
The ability to make annotations on Web pages that other people may have written brings up many interesting copyright issues. Does this violate an author's right to determine what appears on his/her document?
During the development of HyperPass, we have discussed this issue in great detail. A couple of precautions have been taken to help calm the minds of those who do not look kindly on the idea of having their Web pages annotated. First, all of the annotations on a particular Web page are clearly marked. The possibility that a client who views the page would think that the annotation is actually a part of the Web page itself is very small. In addition, all annotations have the potential to be deleted by the system administrator. This could be very useful in the case of defacing, obscene, or otherwise inappropriate annotations.
A very powerful option that has been included in the implementation of Ravel, the server that runs the HyperPass plugin, is the ability of the author of a Web page to include a meta tag in their document that prevents plugins from being run on their page. This meta tag can also be used to turn off only certain plugins, or even to prevent a specific user or group of users from running plugins on their page. In this manner, we allow authors to control their documents.
In addition to precautions, we also discussed the idea behind an annotation system and whether it seemed to violate copyright laws itself. Our personal idea of what an annotation system should be is something that allows clients to make marks on electronic documents just as they can on paper ones. The right to write and share paper documents with others has been very useful to the educational world. Students are encouraged to take notes in the margins or near the text. The annotation system does little more than this.
In addition to the similar mark-ups that are done with paper documents, there seems to be precedent in past court cases to allow a work to be changed and immediately displayed as long as it is not in permanent form. In Lewis Galoob Toys, Inc. v. Nintendo of America, 964 F.2d 965 (9th Cir. 1992), it was determined that "[a] derivative work must incorporate a protected work in some concrete or permanent 'form'" [Lewis Galoob Toys, Inc. v. Nintendo of America, Inc. 1992]. HyperPass, like Galoob's Game Genie, the product in question in the 1992 case, is of no use without the underlying Web page itself and does not copy the original document. It simply alters the document data before displaying it. This altered data is never fixed.
The beauty of an annotation system is the capability it provides to leave "notes" on arbitrary Web pages. Unfortunately, this could lead to abuse by SPAMmers. We plan to include filtering services to block annotations from certain users (See Section 5.1), but this may not be enough. Another idea is to incorporate a democratic rating system, whereby other users decide what annotations are worthwhile.
Another potential problem is content that may not be acceptable for all audiences. Content filtering may be an option to detect R-rated material [ComMentor http://www-diglib.stanford.edu/diglib/pub/reports/commentor.html].
After a few months of development, HyperPass is well designed and effectively meets its goal of creating an educational learning environment on arbitrary Web pages. The initial testing of the product has brought up several new ideas that will be implemented in the future as well as several comments that were quickly incorporated into the project.
We gratefully acknowledge Professor Sam Rebelsky for his guidance throughout this project and Andrew Kensler for his assistance. In addition, we thank those students who took the time to test HyperPass. This work was supported by Grinnell College, the Grinnell College Noyce Science Summer Research Fund, and the Robert N. Noyce Faculty Study Grant.