Perseus Project
Laura Pope




Introduction


Digital humanities have come a long way since the late 1990s, when researchers such as Massey-Burzio posited that scholars in humanities disciplines did not find the concept of digital humanities relevant or useful to performing research (Massey-Burzio, 1999, pp. 620, 630-5). Today, with new technology and computing tools, many researchers in the humanities are embracing the concept of digital humanities. According to Katz (2005), digital humanities now allow researchers to gather and manipulate data in ways that were unimaginable even several years ago (Katz, 2-5). The Perseus Project, hosted by Tufts University, is an example of the development of digital humanities over the last decade. The Perseus Project is not only one of the most extensive digital libraries in the humanities, but it also provides researchers with powerful computational research tools.

A Description of the Perseus Project

The Perseus Project’s digital library contains both primary and secondary sources on Ancient Rome and Greece, as well as Arabic materials. It contains more than texts: it also holds visual materials, such as maps and images dealing with archaeology and numismatics. However, the Perseus Project is more than a digital library. It provides the user with a number of interactive computing tools in order to accomplish its mission.

According to Gregory Crane, editor-in-chief of the Perseus Project, “Our larger mission is to help make the full record for humanity as intellectually accessible as possible to every human being, providing information adapted to as many linguistic and cultural backgrounds as possible” (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/research). The Perseus Project accomplishes this goal not solely by digitizing primary and secondary sources but also by using current technology, in particular open source software designed for extensibility.


History of the Perseus Project


In fact, from the project’s inception in 1985, its founder, Gregory Crane, understood the necessity of using technology to increase access. Crane first envisioned the Perseus Project as a graduate student in Classics at Harvard. As a student there, Crane felt that he had access to one of the best libraries in the world and decided to create a similar resource that would be accessible to users everywhere who were interested in the Classics (Wilson, 2000). The Perseus Project was launched in 1992 as a CD-ROM-based resource but later shifted to a web-based library in order to provide wider accessibility to users (Ludwig, 2000). According to the Perseus Project’s mission statement, access is provided to the user through three categories (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/research). This paper will examine these three access categories and how the Perseus Project uses different computational tools to support them.

Categories of Access


The first access category discussed in the Perseus Project mission statement is human readable information. This is the access category that most resembles a traditional, physical library. It is defined as all the items in the digital library, including maps, images, illustrations, inscriptions, and texts: “In this stage digital representations provide access to the physical senses of actual people in particular places and times” (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/research). At this stage the user is not involved in digital humanities, as there is no interaction between the user, the texts, and computational tools or software.

The second category of access described in the Perseus Project’s mission statement is called machine actionable knowledge (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/research). These are the resources on the Perseus Project’s website that allow the user to access the materials in the digital library in a meaningful way. They include the online catalog, encyclopedia articles, lexicon entries, and other structured information resources. It is access to these structured information resources that provides the user with the appropriate context for understanding the primary sources more fully. The Perseus Project’s mission statement describes how this works when the user accesses resources in the digital library:

Thus, if we encounter a page from a Greek manuscript of Homer, we could at this stage find cleanly printed modern editions of the Greek, modern language translations, commentaries and other background information about the passage on that manuscript page. If we moved through a virtual Acropolis, we could retrieve background information about the buildings and the sculpture (http://www.perseus.tufts.edu/hopper/research).

Machine actionable knowledge allows the user to access not just the bare words of a text or the visual surface of an image but also to place such items in an appropriate context.

The third and final category of access is described as machine generated knowledge. The Perseus Project mission statement describes this category of access as the analysis of automated information systems to generate new knowledge for the user (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/research). This can be seen as an interaction between the human readable information and the machine actionable knowledge. When a patron accesses a text, for example Caesar’s Gallic War, and comes across a word she does not recognize, such as facio, she can use a link to access a dictionary that will then parse the verb for her, thus generating new knowledge. As a library student and library user, I consider myself somewhat technology deficient, so initially I wasn’t sure what to make of the paradigm outlined in the Perseus Project’s mission statement. My initial response was that it might contain elements of academese. However, after re-reading and reflecting upon the mission statement and considering the technical terms more carefully, I believe that it is an apt description of what occurs when the user both accesses the Perseus Project digital library and uses the additional resources provided on the website.

Perseus Project and Open Source


The Perseus Project has developed its open source code as a way to pursue the goal of accessibility: “the project decided to build a new digital library system, designing it from the start to be modular, interoperable, and open-source.” The Perseus Project was innovative in developing a set of tools that allowed the user to experience the machine generated knowledge previously discussed. During the 2000s, when the Perseus Project was developing open source software to achieve this goal, few digital libraries were providing users with similar contextual tools. Most digital libraries at this time focused on allowing the user to locate materials but did not provide the resources that would allow the user to interpret the texts (Wilson, 2000; Perseus Project website, n.d., www.perseus.tufts.edu/hopper/opensource).
The Perseus Project’s mission statement is supported by a number of open source services. Linguistic support is provided by Perseus’ Java Hopper code, an open source, language-independent application that supports texts in Greek, Latin, and Arabic. A text in any one of these three languages is automatically linked to dictionary entries for the words in the text and to a morphological analysis of each inflected form. The Java Hopper also allows the user to generate statistics on word frequency within texts and supports lemmatization, that is, grouping the inflected forms of a word under its dictionary headword (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/opensource).
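The difference between raw word frequency and lemmatized frequency can be illustrated with a minimal Python sketch. This is not the Hopper's code: the lemma table, the function names, and the sample sentence are all hypothetical and were invented for illustration, and a real system would draw on a full morphological database and would have to handle ambiguous forms.

```python
import re
from collections import Counter

# Hypothetical toy lemma table: inflected Latin form -> dictionary headword.
# It ignores ambiguity (a form may belong to more than one headword).
LEMMAS = {
    "facit": "facio",
    "fecit": "facio",
    "faciunt": "facio",
    "belli": "bellum",
    "bello": "bellum",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequencies(text):
    """Count each surface form exactly as it appears in the text."""
    return Counter(tokenize(text))

def lemma_frequencies(text, lemmas=LEMMAS):
    """Count tokens by headword; unknown forms are counted under themselves."""
    return Counter(lemmas.get(token, token) for token in tokenize(text))

# Invented sample sentence, not taken from any real text.
sample = "Miles bellum facit. Consul belli finem fecit. Milites bello se parant et castra faciunt."

print(word_frequencies(sample)["facit"])   # 1: only the exact form is counted
print(lemma_frequencies(sample)["facio"])  # 3: facit, fecit, and faciunt together
```

Counting by headword rather than by surface form is what makes frequency statistics meaningful for heavily inflected languages, where a single verb may appear in dozens of different forms.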
Word frequency and lemmatization are two important areas of linguistic research (Hornblower & Spawforth, 1996, s.v. Homer). For example, word frequency and lemmatization allowed researchers such as Milman Parry to successfully argue that the Iliad and the Odyssey were each composed by one author. It is these techniques that allow researchers to continue to debate whether both works were written by the same author, known as Homer (Hornblower & Spawforth, 1996, s.v. Homer; Wyatt, 1999). It is amazing to think that research that was once painstaking is now accomplished with the click of a button.
The second open source application that Perseus Project provides for the user is contextualized reading:
Since the Hopper (Java) is the underlying code base for Perseus Digital Library, it reflects the same emphasis on being an integrated reading environment: much of its power derives not simply from isolated textual services, but in the knowledge that emerges from the interaction of the texts themselves (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/opensource).
In short, the Perseus Project does not just provide the user with texts; through the use of open source software, it allows the user to access resources that provide context for the primary texts found in the digital library.
The Perseus Project also offers the user a unique searching experience. Users are able to perform a variety of standard searches in any of the languages supported by the Java Hopper code, and they are also able to search texts and collections for all possible inflections of a word. This is an important search tool for inflected languages like Latin, Greek, and Arabic. Another feature is that the user is supposed to be able to search by using standard abbreviations for the primary texts. However, I found that a search for Hom. Hymn Dem. (Homeric Hymn to Demeter) did not actually bring up the relevant text among the eleven results I got. After viewing the tutorial, I tried several other standard abbreviations with similar results. Fortunately, typing in the full title of the text will bring up the queried text.
The Perseus Project has embraced other aspects of open source software that allow for an enhanced user experience. Extensibility was an important design feature for the Perseus Project:

On one hand, while the code is bundled with a collection of Greco-Roman and Arabic texts around which it has grown, users are able to include their own TEI-compliant XML texts as part of the reading environment and enable the same services for those texts as those that are available online for Perseus’ open source editions (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/opensource).

Users can request a download of the Java code used by the Perseus Project from the project’s webmaster and can use the application for texts not found in the Perseus Project digital library. The Perseus Project designed the Java code used for the Perseus digital library with an emphasis on modularity and extensibility. This provides the user with machine generated knowledge, such as the ability to calculate lemma and word statistics across the entirety of a collection and to have these results indexed for later searching.
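
The idea of loading a user’s own TEI text and indexing it so that a search on one form finds every inflection can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the Hopper’s actual Java implementation: the sample document, the TEI P5 namespace, the numbered <p> passages, the toy lemma table, and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: the text uses the TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented, minimal TEI-style document used only for illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <p n="1">Miles bellum facit.</p>
    <p n="2">Consul pacem fecit.</p>
    <p n="3">Milites castra faciunt.</p>
  </body></text>
</TEI>"""

# Hypothetical toy lemma table: inflected form -> dictionary headword.
LEMMAS = {"facit": "facio", "fecit": "facio", "faciunt": "facio", "milites": "miles"}

def passages(tei_xml):
    """Yield (passage number, plain text) for each <p> in the TEI body."""
    root = ET.fromstring(tei_xml)
    for p in root.iterfind(".//tei:body/tei:p", TEI_NS):
        yield p.get("n"), "".join(p.itertext())

def build_lemma_index(tei_xml, lemmas=LEMMAS):
    """Map each headword to the set of passages containing any of its forms."""
    index = defaultdict(set)
    for number, text in passages(tei_xml):
        for token in re.findall(r"[a-z]+", text.lower()):
            index[lemmas.get(token, token)].add(number)
    return index

def search(index, form, lemmas=LEMMAS):
    """Find passages containing any inflection of the word the user typed."""
    return sorted(index.get(lemmas.get(form, form), set()))

index = build_lemma_index(SAMPLE_TEI)
print(search(index, "facit"))    # ['1', '2', '3']
print(search(index, "milites"))  # ['1', '3']
```

Because the index is keyed by headword rather than by surface form, it can be built once for a whole collection and then reused for later searches and statistics.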

Critique of Perseus Project


The Perseus Project is an amazing resource for students, teachers, and researchers in the field of Classics or any other area of study covered by the Perseus Project. As a novice in technological matters, I feel that the technological aspects of the Perseus Project could be more clearly explained in the “Open Source,” “About,” and “Research” sections of the website. I think that these sections should be merged into one section so that the mission statement and the technological resources that support it can be more clearly connected. The Perseus Project has clearly evolved from a CD-ROM-exclusive resource to a widely accessible, open source digital resource. I think that more emphasis should be placed on this fascinating journey and on how the various stages of the Perseus Project brought the digital library closer to Gregory Crane’s original vision of a world class library accessible to students of the Classics and Humanities everywhere. While I found that the examples of the end results of using the technology were clear, I felt that some of the descriptions of the technology used for the Perseus Project digital library were vague and verged on the repetitive. This was particularly noticeable when the various access categories were described in the “About” section. Consider the following example, taken from the “About” section of the website:

The hopper source code also includes a number of services for managing named entities such as people and places, and has served as the foundation for visualization projects, plotting that data both geographically on a map and historically on a timeline. In terms of modularity, the hopper also includes a number of low-level classes for manipulating text -- from finding all possible lemmas for a given Latin form to delimiting an accented Greek word (Perseus Project website, n.d., www.perseus.tufts.edu/hopper/opensource).

I felt that this statement had already been covered in the previously discussed categories of linguistic support, contextualized reading, and searching. Vagueness aside, the Perseus Project still combines the facets of a traditional library (i.e., the ability to browse through texts and images) with computational tools that allow the user to analyze the texts and create new knowledge.



References

Hornblower, S. & Spawforth, A. (1996). The Oxford Classical Dictionary
(3rd ed.). Oxford: Oxford University Press.

Katz, S. N. (2005). Why Technology Matters: The Humanities in the Twenty-First
Century. Interdisciplinary Science Reviews 30(2), 105-118. Retrieved from EBSCOhost.

Ludwig, J. (2000). Site Offers Rich Experience in Classics and Ancient History.
Chronicle of Higher Education 46(49), A50. Retrieved from EBSCOhost.

Massey-Burzio, V. (1999). A Rush to Technology: A View from the Humanists.
Library Trends 47(4), 620. Retrieved from EBSCOhost.

Perseus Digital Library Project. Ed. Gregory R. Crane. 10/13/2011. Tufts University.
Site accessed 10/15/2011 http://www.perseus.tufts.edu.


Wilson, S. (2000). Navigating Ancient Worlds. Humanities 21(5), 18. Retrieved from
EBSCOhost.

Wyatt, W.F. (1999). Iliad. (A. T. Murray, Trans.). London: Harvard University Press
(Original translation published in 1924).