Summer 2024
Digital Ker
Project Lead: Elaine Treharne
Stanford Text Technologies, with the support of Stanford’s Center for Spatial and Textual Analysis and the Vice Provost for Undergraduate Education, has been working since 2022 on producing an updated, and now digital, version of Neil Ripley Ker’s Catalogue of Manuscripts containing Anglo-Saxon, which was published by Clarendon Press in 1957. Originally intended for Oxford University Press, this is now an Open Access resource: manipulable and quantifiable data is augmented with additional material on Neil Ker himself, on his publications, his previously unpublished archival sources, and contemporary scholarship that has advanced Ker’s foundational work.
Project Members
Project Team
Elaine Treharne
Professor of English
Charlotte Cao
Undergraduate Researcher - Summer, 2024
Exhibiting and Visualizing Neil R. Ker’s Works at Stanford
As an intern on the “Text Technologies” project, I built on prior research on Neil R. Ker’s (1908-1982) Catalogue of Manuscripts Containing Anglo-Saxon. Ker, renowned for his contributions to medieval paleography, created this book as a guide to all Early English literature written before the thirteenth century; each entry describes the contents and physical characteristics of a different medieval manuscript containing English, and its comprehensiveness makes the book integral to modern scholarship. The book is not, however, easily accessible to the public or to readers in general. My role as an intern was to address this problem in two ways: by designing a website that would host a digitized version of the catalogue, and by creating several visualizations that would enable a greater understanding of how the manuscripts relate to one another.

Though everyone involved in this project (Professor Treharne; the previous project manager, Eren Yurek; and I) knew that we wanted to create a website, we had not decided which platform to use. For the first few weeks, we considered GitHub, Omeka S, and WordPress, ultimately deciding on GitHub. We thought it best to use Jekyll, a static site generator: not only would this make the website sustainable, as static sites are less prone to glitching when updates are released, but Jekyll also offers pre-designed themes that users can build their own sites on.
After I had finished setting up the layout of the website, I began adding content. Previous interns had worked on digitizing the individual manuscripts, and my job was to compile the PDFs. Currently, the website houses this digitized and searchable version of Ker’s 1957 Catalogue as well as previously unpublished lectures of Ker’s that have since been transcribed by my fellow intern Creagh Factor.
This website is also unique in that it contains the visualizations I created. The first step in this process was to write a Python script that would identify and extract key information such as the “provenance,” “shelfmark,” “date,” and more from each digitized document. This information was compiled into a spreadsheet, and after I had cleaned the data, I used tools such as R and Flourish to construct five visualizations. Each aims to re-contextualize how scholars view manuscript production pre-1200.
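The extraction step described above might look something like the following. This is only a minimal sketch, not the project's actual script: it assumes each digitized entry is plain text in which a field appears on its own line as a label followed by a colon, which may not match the real layout of the documents. The field list, the function names `extract_fields` and `records_to_csv`, and the sample entry in the usage note are all hypothetical.

```python
import csv
import io
import re

# Hypothetical field labels; the real entries may be laid out differently.
FIELDS = ("shelfmark", "date", "provenance")

# Matches one labelled line, e.g. "Date: s. xi" (label, colon, value).
FIELD_PATTERN = re.compile(
    r"^(?P<label>" + "|".join(FIELDS) + r")[ \t]*:[ \t]*(?P<value>.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_fields(entry_text):
    """Return a dict with one key per field; missing fields map to ""."""
    record = {field: "" for field in FIELDS}
    for match in FIELD_PATTERN.finditer(entry_text):
        label = match.group("label").lower()
        # Keep only the first occurrence of each label.
        if not record[label]:
            record[label] = match.group("value")
    return record


def records_to_csv(records):
    """Serialize extracted records as CSV text, ready for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `extract_fields("Shelfmark: Example Library MS 1\nDate: s. xi")` on this invented entry would return the shelfmark and date and leave the provenance empty, which makes gaps easy to spot when the data is cleaned afterwards.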
Figure 2. In his Catalogue, Ker assigned each manuscript a specific genre. For the sake of readability, we could not graph every genre, so we graphed the seven most common genres as determined by Ker and our research team.
We have discovered a great deal about these surviving manuscripts: “glosses” are the most common genre, and while Ker found manuscripts written as early as AD 550, most that still exist appear to have been written in the first half of the eleventh century. As a research team, we were also particularly interested in delving deeper into Ker’s own perspectives on these manuscripts. We therefore analyzed the length of his manuscript entries, discovering that the median entry length was around 229 words and that most entries are between 121 and 550 words; one entry (Item 331), however, contains over 4000 words.
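An entry-length analysis of this kind can be sketched in a few lines of Python. The example below is illustrative only: it assumes entries are available as a mapping from item number to entry text, counts words by splitting on whitespace, and uses quartiles as one possible way of describing where most entries fall. The source does not say how the 121 to 550 range was derived, and the function names are hypothetical.

```python
import statistics


def word_count(text):
    """Count words by splitting on whitespace (an assumed definition)."""
    return len(text.split())


def summarize_lengths(entries):
    """Summarize entry lengths for a mapping of item number -> entry text.

    Needs at least two entries, because quartiles are undefined otherwise.
    """
    counts = {item: word_count(text) for item, text in entries.items()}
    # quantiles(n=4) returns the three cut points between the quarters.
    q1, _, q3 = statistics.quantiles(counts.values(), n=4)
    return {
        "median": statistics.median(counts.values()),
        "q1": q1,
        "q3": q3,
        "longest_item": max(counts, key=counts.get),
    }
```

Reporting the longest item alongside the median makes outliers such as a single very long entry visible without letting them distort the summary.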
Figure 3. Medieval dating conventions are not very specific: they point to a date range rather than a specific year. As a result, there is uncertainty in this graph; all of the dates included are approximations based on Ker’s best knowledge as well as our own.
There were high levels of uncertainty when working with this data, as is expected when it comes to medieval texts. Of the 480 manuscripts, only 268 could be tied to a specific provenance, or city of origin. However, because we were still curious as to which cities were most integral to the production of pre-1200 manuscripts, I decided to use ArcGIS Online to map the manuscript localizations that we did know. The size of each dot represents the number of manuscripts localized to that city (i.e., the larger the dot, the more manuscripts), while the transparency of the dot reflects the average word count of Ker’s descriptions of those manuscripts (i.e., the higher the average word count, the darker the dot). Ultimately, I found that the English cities of Canterbury, Worcester and Winchester produced the most manuscripts (with 43, 39 and 26, respectively).
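The per-city figures behind such a map can be prepared before anything is loaded into a mapping tool. The sketch below shows one way to do this in plain Python; it is not the ArcGIS workflow itself. It assumes each manuscript is a pair of city and word count, skips manuscripts with no known city, and scales opacity linearly between two arbitrary bounds. The linear scale, the bounds, and the function names are all assumptions.

```python
from collections import defaultdict


def summarize_by_city(rows):
    """Group (city, word_count) pairs; rows with no city are skipped."""
    grouped = defaultdict(list)
    for city, words in rows:
        if city:  # None or "" means the provenance is unknown
            grouped[city].append(words)
    return {
        city: {"manuscripts": len(counts), "mean_words": sum(counts) / len(counts)}
        for city, counts in grouped.items()
    }


def add_opacity(summary, low=0.2, high=1.0):
    """Add an 'opacity' value per city, scaled linearly by mean word count."""
    if not summary:
        return summary
    means = [stats["mean_words"] for stats in summary.values()]
    smallest, span = min(means), max(means) - min(means)
    for stats in summary.values():
        # If every city has the same mean, draw all dots fully opaque.
        fraction = (stats["mean_words"] - smallest) / span if span else 1.0
        stats["opacity"] = low + fraction * (high - low)
    return summary
```

The `manuscripts` value would drive the dot size and `opacity` the darkness of the dot, so a city with few but lengthily described manuscripts appears as a small, dark dot.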
Figure 4. Clicking a city on the ArcGIS map opens a pop-up with further information on each of the manuscripts written there: the item number, the shelfmark, the medieval provenance, the date, the title/genre, and the total word count of Ker’s description.