This paper describes the history, policy, semantics, and uses of the HathiTrust Research Center Extracted Features dataset, an open-access representation of the 17+ million volume HathiTrust Digital Library. It also covers a major current effort to extend computational access in more flexible and easily implemented ways, including a modern API supporting customizable visualizations and analyses.
2022
"It Was as Much Ours …": Reader Contributions to Teen Humor Fashion Comics
John A. Walsh
Inks: The Journal of the Comics Studies Society, 2022
Studies of participatory culture in comics frequently focus on the superhero genre and the participatory fandom communities and networks engaged with the superhero genre. This study focuses instead on participatory culture in the fashion subgenre of teen humor comics published from the 1940s through the early 1970s and again in the 1980s. I survey the many types of reader-contributed content in Archie Comics’ Katy Keene comics and Marvel Comics’ teen humor fashion comics such as Patsy Walker and Millie the Model. I examine the fan-led revivals of Katy Keene and Millie the Model in the 1980s. Finally, with the aid of an archival collection of reader submissions, I investigate the reader contributions to Marvel’s six-issue Misty comic created by Trina Robbins. These explorations seek to recover a vibrant participatory community of comic book readers, consisting primarily of young women and girls, actively engaged in producing the comics they are reading.
2021
Digital humanities in the iSchool
John A Walsh, Peter J Cobb, Wayne Fremery, and 8 more authors
Journal of the Association for Information Science and Technology, Jun 2021
The interdisciplinary field known as digital humanities (DH) is represented in various forms in the teaching and research practiced in iSchools. Building on the work of an iSchools organization committee charged with exploring digital humanities curricula, we present findings from a series of related studies exploring aspects of DH teaching, education, and research in iSchools, often in collaboration with other units and disciplines. Through a survey of iSchool programs and an online DH course registry, we investigate the various education models for DH training found in iSchools, followed by a detailed look at DH courses and curricula, explored through analysis of course syllabi and course descriptions. We take a brief look at collaborative disciplines with which iSchools cooperate on DH research projects or in offering DH education. Next, we explore DH careers through an analysis of relevant job advertisements. Finally, we offer some observations about the management and administrative challenges and opportunities related to offering a new iSchool DH program. Our results provide a snapshot of the current state of digital humanities in iSchools which may usefully inform the design and evolution of new DH programs, degrees, and related initiatives.
2019
Encoding Newton’s Alchemical Library: Integrating Traditional Bibliographic and Modern Computational Methods
Meridith Beck Mink, Michelle Dalmau, Wallace Hooper, and 3 more authors
The Chymistry of Isaac Newton (http://chymistry.org) project team has digitized and encoded, following the TEI Guidelines, the complete corpus of Newton’s alchemical manuscripts, which total more than two thousand pages and over one million words. Newton cited more than five thousand published and unpublished works in these manuscripts; many of his annotations reference items in his own library, as he was an exceptionally dedicated reader of alchemical texts. Newton’s extensive citations and annotations provide a window into his alchemical research and practices, and serve as the basis for our authoritative bibliography of his alchemical sources. The bibliography is being developed as both a stand-alone reference work and an integrated resource with the alchemical manuscripts, providing additional context for Newton’s citations and florilegia. Once finished, the bibliography will provide complete, structured citations–which often would appear very abbreviated or incomplete in the manuscripts–that can be formatted to comply with modern bibliographic conventions and bibliographic management systems. Our bibliography will also link to digitized online versions of the source texts available through Early English Books Online, HathiTrust Digital Library, and other digital repositories. The citations include quasi-facsimile title page transcription, a technique used for bibliographic description of rare books, to enable richer forms of citation analysis. By analyzing the citations, we will be able to date Newton’s manuscripts, cluster manuscripts that cite the same or related sources, and, ultimately, generate network graphs that will reveal connections between the cited authors and texts and how they influence Newton’s own ideas and work.
Safe Open Science for Restricted Data
Beth A Plale, Eleanor Dickson, Inna Kouper, and 5 more authors
Open science is prompting wide efforts to make data from research available for broader use. However, sharing data is complicated by important protections on the data (e.g., protections of privacy and intellectual property). The spectrum of options existing between data needing to be fully open access and data that simply cannot be shared at all is quite limited. This paper puts forth a generalized remote secure enclave as a socio-technical framework consisting of policies, human processes, and technologies that work hand in hand to enable controlled access and use of restricted data. Based on experience in implementing the enclave for computational, analytical access to a massive collection of in-copyright texts, we discuss the synergies and trade-offs that exist between software components and policy and process components in striking the right balance between safety for the data, ease of use, and efficiency.
2018
"The Spider’s Web": An analysis of fan mail from Amazing Spider-Man, 1963–1995.
J. A. Walsh, Shawn Martin, and Jennifer St. Germain
This article examines one of the decision-making moments in Petrarch’s editing of his Rerum vulgarium fragmenta (Rvf), when he erases the ballata "Donna mi vene spesso ne la mente" from the partial holograph Vaticano Latino 3195 and inserts the madrigal "Or vedi amor che giovenetta donna" over the erasure, creating a dynamic, palimpsestic relationship between the erased ballad and the Rvf. This shift in the making of the text represents the heart of the artistry of Petrarch’s visual poetics. The study of erasures and palimpsests in the partial holograph and other significant early witnesses helps us understand and trace the history of the work and affects our consideration of modern and contemporary editions of the text. The Petrarchive digital edition of the Rvf (http://petrarchive.org) implements new solutions, in the encoding and presentation of the edition, for exposing and highlighting the dynamic, palimpsestic features of Petrarch’s visual poetics.
2015
Literary empires: Mapping temporal and spatial settings of Victorian poetry
John A Walsh, D Becker, B Demarest, and 3 more authors
Information studies, from origins in the field of documentation, has long been concerned with the question, What is a document? The purpose of this study is to examine Christian icons (typically tempera paintings on wooden panels) as information objects, as documents: documents that obtain meaning through tradition and standardization, documents around which a sophisticated scaffolding of classification and categorization has developed, documents that highlight their own materiality. Theological arguments that associate the icon with the Incarnation are juxtaposed with theories on the materiality of the document and “information as thing.” Icons are examined as visual and multimedia documents: all icons are graphic; many also incorporate textual information. Icons emerge as a complex information resource: a resource, with origins in the earliest years of Christianity, that developed over centuries with accompanying systems of standardization and classification, a resource at the center of theological and political differences that shook empires, a primarily visual resource within a theological framework that affords the visual equal status with the textual, a resource with enduring relevance to hundreds of millions of Christians, a resource that continues to evolve as ancient and modern icons take on new material forms made possible through digital technologies.
Epigraph: "And crist was all, by reason as I preve, / Firste a prophete by holy informacion, / And by his doctryne, most worthy of byleve." (John Lydgate. Life of Our Lady. IV. II. 309–311)
Epigraph: "We confess and proclaim our salvation in word and images." (Kontakion of the Sunday of Orthodoxy)
Comic Book Markup Language: An Introduction and Rationale
Comics, comic books, and graphic novels are increasingly the target of serious scholarly attention in the humanities. Moreover, comic books are exceptionally complex documents, with intricate relationships between pictorial and textual elements and a wide variety of content types within a single comic book publication. The complexity of these documents, their combination of textual and pictorial elements, and the collaborative nature of their production share much in common with other complex documents studied by humanists – illuminated manuscripts, artists’ books, illustrated poems like those of William Blake, letterpress productions like those of the Kelmscott Press, illustrated children’s books, and even Web pages and other born-digital media. Comic Book Markup Language, or CBML, is a TEI-based XML vocabulary for encoding and analyzing comic books, comics, graphic novels, and related documents. This article discusses the goals and motivations for developing CBML, reviews the various content types found in comic book publications, provides an overview and examples of the key features of the CBML XML vocabulary, explores some of the problems and challenges in the encoding and digital representation of comic books, and outlines plans for future work. The structural, textual, visual, and bibliographic complexity of comic books makes them an excellent subject for the general study of complex documents, especially documents combining pictorial and textual elements.
2011
The liberty of invention: alchemical discourse and information technology standardization
The Chymistry of Isaac Newton project, an online scholarly edition of Newton’s alchemical manuscripts, has engaged in a process to include a number of core alchemical symbols into the Unicode standard, a standard for digital representation of characters and symbols from the world’s languages, scripts, and writing systems. Our article explores the relationship between information technology standardization and humanities research. We discuss Newton’s engagement with alchemy and explore the graphic dimensions of alchemical discourse. We illustrate this discussion with examples of Newton’s use of alchemical symbols. We examine Unicode itself, particularly a core Unicode principle distinguishing between the abstract character and the image or glyph of the character, and we discuss the tensions between this core principle and the representation of graphic, symbolic, and pictorial discourse. We describe our experience with the Unicode proposal process and illustrate again—this time with an organizational scheme for the symbols—how the technical standardization process forced a reexamination of our historical materials. Our conclusions reemphasize the potential for mutually beneficial relationships between certain types of information technology standardization and humanities research and suggest that study of the graphic qualities of alchemical discourse, especially in light of competing theories of text represented by standards like Unicode, may contribute to our understanding of the increasingly graphic, iconic, and pictorial nature of information and communication.
2010
"Quivering web of living thought": conceptual networks in Swinburne’s Songs of the Springtides
Funded in part by the Institute of Museum and Library Services (IMLS), the Indiana University Digital Library Program and Archives of Traditional Music are completing a two-year project to preserve and digitize the university’s extensive Hoagy Carmichael collections. When the Project ends in September 2000, the Project team will have preserved thousands of items, including sound recordings, photographs, sheet music, lyric sheets, and more, pertaining to the life and work of this master of the American popular song. Much of this content is already accessible to the public through a multimedia Web site. More digital content and improved search capabilities will be added in the coming months. While the Project builds upon previous experience and expertise, the complexity of the Project has presented numerous challenges. This paper describes some of these challenges and their resolution, along with a brief discussion of remaining issues.