Many online projects in the digital humanities have an implied planned obsolescence, which means that they will degrade over time once their content and software libraries cease to receive updates (Fitzpatrick 2011). We presented papers at Digital Humanities 2017, 2018, and 2019 that explored the abandonment and average lifespan of online projects in the digital humanities (Meneses and Furuta 2017), contrasted how things had changed over the course of a year (Meneses et al. 2018), and introduced a preservation strategy based on creating standalone software executables (Meneses et al. 2019). However, managing and characterizing the degradation of online digital humanities projects is a complex and pressing problem that demands further analysis.

In this sense, “planned obsolescence” is a nuanced designation, as there are many successful projects in the digital humanities that are shifting their focus from active development to data management (for example: http://cervantes.dh.tamu.edu). These are cases where a project’s online presence has not received updates for some time, but its online tools are stable and continue to be accessed by its users. However, if updates are not applied to the infrastructure or content of a project over time, web requests will eventually start generating errors on the server or the client, affecting the overall user experience (Nowviskie and Porter 2010). These examples show why the rules for traditional resources do not fully apply and why new metrics are needed to identify issues concerning online projects in the digital humanities.

In this study we explore in greater depth the distinctive signs of abandonment in order to quantify the planned obsolescence of online digital humanities projects. Our workflow draws on every project included in the Book of Abstracts published after each Digital Humanities conference from 2006 to 2019.
We then periodically create a set of WARC files for each project, which we process using Python (van Rossum 1995) and Apache Spark (Apache Software Foundation 2017) to statistically analyze the retrieved HTTP response codes, the number of redirects, and DNS metadata, and to examine in detail the contents and links returned by traversing the base node. This combination of metrics and techniques has allowed us to assess the degree of change of a project over time. As one of the results of our 2019 presentation, we claimed that the most important signature of degradation comes from assessing the validity and overall health of the topology of links in a project. Thus, the focus of our study is analyzing this key signature.

We acknowledge that research on the preservation of projects in the digital humanities is also carried out by other groups (Larrousse and Marchand 2019; Arneil, Holmes, and Newton 2019). However, our study differs in that it focuses on two points: first, identifying the signals of abandoned projects using computational methods; and second, quantifying their degree of abandonment. In the end, we intend this study to be a step towards better preservation strategies for the planned obsolescence of online digital humanities projects.
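
As an illustration of the link-health signature discussed in this abstract, the following is a minimal Python sketch of how HTTP response codes and redirect counts for a project's links might be summarized into a single ratio. It is not the authors' pipeline: it does not read WARC files or use Apache Spark, and the `LinkRecord` schema, the category labels, and the `broken_ratio` measure are all hypothetical names introduced here for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LinkRecord:
    """One link found while traversing a project's base node (hypothetical schema)."""
    url: str
    status: Optional[int]  # final HTTP status code; None if DNS or the connection failed
    redirects: int = 0     # number of redirects followed before the final response


def classify(record: LinkRecord) -> str:
    """Map a record to a coarse health label (labels are illustrative assumptions)."""
    if record.status is None:
        return "unreachable"
    if 200 <= record.status < 300:
        return "redirected" if record.redirects > 0 else "ok"
    if 400 <= record.status < 500:
        return "client_error"
    if 500 <= record.status < 600:
        return "server_error"
    return "other"


def link_health(records: Iterable[LinkRecord]) -> dict:
    """Count labels and compute the share of links that no longer resolve."""
    counts = Counter(classify(r) for r in records)
    total = sum(counts.values())
    broken = counts["unreachable"] + counts["client_error"] + counts["server_error"]
    return {
        "total": total,
        "counts": dict(counts),
        "broken_ratio": broken / total if total else 0.0,
    }


# Made-up records for demonstration only; these are not data from the study.
example_records = [
    LinkRecord("http://example.org/", 200),
    LinkRecord("http://example.org/missing", 404),
    LinkRecord("http://gone.example.org/", None),
    LinkRecord("http://example.org/moved", 200, redirects=2),
]
summary = link_health(example_records)
```

Under these assumptions, a ratio that rises across periodic crawls of the same project would be one possible signal of increasing abandonment; where to place a threshold is left open here.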
Hosted at Carleton University, Université d'Ottawa (University of Ottawa)
Ottawa, Ontario, Canada
July 20, 2020 - July 25, 2020
Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.
Conference website: https://dh2020.adho.org/
Series: ADHO (15)