Large-scale digital libraries, such as the HathiTrust Digital Library (HTDL) and the Internet Archive, have emerged consortially, collecting works from institutions around the world. This has led to uneven duplication: some works recur many times in the collections, while others have only one copy. The Massive Text Lab at the University of Denver is researching levels of ‘sameness’ and duplication of works within these digital libraries through massive-scale analysis. We will discuss applications to modern cataloging standards and provide an overview of the problem of duplication and its intricacies, the solutions the project is pursuing, and the value that our work provides in framing material relationships for future humanities scholarship.
Hosted at Carleton University, Université d'Ottawa (University of Ottawa)
Ottawa, Ontario, Canada
July 20, 2020 - July 25, 2020
475 works by 1078 authors indexed
Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.
Conference website: https://dh2020.adho.org/
Series: ADHO (15)