Leveraging Google Sheets and GitHub for Data Curation on the Princeton Ethiopian Miracles of Mary Project

lightning talk
Authorship
  1. Rebecca Sutton Koeser

    Princeton University

  2. Nick Budak

    Princeton University

  3. Rebecca Munson

    Princeton University

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

The creation and curation of humanities datasets is an important scholarly activity that requires labor and expertise, and results in a research output that furthers scholarship (Elswit and Bench). As more scholars become interested in building datasets, we need better and simpler solutions for managing and publishing data. Many humanities datasets make sense as tabular or relational data, but not every scholar or project team has the skills, resources, or desire to create and manage a relational database.

As a possible solution to the gap between researchers' skills and technical requirements, we will demonstrate the tools we are prototyping to support data curation in The Princeton Ethiopian Miracles of Mary Project, adding lightweight infrastructure around Google Sheets and GitHub using generalizable scripts. Our approach is to model and structure the data as if implementing it in a relational database, but with the goal of creating a set of related sheets in a single Google Sheets spreadsheet, with data validation to link them and avoid redundant data entry (Belcher et al.).

We will show a Google Apps Script project that can create a new spreadsheet with configured sheets, fields, and data validation based on a JSON data structure. We will also demo a script that generates a regular, automatic export of the Google Sheets data and commits it to a GitHub repository, resulting in a versioned copy available for querying, visualization, automated validation, and interface prototyping and publication that leverage static site technology and minimal computing principles, as well as eventual data deposit for publication.
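The two steps described above can be illustrated with a minimal Python sketch. It is not the project's Google Apps Script code and calls no Google or GitHub API: it only shows how a JSON data structure might declare sheets, fields, and cross-sheet links, and how a sheet's rows might be serialized to CSV text suitable for committing to a repository. The schema layout, the `references` key, the sheet and field names, and the functions `plan_sheets` and `to_csv` are all assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical schema: sheet names, field names, and the "references" key
# are illustrative assumptions, not the project's actual configuration.
SCHEMA_JSON = """
{
  "sheets": [
    {"name": "Manuscripts",
     "fields": [{"name": "manuscript_id"}, {"name": "repository"}]},
    {"name": "Stories",
     "fields": [
       {"name": "story_id"},
       {"name": "manuscript_id", "references": "Manuscripts.manuscript_id"}
     ]}
  ]
}
"""


def plan_sheets(schema):
    """Return, for each sheet, its header row and the columns that need
    validation against a column in another sheet (a foreign-key analogue)."""
    known = {
        f"{sheet['name']}.{field['name']}"
        for sheet in schema["sheets"]
        for field in sheet["fields"]
    }
    plan = {}
    for sheet in schema["sheets"]:
        headers = [field["name"] for field in sheet["fields"]]
        rules = {}
        for field in sheet["fields"]:
            target = field.get("references")
            if target is None:
                continue
            if target not in known:
                raise ValueError(
                    f"{sheet['name']}.{field['name']} references "
                    f"unknown column {target}"
                )
            rules[field["name"]] = target
        plan[sheet["name"]] = {"headers": headers, "validation": rules}
    return plan


def to_csv(headers, rows):
    """Serialize rows (dicts) as CSV text with a fixed column order and
    line ending, so that repeated exports produce small, readable diffs."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


plan = plan_sheets(json.loads(SCHEMA_JSON))
stories_csv = to_csv(
    plan["Stories"]["headers"],
    [{"story_id": "s1", "manuscript_id": "m1"}],
)
```

In this sketch, `plan["Stories"]["validation"]` records that `manuscript_id` should only accept values from the `Manuscripts` sheet, which is the role data validation plays in linking related sheets. Writing each sheet to its own CSV file with a stable column order is one plausible way to keep the versioned copy easy to diff.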


Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO