The place of models and modelling in Digital Humanities: Intersections with a Research Software Engineering perspective

Paper (long paper)
Authorship
  1. Arianna Ciula

    King's College London

  2. Paul Caton

    King's College London

  3. Ginestra Ferraro

    King's College London

  4. Brian Maher

    King's College London

  5. Geoffroy Noël

    King's College London

  6. Miguel Vieira

    King's College London

Work text

This paper aims to bridge the Digital Humanities (DH) and Research Software Engineering (RSE) communities. It argues that the production of models is the core contribution of RSE to the epistemology of DH. We adopt an inclusive definition of models and modelling which spans the whole range from ‘deformative’ to empirical modelling (see Smithies 2017: 168), including formal or predictive modelling, the technical solutions produced in the process, and the know-how, languages and documentation which accompany this production. From this wide perspective, models are also artefacts which can be studied across the history of science and of the humanities tradition, and in comparison with other modelling practices in science. RSE practice is grounded in a strong awareness of the experimental apparatus and in the iterative critique of models built (and often deflated) for a purpose. The challenge is to recognise the idiosyncrasy and situatedness of modelling practices and artefacts while devising methods to expose the scalability of the underlying workflows and modelling processes.

We reflect on the epistemology of DH from the practical perspective of our RSE lab, King’s Digital Lab (KDL), and the research processes embedded in our Software Development Lifecycle (fig. 1). The human element is at the core of the technical ecosystem we research and operate in. We acknowledge that KDL models and modelling are co-constitutive of human expertise, technical systems and operational methods, all aspiring to an environment conducive to open knowledge. This applies not only to development and management approaches (via the adoption of open standards and open source, and the exposure of open data) but also, more fundamentally, to sharing our processes (and the achievements and struggles around them) and to promoting open models.
We will use project-specific examples, including data modelling and knowledge representation practices, to demonstrate how some of our research into model-making processes challenges the perception of the technical work of RSE within DH as a stale, mechanistic and uncritical procedural activity.

Figure 1: KDL Software Development Lifecycle (SDLC) mapped to Agile DSDM project phases (Agile Business Consortium 2014).

KDL operates within a rather unique context. It claims its origins in the pioneering work of colleagues at King’s College London who were working in applied computing in the Humanities as early as the 1970s (see Short et al. 2012). However, the crossings between the RSE and DH communities at King’s and internationally have only recently begun to be highlighted and explored. We argue that the study and building of models is one of the dimensions via which these crossings emerge most vividly, with substantial epistemological implications and innovative ramifications for the field of DH as a whole.

The core practice of research in DH is modelling, which implies the translation of complex systems of knowledge or conceptual frameworks into computationally processable models or operational frameworks. Gooding (2003) claims that, in experimental settings, computational approaches are analogous to other processes of abstraction, measurement and contextual interpretation, whereby reduction of complexity is followed by expansion in the guise of a double funnel-shaped process. We can trace these processes of reduction and expansion in the RSE context too, where, for example, operationalisation formalises models into snippets of code or software components.
Languages other than translation into code also play a role in the process, however. The paper will reflect on models as artefacts of different kinds, expressed via a variety of languages (including but not limited to computational models) and produced during several phases of the SDLC (see Ciula and Smithies, forthcoming), such as:

  - negotiations around the meaning of the project’s units of analysis, documented in diagrams and definitions which shape requirements and an agreed project language;
  - paper or whiteboard sketches used to draft the solution architecture for a project in its feasibility assessment;
  - wireframes and static mockups of user journeys;
  - data models implementing the logical structure of a database;
  - statistical models implemented with ad hoc algorithms and code, or relying on tested formulas and existing libraries.

While specific instantiations of models (for example in relational databases) can have a rather short life in the RSE context and the projects where they were designed and developed, they often represent innovative solutions which have a long-standing effect. Indeed, while models are temporary pragmatic solutions to specific project challenges, at the scale KDL operates they are the backbone of the team’s tacit knowledge as well as the building blocks towards more generalisable and re-usable approaches.
They mediate and bridge the layers of the Lab’s socio-technical system: team expertise, data and technical systems.

Figure 2: Multilayered socio-technical system of the Lab, where concentric circles denote the co-constitution of team expertise, data and technical systems.

While we can refine existing approaches to sustain and expose modelling efforts in SDLC cycles which rely on RSE best practices, such as attention to documentation and re-use, there is also room for more innovative approaches, including a design-first culture, workflow integration across RSE roles, and the assessment and potential adoption of a set of modelling notation languages as exposure mechanisms for our models.

In alignment with a “critical modelling” approach (see Bode 2020), but also with a material culture and media literacy perspective, in this paper we aim to reflect upon models by looking at the wide epistemological implications of their production and use, at the responsibilities of modellers, and at how models come to be and what effect they have on the resources they help instantiate, and hence on interpretative processes of expansion as well as reduction.

A preliminary version of this paper was presented at the symposium Computational Text Analysis and Historical Change, held at Humlab, Umeå University (Sweden), 4-6 September 2019. The authors currently hold different roles at King’s Digital Lab (KDL), a Research Software Engineering (RSE) team hosted by the Faculty of Arts & Humanities at King’s College London, which provides software development and infrastructure to departments in the Faculty of Arts & Humanities while also collaborating with Social Science & Public Policy as well as a range of external partners in the higher education and cultural heritage sectors.
Data modelling, for example, is informed if not driven by communication and collaborative reasoning around more or less standardised graphical representations and notations, in phases of reverse engineering as well as in design methods. Note that KDL intends and uses ‘design methods’ in a wide sense, ranging from techniques of requirements elicitation in pre-project analysis to data modelling and wireframing in evolutionary development (see Bennet et al. 2005). Equally, the re-integration or expansion of modelling efforts into interpretative frameworks usually relies on verbal and visual language to document code or to explain the results of an experiment (Ciula and Marras 2019: 39). In this respect see, for example, the KDL checklist for the assessment of digital outputs within the UK Research Excellence Framework.

Models contribute to defining and redefining objects of study which come charged with layers of scholarship and analysis, with previous selections, biases, and political as well as ethical responsibilities. As creators of new memory regimes and intermediaries to the past, engaged in modelling efforts which interact with and affect the materiality of our objects of study, we bear responsibilities. In line with ongoing discussions around the representativeness and constraints of the digital archive, lucid analyses of modellers’ responsibilities in digital literary studies have exposed the gaps that propagate from the literary works we know were produced, to the material preserved in the analogue archive, to the selections of works that make it into digital archives, to further reductions in the creation of a corpus of analysis and, last but not least, to the application of statistical modelling techniques, which dictate additional powerful yet limiting constraints if not contextualised critically within an interlocked chain of bias.


Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO