Having a Ball: A Linked Data Approach to Fancy Dress in Colonial Australia

paper, specified "short paper"
Authorship
  1. Tommy Gatti

    Australian National University

  2. Terhi Nurmikko-Fuller

    Australian National University

  3. Paul Pickering

    Australian National University

  4. Ben Swift

    Australian National University

Work text

Introduction

The Lord Mayor's Costume Balls in Sydney in 1857 and 1879 (LMB) is a prototype that focuses on a single page in a vast archive: a list of attendees at a fancy-dress costume ball, hosted by the Lord Mayor of Sydney in the British Colony of New South Wales in 1857, and published in the Sydney Morning Herald, the colony’s leading newspaper. The tabular dataset is structurally simple, containing the names, titles, and fancy-dress costumes of 994 invited guests, and it is captured in 6,347 RDF triples. A prosopographical analysis of this list provides insight into the vicissitudes of Sydney’s socio-political composition.
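To illustrate how a row of such a table can become RDF triples, the following is a minimal sketch rather than the project's actual conversion. The namespace, the class name Attendee, the property names (hasName, hasTitle, woreCostume), and the sample rows are all invented for illustration; the LMB ontology defines its own terms.

```python
# Hypothetical namespace and terms; the real LMB ontology may differ.
LMB = "http://example.org/lmb/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(row_id, title, name, costume):
    """Turn one row of the attendee table into (subject, predicate, object) triples."""
    person = f"<{LMB}person/{row_id}>"
    triples = [
        (person, f"<{RDF_TYPE}>", f"<{LMB}Attendee>"),
        (person, f"<{LMB}hasName>", f'"{name}"'),
    ]
    # Empty cells produce no triple, so rows can yield different triple counts.
    if title:
        triples.append((person, f"<{LMB}hasTitle>", f'"{title}"'))
    if costume:
        triples.append((person, f"<{LMB}woreCostume>", f'"{costume}"'))
    return triples


def to_ntriples(triples):
    """Serialise triples in an N-Triples-like form (no literal escaping in this sketch)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up rows, not taken from the 1857 list.
rows = [
    (1, "Mr", "A. Example", "Charles II"),
    (2, "", "B. Sample", ""),
]
all_triples = [t for row in rows for t in row_to_triples(*row)]
print(to_ntriples(all_triples))
```

Because empty cells emit nothing, the number of triples per guest varies, which is one reason a triple count need not be a simple multiple of the number of rows.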

The overarching archive and the LMB ontology have been described elsewhere (Nurmikko-Fuller and Pickering, 2021). The ontology is simple but fit for purpose: it serves to demonstrate the suitability of Linked Data (LD) for enriching scholarship into the origins of Australia’s modern politics.

Background

The value of connecting complementary data across disparate datasets has been a feature of the study of Australian history for decades (e.g. Pope and Withers, 1993; Holman et al., 1999; Moses, 2004). Outside of Australia, LD has been applied very successfully to historical investigations (Rantala et al., 2021; Schmidt and Eggert, 2019; Kaplan et al., 2021; Meroño-Peñuela et al., 2015; Dijkshoorn et al., 2014; de Boer, 2015). Although the potential is clear, few projects have successfully applied LD to Australian history. Part of the problem is that SPARQL endpoints can be cryptic, lack helpful error messages or executable suggestions, and require prior knowledge of the syntax (Ngonga et al., 2013). Past attempts to solve this problem (Russell et al., 2008; Lohmann et al., 2016; Haag et al., 2015; Ochieng, 2020; Pradel et al., 2014; Yang et al., 2018) have not explicitly focused on Humanities data.
In recognition of this gap, we developed a bespoke user interface that enables researchers with little or no prior experience of SPARQL to engage with the LMB’s knowledge graph. We have dubbed this the LMB SPARQL Explorer.

Bespoke UI

The SPARQL Explorer (Figure 1) consists of three components: the Suggestion Generator (SG), the Canvas, and the Graph-to-SPARQL compiler. It is a single-page web application built with React, hosted on an Apache web server, and it uses a Blazegraph-mandated API.

Figure 1: The LMB SPARQL Explorer
A deliberate design feature was ontology-agnosticism: the tool should work equally well with any ontology. The SG enforces syntactic integrity by querying the triplestore for any ontologies; when found, they are cached locally and deconstructed into constituent Classes and properties whilst retaining metadata (e.g. comments, notes). Semantically useful queries are generated by detecting the domain and range of each property. The Canvas converts the query into a set of SVG elements and displays it as a graph stored in two arrays, one for the state of nodes and the other for edges. A one-to-one mapping between the graph format and the query language syntax ensures that every valid graph is a valid SPARQL query.
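The role of domain and range in generating suggestions, and the two-array graph state, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Explorer's React implementation: the toy ontology, the lmb: terms, and the function names suggest and add_suggestion are all hypothetical.

```python
# Toy ontology: property -> (domain class, range class). All names are hypothetical.
PROPERTIES = {
    "lmb:hasTitle": ("lmb:Person", "lmb:Title"),
    "lmb:woreCostume": ("lmb:Person", "lmb:Costume"),
    "lmb:hasCategory": ("lmb:Costume", "lmb:Category"),
}


def suggest(selected_class):
    """Return (property, target class) pairs whose domain matches the selected class."""
    return sorted(
        (prop, rng)
        for prop, (dom, rng) in PROPERTIES.items()
        if dom == selected_class
    )


# Graph state kept in two lists, one for nodes and one for edges.
nodes = [{"id": 0, "cls": "lmb:Person"}]
edges = []


def add_suggestion(source_id, prop, target_cls):
    """Apply a suggestion: add the target node and the edge that reaches it."""
    source_cls = nodes[source_id]["cls"]  # ids double as list indices in this sketch
    if (prop, target_cls) not in suggest(source_cls):
        raise ValueError(f"{prop} is not valid from {source_cls}")
    target_id = len(nodes)
    nodes.append({"id": target_id, "cls": target_cls})
    edges.append({"source": source_id, "prop": prop, "target": target_id})
    return target_id


costume_id = add_suggestion(0, "lmb:woreCostume", "lmb:Costume")
add_suggestion(costume_id, "lmb:hasCategory", "lmb:Category")
print(suggest("lmb:Person"))
```

Because only suggestions that respect a property's domain and range can be added, the graph cannot reach a state that pairs a property with a class it does not apply to.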

Example Query 

Figure 2 illustrates the function of the SPARQL Explorer. The graph has been created by the user dragging suggested Classes and properties from the grey panel on the far right into the white space. Behind the scenes, a SPARQL query is generated.

Figure 2: Query and its representation in the LMB SPARQL Explorer
This query (left, Figure 2) produces a result set of 32 individuals who have the title of “Mr” and a costume categorised as “royal”. The set is small enough for domain experts to draw on tacit prior knowledge and infer who among them were royalists, dressed in homage to their monarchical ideals, and how many, in turn, donned royal garb as a form of satire.
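A graph of this shape might be compiled to SPARQL roughly as in the sketch below. It is not the Explorer's compiler and does not reproduce the query in Figure 2: the namespace, the lmb: class and property names, and the way titles and categories are modelled as literals are assumptions made for illustration.

```python
PREFIXES = {"lmb": "http://example.org/lmb/"}  # hypothetical namespace

# Nodes are either class-typed variables ("cls") or fixed values ("literal").
nodes = [
    {"id": 0, "cls": "lmb:Person"},
    {"id": 1, "literal": "Mr"},
    {"id": 2, "cls": "lmb:Costume"},
    {"id": 3, "literal": "royal"},
]
edges = [
    {"source": 0, "prop": "lmb:hasTitle", "target": 1},
    {"source": 0, "prop": "lmb:woreCostume", "target": 2},
    {"source": 2, "prop": "lmb:hasCategory", "target": 3},
]


def term(node):
    """Render a node as a SPARQL variable or as a quoted string literal."""
    if "literal" in node:
        escaped = node["literal"].replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"?n{node['id']}"


def compile_to_sparql(nodes, edges, prefixes):
    """Emit one triple pattern per typed node and one per edge."""
    by_id = {n["id"]: n for n in nodes}
    variables = [term(n) for n in nodes if "literal" not in n]
    lines = [f"PREFIX {p}: <{iri}>" for p, iri in prefixes.items()]
    lines.append("SELECT DISTINCT " + " ".join(variables))
    lines.append("WHERE {")
    for n in nodes:
        if "cls" in n:
            lines.append(f"  {term(n)} a {n['cls']} .")
    for e in edges:
        lines.append(
            f"  {term(by_id[e['source']])} {e['prop']} {term(by_id[e['target']])} ."
        )
    lines.append("}")
    return "\n".join(lines)


query = compile_to_sparql(nodes, edges, PREFIXES)
print(query)
```

Each node and each edge contributes exactly one triple pattern, which is the sense in which a graph of this kind maps directly onto query syntax.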

Preliminary Evaluation 

The SPARQL Explorer underwent preliminary testing with two bespoke ontologies (the JazzCats ontology and the LMB ontology) and two established ones (CIDOC-CRM, an ISO standard, and FOAF). For three of the four, every possible path was representable, giving 100% coverage. The JazzCats ontology had 84% coverage: properties and Classes that had blank nodes as domains and/or ranges were inaccessible.
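The abstract does not define how coverage was calculated. One plausible reading, assumed here purely for illustration, is the share of properties whose domain and range are both named Classes. The sketch below uses a made-up toy ontology with ex: terms, with None standing in for a blank-node domain or range; it does not reproduce the reported figures.

```python
# Toy ontology with invented terms; None stands in for a blank-node domain or range.
properties = [
    {"name": "ex:memberOf", "domain": "ex:Musician", "range": "ex:Band"},
    {"name": "ex:playedAt", "domain": "ex:Band", "range": "ex:Venue"},
    {"name": "ex:recorded", "domain": None, "range": "ex:Track"},
    {"name": "ex:releasedOn", "domain": "ex:Track", "range": None},
]


def is_reachable(prop):
    """A property is reachable only if both ends are named classes."""
    return prop["domain"] is not None and prop["range"] is not None


def coverage(props):
    """Fraction of properties reachable through domain/range suggestions."""
    if not props:
        return 0.0
    return sum(1 for p in props if is_reachable(p)) / len(props)


unreachable = [p["name"] for p in properties if not is_reachable(p)]
print(f"coverage: {coverage(properties):.0%}, unreachable: {unreachable}")
```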

Conclusion

We have reported on a prototype web-based interface that can leverage any ontological structure to deliver syntactically valid and semantically useful queries over RDF without the need to learn SPARQL explicitly. Our preliminary testing has shown that a conceptual mapping between a visual query language and SPARQL is possible. What has been achieved is promising: it points to a way forward for domain experts to seek richer answers by asking more complex questions of their (Linked) data.

Bibliography
de Boer, V. (2015).
Linked Data for Digital History. Semantic Web for Scientific Heritage, Proceedings of the Twelfth Extended Semantic Web Conference, Portoroz, Slovenia, March 2015.

Dijkshoorn, C., Aroyo, L., Schreiber, G., Wielemaker, J., and Jongma, L. (2014).
Using linked data to diversify search results: a case study in cultural heritage, Proceedings of the Nineteenth International Conference on Knowledge Engineering and Knowledge Management, Linkoping, Sweden, November 2014.

Haag, F., Lohmann, S., Siek, S., and Ertl, T. (2015).
QueryVOWL: A visual query notation for linked data, Proceedings of the Twelfth Extended Semantic Web Conference, Portoroz, Slovenia, March 2015.

Holman, C., D'Arcy, J., Bass, J., Rouse, I.L., and Hobbs, M.S.T. (1999). Population-based linkage of health records in Western Australia: development of a health services research linked database,
Australian and New Zealand Journal of Public Health, 23, 5: 453-9.

Kaplan, F., Oliveira, S. A., Clematide, S., Ehrmann, M., and Barman, R. (2021). Combining visual and textual features for semantic segmentation of historical newspapers,
Journal of Data Mining & Digital Humanities, HistoInformatics, 19 January, 2021.

Lohmann, S., Negru, S., Haag, F., and Ertl, T. (2016). Visualizing ontologies with VOWL,
Semantic Web, 7, 4: 399-419.

Meroño-Peñuela, A., Ashkpour, A., Van Erp, M., Mandemakers, K., Breure, L., Scharnhorst, A., Schlobach, S., and Van Harmelen, F. (2015). Semantic technologies for historical research: A survey,
Semantic Web, 6, 6: 539-64.

Moses, A. D. (ed.) (2004).
Genocide and settler society: Frontier violence and stolen indigenous children in Australian history, vol. 6. Berghahn Books, New York.

Ngonga, N., Cyrille, A., Bühmann, L., Unger, C., Lehmann, J. and Gerber, D. (2013).
Sorry, I don't speak SPARQL: translating SPARQL queries into natural language, Proceedings of the 22nd international conference on World Wide Web, Rio de Janeiro, Brazil, May 2013.

Nurmikko-Fuller, T., and Pickering, P. (2021).
Reductio ad absurdum?: From Analogue Hypertext to Digital Humanities, Proceedings of the 32nd ACM Conference on Hypertext and Social Media, Dublin, Ireland, September 2021.

Ochieng, P. (2020). PAROT: Translating natural language to SPARQL,
Expert Systems with Applications: X, 5, Article 100024.

Pope, D. and Withers, G. (1993). Do migrants rob jobs? Lessons of Australian history, 1861–1991,
The Journal of Economic History, 53, 4: 719-742. 

Pradel, C., Haemmerlé, O., and Hernandez, N. (2014).
Swip: A natural language to SPARQL interface implemented with SPARQL, Proceedings of the Fourteenth International Conference on Conceptual Structures, Iasi, Romania, July 2014.

Rantala, H., Ikkala, E., Jokipii, I., and Hyvönen, E. (2021). WarVictimSampo 1914–1922: a National War Memorial on the Semantic Web for Digital Humanities Research and Applications,
Journal on Computing and Cultural Heritage, 15, 1: 1-18.

Russell, A., Smart, P., Braines, D., and Shadbolt, N. (2008). NITELIGHT: A graphical tool for semantic query construction,
Proceedings of the Conference on Human Factors in Computing Systems, Florence, Italy, April 2008.

Schmidt, D., and Eggert, P. (2019). The Charles Harpur Critical Archive,
International Journal of Digital Humanities, 1, 2: 279-288.

Yang, C., Wang, X., Xu, Q., and Li, W. (2018).
SPARQLVis: an interactive visualization tool for knowledge graphs,
Proceedings of the Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, Macau, China, July 2018.

Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO