Extracting Relationships from an Online Digital Archive about Post-War Queensland Architecture

paper, specified "long paper"
Authorship
  1. Jane Hunter

    University of Queensland

  2. John Macarthur

    University of Queensland

  3. Deborah Van der Plaat

    University of Queensland

  4. Janina Gosseye

    University of Queensland

  5. Andrae Muys

    University of Queensland

  6. Craig Macnamara

    University of Queensland

  7. Gavin Bannerman

    State Library of Queensland

Work text
The “Architectural Practice in Post-War Queensland: Building and Interpreting an Oral History Archive” project is a collaboration between the University of Queensland, the State Library of Queensland (SLQ) and four of the longest-standing architectural firms in Queensland. The project’s aim is to build a comprehensive online multimedia digital archive that documents architectural practice in post-war Queensland (1945-1975), a period that was highly significant in Queensland’s architectural history but that remains largely undocumented. The goal is to use innovative Semantic Web technologies to link tacit knowledge extracted from individual oral histories to tangible knowledge (drawings, books, photographs, manuscripts) held in personal archives, firm archives, and State and institutional archives and libraries.

The approach first involved conducting and recording a series of oral history interviews and public forums with the key architects from this period. These events comprise both private interviews (one-on-one conversations between the project team and the architect or architects) and a number of larger public forums held at the SLQ, each focusing on a specific theme (education, style, climate, regionalism, etc.).

The oral history interviews and the public forums are filmed, captured as digital files (.wav and .avi) and transcribed. Both manual tagging and text processing tools are applied to the transcripts to semantically tag key entities (architects, firms, structures, places, dates) mentioned in the interviews and to extract new knowledge in the form of RDF graphs. The resulting RDF graphs document relationships between architects, firms and buildings (with attribution to the source) and can be displayed, edited, saved and re-used via the LORE compound object authoring software (Figure 4).
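As a minimal sketch of what "relationships with attribution to the source" could look like in practice, the following Python serialises a few statements as N-Quads, using the fourth element (the graph label) to record the interview each statement came from. The namespace, the entity names and the relationship names are all hypothetical placeholders; the project's actual URIs, vocabulary and attribution mechanism are not given in this abstract.

```python
from dataclasses import dataclass

# Hypothetical namespace; the project's real URIs are not stated in the text.
BASE = "http://example.org/qldarch/"


@dataclass(frozen=True)
class Statement:
    subject: str    # local name of the entity the statement is about
    predicate: str  # local name of the relationship type
    obj: str        # local name of the related entity
    source: str     # local name of the interview that supports the statement


def uri(local_name: str) -> str:
    """Wrap a local name in the placeholder namespace as an IRI reference."""
    return f"<{BASE}{local_name}>"


def to_nquads(statements):
    """Serialise statements as N-Quads lines; the graph label carries the source."""
    return [
        f"{uri(s.subject)} {uri(s.predicate)} {uri(s.obj)} {uri(s.source)} ."
        for s in statements
    ]


# Placeholder data, not taken from the archive.
statements = [
    Statement("architect_A", "workedFor", "firm_B", "interview_01"),
    Statement("firm_B", "designed", "building_C", "interview_01"),
]

nquads = to_nquads(statements)
print("\n".join(nquads))
```

Using the graph position for the source is only one possible design; reification or a dedicated provenance vocabulary would serve the same purpose.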

This paper describes our approach to establishing the online archive and evolving knowledge base that together have been designed to be used for research, teaching and practice within the disciplines of history, architecture and design.

An overview of the system architecture is shown in Figure 1. The system uses the Omeka content management system to support the upload and description of content (oral history files, transcripts, photos, drawings, articles, etc.) by the project collaborators. In addition, the system provides the following components and functionality:

An OWL Architecture ontology was defined that specifies the core classes, class hierarchy and properties associated with each class (Figure 2);
D2RQ is used to convert Omeka metadata to RDF and save it to a Sesame RDF triple store with a SPARQL query interface;
User-authenticated annotation tools enable users to semantically tag transcripts by identifying people, places, buildings, firms, and events mentioned in the interviews;
The EYE N3 Semantic Reasoner (N3) is applied to the Sesame RDF triple store to reconcile common entities (via URLs) and to infer relationships between key entities (architects, firms, structures/buildings and articles/publications);
A Search and Browse engine (based on Solr) enables users to search for specific entities or perform full-text searching across all transcripts and articles (and jump to the audio/video segments that contain the matching search term);
Word clouds and word frequency histograms (generated from the oral history transcripts using D3) enable architectural historians to understand the main themes and influences on key architects from this period;
Mapping and timeline interfaces enable users to interactively browse and retrieve information (interviews, photos, drawings) about buildings, people or events via maps and timelines;
The LORE tool enables the visualization, editing, sharing and re-use of RDF graphs that document relationships between architects, firms, buildings, and related documents (Figure 4).
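Two of the components listed above, full-text search that jumps to the matching audio/video segment and word-frequency summaries of the transcripts, can be illustrated with a small sketch. It uses only the Python standard library in place of Solr and D3, and it assumes that each transcript is stored as segments with a start time in seconds; the segment data, the stopword list and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical transcript segments: (start time in seconds, text).
SEGMENTS = [
    (0, "We talked about climate and the verandah."),
    (95, "The climate shaped every house we designed."),
    (210, "Education at the university was very practical."),
]

# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "we", "at", "was", "about", "every", "very"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def find_segments(term, segments):
    """Return the start times of segments that contain the term as a whole word."""
    term = term.lower()
    return [start for start, text in segments if term in tokenize(text)]


def word_frequencies(segments, stopwords=STOPWORDS):
    """Count non-stopword tokens across all segments."""
    counts = Counter()
    for _, text in segments:
        counts.update(word for word in tokenize(text) if word not in stopwords)
    return counts


hits = find_segments("climate", SEGMENTS)
frequencies = word_frequencies(SEGMENTS)
print(hits)
print(frequencies.most_common(3))
```

The start times returned by `find_segments` are what a player would seek to, and the counts from `word_frequencies` are the kind of data a word cloud or histogram would be drawn from.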
At the time of writing this abstract, the archive/database contains 64 interviews with an average length of 83 minutes. It also contains 64 transcripts, 725 photos, 612 articles, 305 line drawings and detailed information about 464 architects, 119 firms and 357 buildings/structures. The archive grows continuously as more interviews and associated content are uploaded and annotated. The architectural historians involved in the project, together with their students, review the transcripts and use the integrated annotation tools to identify and tag the names of people/architects, firms/organizations, buildings/structures and places. As new people, structures, firms and places are tagged, they are added to the ontology. Authenticated users can also annotate relationships between people, between people and firms, and between people and buildings by drawing on a controlled vocabulary of relationship types. The reasoning engine then reasons across these relationships to infer new implicit relationships that can be recorded, searched and visualized through the LORE RDF graph visualization tool.
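To make the idea of inferring implicit relationships concrete, here is a minimal forward-chaining sketch in Python. It is not the project's EYE/N3 rule set, which is not described in this abstract. The single rule and the relationship names (`workedFor`, `designed`, `associatedWith`) are hypothetical, chosen only to show how annotated relationships might be combined into a new one.

```python
def apply_rule(triples):
    """Hypothetical rule: if a person worked for a firm and that firm designed
    a building, derive that the person is associated with the building."""
    derived = set()
    for person, rel1, firm in triples:
        if rel1 != "workedFor":
            continue
        for firm2, rel2, building in triples:
            if rel2 == "designed" and firm2 == firm:
                derived.add((person, "associatedWith", building))
    return derived


def infer(triples):
    """Apply the rule repeatedly until no new triples appear, then return
    only the triples that were not in the input."""
    known = set(triples)
    while True:
        new = apply_rule(known) - known
        if not new:
            break
        known |= new
    return known - set(triples)


# Placeholder facts, not taken from the archive.
facts = {
    ("architect_A", "workedFor", "firm_B"),
    ("firm_B", "designed", "building_C"),
    ("architect_D", "workedFor", "firm_E"),
}

inferred = infer(facts)
print(inferred)
```

With only one non-recursive rule the loop stops after a single pass; it is written as a fixed-point loop so that further rules could be added without changing `infer`.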

Architects who studied and worked in Queensland during the post-war period are also invited to register, log in and submit their own details, including a chronology of practice, and to provide feedback on the existing content. An additional blog monitored by the project team encourages the broader community (those outside the profession) to comment on aspects of post-war architecture (e.g., nominate their favourite building) and to upload related materials such as photographs or plans.

Future work includes a detailed user evaluation of the system with a set of test users comprising architectural historians from academia, government and industry as well as users from the local architectural community, followed by refining and extending the system based on their feedback.

Finally, our paper will also describe the challenges that this multi-disciplinary project faces, including attracting and retaining an active community of contributors, ensuring the archive’s sustainability, resolving the identity of entities, and implementing quality control over the community-generated content.

Biography:
Professor Jane Hunter is the Director of the eResearch Lab at the University of Queensland, where she leads a team of post-docs, PhD students and software engineers working on innovative e-research services for a wide range of applications and communities. She has published over 100 peer-reviewed papers on semantic web, digital libraries and e-research and is currently the Deputy Chair of the Australasian Association for Digital Humanities and Chair of the Academy of Sciences Committee for Data in Science. She is a CI on the Mellon-funded Open Annotation Collaboration (OAC) project, the NeCTAR-funded HuNI and Aust-ESE projects and the ARC Linkage Project “Architectural Practice in Post-War Queensland: Building and Interpreting an Oral History Archive”.

Fig. 1: Technical Components underlying the Post-War Queensland Architecture Knowledge Base

Fig. 2: Overview of the Ontology underlying the Post-War Qld Architecture Knowledge-base

Fig. 3: Screen Shot of the Web Portal: Digital Archive of Queensland Architecture

Fig. 4: LORE Visualization and Editing Interface to a Relationship Graph about Karl Langer

References
Digital Archive of Queensland Architecture Web Portal (2014). qldarch.net

Omeka (2014). omeka.org

Gerber, A. and Hunter, J. (2010). Authoring, editing and visualizing compound objects for literary scholarship. Journal of Digital Information, 11(1).

Bizer, C. and Cyganiak, R. (2014). D2RQ: Accessing Relational Databases as Virtual RDF Graphs. d2rq.org

Verborgh, R. (2011). Semantic Reasoning with EYE. n3.restdesc.org

Bostock, M. (2013). D3: Data-Driven Documents. d3js.org

Conference Info

ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO