Bringing to life the Living Archive of Aboriginal Languages

paper, specified "short paper"
  1. Cathy Bow

    Charles Darwin University




Charles Darwin University, Australia


Paul Arthur, University of Western Sydney

Locked Bag 1797
Penrith NSW 2751
Paul Arthur




Short Paper


sustainability and preservation
resource creation
and discovery
and Open Access

Thousands of resources in dozens of Australian Indigenous languages are becoming available online through the Living Archive of Aboriginal Languages. Many of these resources were developed in bilingual school programs in remote schools across the Northern Territory over several decades, and more materials continue to be added from other communities and languages of this region. This archive provides access to a wealth of written and illustrated texts in endangered languages, many of which were developed from stories told by community elders detailing ancestral knowledge and life, as well as everyday life and contemporary stories, primers, translations from English, instructional materials, memoirs and other genres. These texts are useful as an academic resource for various domains, such as linguistics, anthropology, education, history, natural sciences, Indigenous knowledges, environmental studies, and visual arts. They are also (re)finding their place in the cultural lives of Aboriginal people living on traditional lands, now that there are only a handful of officially sanctioned bilingual programs. The coming to life of this archive opens up significant questions in the field of digital humanities, around postcolonial digital archiving, the curation and uses of cultural heritage materials, the documentation of endangered languages, and studies of the appropriation by Indigenous peoples of digitising technologies for their own ancestral knowledge and culture work.
The creation of the Living Archive has involved the digitisation of thousands of small books, the identification and collation of metadata, the development of a simple interface for search and browse, and efforts to locate the original storytellers and illustrators to seek permission to make these materials public on an openly accessible website (Bow et al., 2014). While the archive serves as an infrastructure for research and other academic work, one challenge was to design a web interface that would also meet the needs of Indigenous community members. The use of a map interface to allow navigation by place or language, and the presentation of books by cover image, both allow the site to be navigated with minimal or no textual input. Standard search and browse functionality, including full-text search, allows more conventional text-based interaction with the materials. User testing included Indigenous users who provided a useful perspective on the website as well as the content and potential uses of the archive. The engagement of the traditional owners of the materials and content has been a key component of developing a generative, agentive archive, rather than simply a storage place for materials.

Screenshot from the Living Archive of Aboriginal Languages website.
The process of bringing the archive to life has involved carefully articulating and enhancing how its liveliness brings communities together with their schools, old people with young, and students and researchers from around the world with knowledge authorities in the communities of origin. It has been important to maintain careful consideration of traditional approaches to ownership of knowledges in terms of negotiating copyright and intellectual property, while balancing ethical issues around privacy and open access. Most of the books were produced for use in schools and contain no secret or sacred knowledge. Copyright for the objects belongs to the organisations that produced them, rather than individuals—for example, copyright for materials produced by schools with bilingual programs belongs to the NT government, which has licensed the digitisation and online display of their materials. Beyond this, however, the project team chose to seek the approval of the original creators of the materials. Identifying and locating these people has been a particular challenge for the project, but was considered important to recognise their moral rights and also involve them (or the families of those who had passed away) in the renewed digital life of their stories and illustrations.
To maximise the discoverability of the materials and access to them, additional points of entry to the archive have also been developed. In response to a request for offline access, a mobile app (LAAL Reader) now allows users to download materials in bulk (for example, from a specific community or language) for storage on local devices to enable their use offline. Schools in southeast Arnhem Land have created libraries on school iPads for classroom use of the materials in local languages using this app. Efforts to ensure interoperability and reusability have involved use of software-independent formats and open-source tools, and long-term security is being negotiated with other archives in case of major disaster. The metadata is available for harvesting through the OAI-PMH protocol and accessible through the National Library of Australia’s Trove database; related XML protocols make the materials discoverable through the Open Language Archives Community (OLAC) and the Australian National Data Service (ANDS).
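As an illustration of what harvesting over OAI-PMH can look like from a client's side, the following is a minimal sketch rather than a description of the archive's actual service. The base URL, set name, record identifier, and sample response are all hypothetical; only the `ListRecords` verb, the `metadataPrefix` and `set` parameters, and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a (hypothetical) OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, titles, and language codes from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [e.text for e in record.iterfind(".//dc:title", NS)]
        languages = [e.text for e in record.iterfind(".//dc:language", NS)]
        records.append(
            {"identifier": identifier, "titles": titles, "languages": languages}
        )
    return records


# Invented sample response, shaped like an oai_dc ListRecords reply.
# "und" is the ISO 639 code for an undetermined language, used as a placeholder.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:book-0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example story title</dc:title>
          <dc:language>und</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(build_list_records_url("https://archive.example.org/oai", set_spec="lang"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also need to follow the `resumptionToken` that OAI-PMH uses for paging through large result sets, which this sketch omits.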
One of the challenges of making the data accessible and interoperable has been the use of ISO 639-3 standards for identification and naming of languages, which do not conform to local classification and nomenclature. A dual-level representation of language names was formulated to balance conformity to an international standard and accurate representation of community preferences for language names.
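One way to picture a dual-level representation is a record that carries both an ISO 639-3 code and a community-preferred name, so that several local names can sit under one international identifier. The sketch below is an assumption about how such a structure might look, not the archive's actual data model. The class and function names are hypothetical, and the codes `qaa` and `qab` are taken from the range ISO 639-3 reserves for local use, so they stand in for real language codes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageLabel:
    iso_code: str        # level 1: ISO 639-3 identifier, used for interoperability
    community_name: str  # level 2: name preferred by the language community


def group_by_iso(labels):
    """Map each ISO 639-3 code to the community names recorded under it."""
    grouped = defaultdict(list)
    for label in labels:
        grouped[label.iso_code].append(label.community_name)
    return dict(grouped)


def display_name(label):
    """Show the community name first, with the ISO code as a secondary key."""
    return f"{label.community_name} (ISO 639-3: {label.iso_code})"


if __name__ == "__main__":
    # Placeholder names and codes, not real languages.
    labels = [
        LanguageLabel("qaa", "Community variety A"),
        LanguageLabel("qaa", "Community variety B"),
        LanguageLabel("qab", "Community variety C"),
    ]
    print(group_by_iso(labels))
    print(display_name(labels[0]))
```

Keeping the community name as the displayed label and the ISO code as the exchange key lets an external harvester match on the standard while the interface shows the name the community prefers.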

Networking with users and potential users is helping the project team to identify various research questions and uses of the archive in different contexts. For example, working with ANU’s Centre of Excellence for the Dynamics of Language to include materials from the Living Archive in a larger corpus of Australian and Papuan languages will facilitate new avenues of inquiry in data mining and corpus analysis. The archive is included in various academic programs, such as training for Aboriginal Education Support workers through the Batchelor Institute of Indigenous Tertiary Education, and the Bachelor of Indigenous Languages and Linguistics degree at the Australian Centre for Indigenous Knowledges and Education. Proposals for research projects are included on the project’s accompanying web site, including evaluation of the site in terms of its usability for educational purposes, the use of resources on the archive for developing language learning programs, and collaborative activities with language owners on the use and revitalisation of stories in communities where language is no longer embedded in the curriculum. The project also maintains a social media profile, in an effort to engage a range of different users in schools, remote communities, and academic contexts.
The extension of the archive includes the development of an API to give appropriate authorities a ready way of curating, extending, and customising their collections. This will allow for enhancements under the authority of original story owners, such as correcting or extending metadata, associating audio and video materials, and creating appropriate categories and themes for the existing materials. Workshops in communities are training and equipping people to engage with the archive, promoting intergenerational knowledge transmission and collaboration within schools and community groups. Such engagement and its outcomes can inform further research activities in the wider domain of digital humanities.
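The idea of enhancements made under the authority of story owners can be sketched as a simple permission check placed in front of any metadata change. This is purely illustrative and does not describe the project's API: the `ArchiveItem` class, its fields, the function names, and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveItem:
    item_id: str
    metadata: dict
    authorities: set  # identifiers of people recognised as authorities for this story
    linked_media: list = field(default_factory=list)


def _require_authority(item, submitted_by):
    """Reject any change that does not come from a recognised authority."""
    if submitted_by not in item.authorities:
        raise PermissionError(
            f"{submitted_by} is not a recognised authority for {item.item_id}"
        )


def apply_correction(item, submitted_by, changes):
    """Correct or extend metadata fields on behalf of an authorised person."""
    _require_authority(item, submitted_by)
    item.metadata.update(changes)
    return item


def attach_media(item, submitted_by, media_ref):
    """Associate an audio or video reference with an existing item."""
    _require_authority(item, submitted_by)
    item.linked_media.append(media_ref)
    return item


if __name__ == "__main__":
    item = ArchiveItem("book-0001", {"title": "Example title"}, {"owner-1"})
    apply_correction(item, "owner-1", {"illustrator": "Example Illustrator"})
    attach_media(item, "owner-1", "audio/example-reading.mp3")
    print(item)
```

In practice, deciding who counts as an authority for a given story is a matter for negotiation with communities, not something that code can settle; the sketch only shows where such a decision would be enforced.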
The documentation of endangered languages has much to offer the field of digital humanities (Drude et al., 2012). The Living Archive project provides authentic data from minority languages in danger of extinction, on which research techniques and tools developed for major world languages can be tested and extended. The development of computational and statistical methods of analysis, annotation, markup, glossing, and the like can support and enhance the work of language documentation and conservation—for example, by automating some of the manual processes of extracting metadata and improving optical character recognition of minority languages. Data mining, textual analysis, and topic modelling can enhance the content of the archive and provide data for novel research questions for many different disciplines. Besides supporting academic research, such activities should also involve members of the communities where these languages are still important—in collaborative work, skills development, and furthering opportunities to develop digital means of documenting and preserving their linguistic and cultural heritage.
The affordances of such a body of literature are yet to be fully explored, and situating the archive within the context of digital humanities exposes it to new research contexts, which should be negotiated under the authority of the appropriate story owners. Curation and storage of cultural material is a contested space, with a number of constraints on access and ownership. This project is negotiating these cultural, technical, and epistemological challenges in its attempt to create a Living Archive rather than simply a storage facility or resting place for these valuable materials.


Bow, C., Christie, M. and Devlin, B. (2014). Developing a Living Archive of Aboriginal Languages. Language Documentation & Conservation.

Drude, S., Trilsbeek, P. and Broeder, D. (2012). Language Documentation and Digital Humanities: The (DoBeS) Language Archive. Presented at Digital Humanities 2012, Hamburg.


Conference Info


ADHO - 2015
"Global Digital Humanities"

Hosted at Western Sydney University

Sydney, Australia

June 29, 2015 - July 3, 2015

280 works by 609 authors indexed

Series: ADHO (10)

Organizers: ADHO