Academy of Ethiopian Languages and Cultures - Addis Ababa University, Institute for Comparative Literature and Society - Columbia University
Considering the stunning advancements in communication technology, some may find it hard to imagine that at the outset of the digital age 50% (Harrison, 2007) to 90% (Kraus, 1992) of languages are at risk of extinction this century. This dire forecast relates to a complex amalgamation of factors including the history of colonization (Mufwene, 2002; Osborn, Anderson, & Kodama, 2008), oppression of minority and Indigenous communities, migration, innovations in global transport, international media consolidation, and urbanization (Simons & Lewis, 2013). Digital technologies also appear to be contributing to a pattern of mass language extinction. This may be particularly surprising considering the rhetoric around digital technologies’ potential to connect the world and democratize access to information and communication. When the diversity of human languages and their written scripts is considered, the challenge of achieving these aspirations without undermining language and script diversity comes into focus. While there exists a potential for digital technologies to support linguistic diversity in all its richness, this trajectory is by no means fated. This paper addresses key factors and efforts required to achieve such a future as well as the roadblocks that may prevent many languages from surviving the digital age.
“Digital extinction” will be the fate, according to Rehm (2014), of languages that suffer from insufficient technology support. This is a three-pronged process. First, if digital technologies make the use of a language impossible or inconvenient, a language community will suffer a loss of “function” as other languages take over tasks such as email, texting, search, e-commerce, etc. Second is a simultaneous loss of “prestige” associated with the absence of a language within the high-prestige technological realm. Digital technologies were first developed in English-speaking contexts and spread internationally during a period of increasing English dominance after the Cold War; thus, English has gained prestige in the digital sphere while the prestige of many other languages has declined. The third prong of digital extinction is the loss of “competence” which occurs as it becomes increasingly difficult to raise a “digital native” competent in the language. With digital technologies playing an increasing role in human communication, the lack of digital use of a language is likely to become an increasingly central factor in language extinction more generally. Considering that Kornai (2013) predicts that less than 5% of languages will achieve full vitality in the digital realm, the scope of this concern becomes apparent.
Ultimately, what does it matter if global communication is achieved at the expense of language diversity? Language is interconnected with identity, culture, and an intergenerational sense of belonging (Harrison, 2007). Therefore, the users of under-resourced languages clearly stand to lose the most from digital extinction. However, the loss of language diversity impacts us all. Language is tied up with human knowledge, and when we lose one we lose the other (Harrison 2007; Evans 2010). Harrison (2007) argues that “the extinction of ideas we now face has no parallel in human history” (p. viii), while Hale states that losing a language is like “dropping a bomb on a museum, the Louvre” (cited by Harrison 2007, p. 7). While the digital age promises us global access to a grand storehouse of knowledge, we may lose more than we gain if digital extinction goes unchecked.
Technological support for full digital viability of a language consists of a variety of factors. Foundational are digital standards, or the “protocols” that determine which languages’ written scripts are supported by digital devices, software, and platforms. As such, digital standards play a key role in determining which languages are included or excluded from digital communication (DeNardis, 2014). While more and more languages are supported by foundational standards, language “inclusiveness” has primarily taken place through the process of technology companies targeting new, profitable markets of language users. Left behind are language communities too small or too poor to be considered viable target markets.
The central research questions are: How is language diversity threatened or bolstered by digital communication technologies? What can be done to ensure languages survive the digital age? I identify the primary hurdles, in both the technological design and the governance of digital technologies, that disadvantage minority and Indigenous languages. Last, I identify best practices for overcoming these barriers, including policy recommendations for digital governance institutions to meet the needs of digitally-disadvantaged language communities.
The impact of digital technologies on language diversity has been radically understudied because of the complex and cross-disciplinary nature of the research area. While linguists and anthropologists study language shifts, they typically lack the technical expertise to understand how digital design and governance impact language choices in the digital sphere. Similarly, tech designers and computer scientists frequently lack awareness of the implications of their work on language diversity. This interdisciplinary research responds to the conference’s call to build complex models of complex realities, analyze them with computational methods, and communicate the results to a broader public.
I utilize an instrumental case study method, suited to exploratory and cross-disciplinary research. I focus on the case of the digitization and standardization of the Ethiopic script culminating in its inclusion in the dominant character encoding standard Unicode as well as ISO/IEC 10646, which is kept in sync with Unicode. I also consider other forms of digital support developed for the Ethiopic script and its languages, including Ethiopia’s national language Amharic. The case of Ethiopic is uniquely informative in that it was the first indigenous African script to be included in Unicode and ISO/IEC 10646. This case also demonstrates many common challenges that affect other digitally-disadvantaged languages and scripts, such as Mongolian and N’Ko (Rosenberg, 2011).
Research methods include in-depth qualitative interviews with key actors, including Ethiopic digital pioneers who built early word-processing programs, keyboards, and other basic digital supports for the script and its languages. I also interviewed members of the Unicode Technical Committee and the ISO’s Subcommittee 2/Working Group 2 on character encoding, observed their ongoing work to support digitally-disadvantaged scripts and languages, and did research in their archives about inclusion of the Ethiopic script in their standards.
Furthermore, in order to determine to what extent foundational supports for Ethiopic have allowed for use of the script and its languages in the digital sphere, I conducted a non-traditional content analysis of comments on popular Ethiopian-themed Facebook pages. While rates of utilization of the Ethiopic script have seen increases over the last decade, there are still more Amharic comments written in the Latin script than those written in Ethiopic. This indicates ongoing barriers in support for the script, since transliteration of Amharic into Latin is uncommon outside of the digital sphere.
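The script-level step of such a content analysis can be illustrated with a minimal Python sketch. It is not the coding procedure used in this study, which is not specified here; it assumes that a comment's script can be approximated from the Unicode character names of its letters, and the function names `script_of` and `classify_comment` are hypothetical. It identifies only the script, not the language: a Latin-script comment may be transliterated Amharic, English, or something else, so language identification would still require human coding.

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Return 'ethiopic', 'latin', or 'other' for a single character.

    Assumption: the Unicode character name prefix is a good enough
    proxy for script. Digits and punctuation count as 'other'.
    """
    if not ch.isalpha():
        return "other"
    name = unicodedata.name(ch, "")
    if name.startswith("ETHIOPIC"):
        return "ethiopic"
    if name.startswith("LATIN"):
        return "latin"
    return "other"


def classify_comment(text: str) -> str:
    """Label a comment 'ethiopic', 'latin', 'mixed', or 'none'."""
    counts = Counter(script_of(ch) for ch in text)
    ethiopic, latin = counts["ethiopic"], counts["latin"]
    if ethiopic == 0 and latin == 0:
        return "none"  # no letters from either script
    if ethiopic > 0 and latin > 0:
        return "mixed"
    return "ethiopic" if ethiopic else "latin"


# Hypothetical sample comments, invented for illustration only.
samples = ["ሰላም", "selam", "ሰላም selam", "123 !!"]
tally = Counter(classify_comment(s) for s in samples)
```

Tallying these labels over a corpus of comments would give the share written in each script, which could then be compared across time periods.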
Last, I chronicle ongoing work, in which I am a participant-observer, to develop layout and formatting rules for Amharic in partnership with the digital governance institution World Wide Web Consortium (W3C). These rules will allow software and platforms to accurately support the unique characteristics of the Ethiopic script and Amharic language. Challenges include collecting Ethiopic publishing expertise from stakeholders unfamiliar with the W3C, some of whom are not online. The “instrumental” nature of the Ethiopic case means that throughout I situate this history in the context of larger trends affecting digitally-disadvantaged scripts and languages as a group.
Despite concerning trends in terms of digital support for language diversity, the digital age is still young. The digital technologies we use today were shaped by forces in the recent past, but these are all subject to change. People write code. And as Russell (2014), DeNardis (2009), and Osborn (2010) assert, we have an opportunity and a responsibility to shape our technologies to support the future we wish to inhabit. If we wish to preserve and revitalize the diversity of human language, and the wealth of knowledge it contains, digital technologies can help us do that (Yacob 2014; H. B. Russell 2010). If we, and particularly those who design and govern digital technologies, prefer to “let the market decide,” digital technologies will contribute heavily to global linguistic homogenization and the mass extinction of minority and Indigenous languages.
Recommendations include a focus on support for language diversity as the “corporate social responsibility” of global tech giants, an issue of accessibility and equity. This can be promoted by advocates reaching out to companies, voicing needs and connecting them with language expertise as necessary. Digital governance organizations should lower barriers to entry for digitally-disadvantaged language communities to participate and voice their concerns. This may include working with trusted third-party intermediaries, such as the Script Encoding Initiative, which bridges script communities’ needs with the technical requirements of Unicode proposals. Governments can support academia to build digital tools for non-market languages, as well as purchasing digital tools that support national and local languages, creating market incentives to develop them. Collaborations between linguists and technologists are also essential. This presentation will show how we can shape the future of language diversity by closing the linguistic digital divide through advocacy, digital design, and governance of the digital sphere.
DeNardis, L. (2009). Protocol Politics: The Globalization of Internet Governance. Cambridge, Mass: The MIT Press.
DeNardis, L. (2014). The Global War for Internet Governance. New Haven: Yale University Press.
Evans, N. (2010). Dying Words: Endangered Languages and What They Have to Tell Us. Wiley-Blackwell.
Harrison, K. D. (2007). When Languages Die: The Extinction of the World’s Languages and the Erosion of Human Knowledge. Oxford University Press.
Kornai, A. (2013). Digital Language Death. PLoS ONE, 8(10), e77056.
Kraus, M. (1992). The World’s Languages in Crisis. Presented at the Endangered Languages Symposium at the 1991 annual meeting of the Linguistic Society of America.
Mufwene, S. S. (2002). Colonisation, globalisation, and the future of languages in the twenty-first century. International Journal of Multicultural Studies, 4(2), pp. 162–193.
Osborn, D. Z. (2010). African Languages in a Digital Age: Challenges and Opportunities for Indigenous Language Computing. IDRC.
Osborn, D. Z., Anderson, D., & Kodama, S. (2008). Support for Modern African Languages and Scripts in Unicode/ISO 10646: Where are We Today? Presented at the 32nd Internationalization and Unicode Conference, San Jose, California.
Rehm, G. (2014). Digital Language Extinction as a Challenge for the Multilingual Web. In Multilingual Web Workshop 2014: New Horizons for the Multilingual Web. Madrid, Spain: META-NET.
Rosenberg, T. (2011, December 9). Everyone Speaks Text Message. The New York Times.
Russell, A. L. (2014). Open Standards and the Digital Age: History, Ideology, and Networks. New York, NY: Cambridge University Press.
Russell, H. B. (2011). Preserving Language Diversity: Computers can be a tool for making the survival of languages possible. http://www.culturalsurvival.org/.
Simons, G. F., & Lewis, M. P. (2013). The World’s Languages in Crisis: A 20-Year Update. In E. Mihas, B. Perley, G. Rei-Doval, & K. Wheatley (Eds.), Responses to Language Endangerment: In Honor of Mickey Noonan. New Directions in Language Documentation and Language Revitalization (Vol. 142, pp. 3–20). Amsterdam: John Benjamins Publishing.
Yacob, D. (2006). Unicode for Under Resourced Languages. Presented at the Language Resources and Evaluation Conference LREC, Genoa, Italy.
Yacob, D. (2014). In-depth interview, interviewed by Isabelle Zaugg.
Hosted at Utrecht University
July 9, 2019 - July 12, 2019
Conference website: http://staticweb.hum.uu.nl/dh2019/dh2019.adho.org/index.html
Series: ADHO (14)