Revitalizing a South Asian language with Unicode: The case of Sunuwar in Nepal

poster / demo / art installation
  1. Deborah (Debbie) Anderson

    University of California Berkeley

  2. Dev Kumar Sunuwar

    Indigenous Media Foundation, Nepal

Work text

The Sunuwar, or Koĩts‐Lo, language is spoken by 37,900 people in Nepal (2011 census; Eberhard et al., 2021) and is also spoken in Sikkim, India. The EGIDS scale classifies Sunuwar as a “threatened” language, indicating that the language is losing speakers (SIL International, n.d.).
This poster describes ongoing work to revitalize the Sunuwar language using Unicode, as a possible model for other indigenous languages. It will include examples of the script, with sound bites of the language, and discuss the steps in getting a script into Unicode.
Although Sunuwar is primarily an oral language, a script called “kõits brese” was devised for the language in the 1940s. This script was promoted in the 1960s and 1970s (Sunuwar, 2021), but these efforts were hindered by Nepal’s “one language” policy: ek bhasha, ek bhesh, ek dharma, ek desh (“one language, one way of dress, one religion, one nation”). This policy implied the use of the Devanagari script, which is used to write the official Nepali language. The policy lasted until 1990 (Weinberg, 2013).

In the 2000s, the Government of Nepal developed various plans that promoted mother tongue multilingual education. Currently, 24 school curricula in different languages have been created, but these materials, including those for Sunuwar, are written in the Devanagari script (Sunuwar, 2021). This situation is due in part to the history of language policies in Nepal, which did not encourage the use of scripts other than Devanagari. Another key factor is that the Sunuwar script, like several other scripts of smaller language communities in Nepal, is not in the Unicode Standard, which makes creating and exchanging text in the script electronically very difficult.
Digitization Process and Unicode
The Sunuwar Welfare Society, which was established in 1988, has long envisioned having the language move from an oral language to one whose written version can be sent and received on various digital platforms. Digitization will help to preserve the language and its script and strengthen the distinct Sunuwar identity.

In 2020, the Translation Commons project, a partner with UNESCO’s International Year of Indigenous Languages and International Decade of Indigenous Languages, selected Sunuwar as a pilot project to demonstrate the scalability of encoding scripts for indigenous languages and has facilitated the Sunuwar encoding project.
A key first step towards digitization was to get the script into the Unicode Standard. Fortunately, a Unicode proposal had already been written in 2011 by Anshuman Pandey (Pandey, 2011). The proposal was reviewed in 2020 by a core team of linguists, educators, activists, journalists, and language practitioners who are members of the Sunuwar Welfare Society (Sunuwar, 2021). Regular meetings among Dev Kumar Sunuwar (Indigenous Media Foundation), Anshuman Pandey and Deborah Anderson (Script Encoding Initiative), Craig Cornelius (Google), and Jeannette Stewart (Translation Commons) ironed out questions about the Unicode proposal, a font, and a keyboard. At the January 2022 meeting of the Unicode Technical Committee (UTC), the proposal was approved for inclusion in a future version of the Unicode Standard.
For a script to be approved for inclusion in Unicode, evidence showing usage of the script is needed.

Being primarily an oral language used widely in the home, Sunuwar does not have an extensive written tradition, nor has its script been widely used in Nepal. However, in Sikkim, India, the Sunuwar language (called “Mukhia”) was officially recognized in 1996, and the Sikkim government published schoolbooks, newspapers, and other materials in the script (Sunuwar, 2021). The evidence from Sikkim is useful for demonstrating usage, but evidence from Nepal would be needed to confirm consistent usage across Nepal and India.
Recent promising signs of Sunuwar script usage include:

Creation of a keyboard and PUA‐based font, which can be used to develop more written materials.
Appearance in 2021 of a monthly magazine, Hamso, in Sunuwar and Devanagari scripts (in the Sunuwar language) (Sunuwar, 2022).

Creation of YouTube instructional videos and in‐person classes on the script, and an alphabet book.
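
Why a PUA‐based font is an interim step can be illustrated with a minimal sketch. Unicode reserves Private Use Area (PUA) code points for private agreements and assigns them no standard meaning, so text typed with a PUA‐based font displays correctly only where that same font is installed. The Python snippet below is an illustration only and is not part of the project's tooling. It flags such characters with the standard unicodedata module. The function name and the sample code points U+E000 and U+E001 are hypothetical and do not reflect the actual font's assignments.

```python
import unicodedata


def private_use_chars(text):
    """Return the characters in `text` whose Unicode general category is
    'Co' (private use), i.e. characters with no standardized meaning."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


# Hypothetical sample: Latin letters followed by two PUA code points
# standing in for letters of a script that is not yet encoded.
sample = "Koits \ue000\ue001"
print([f"U+{ord(ch):04X}" for ch in private_use_chars(sample)])
# prints ['U+E000', 'U+E001']
```

Once a script has code points of its own in the Unicode Standard, its characters no longer fall into this category, and text can be exchanged without both parties sharing a particular font.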

This project could serve as a model for other communities in Nepal (and beyond), which can learn from the Sunuwar experience of getting a script into Unicode and of revitalization efforts, including any difficulties encountered. More generally, what lessons from the Sunuwars’ experience of transitioning from an oral culture to a written one might apply to other communities? How can other primarily oral communities build up their written culture, and thus make their scripts eligible for inclusion in the Unicode Standard?

The result of the work to get the script into Unicode and widely adopted remains to be seen. The predominance of Nepali in Nepal, and of English for business and international communication, has tended to overshadow lesser‐used languages. For example, English is used today as the medium of instruction in some private schools in Nepal and may be adopted in public schools as well (Phyak, 2021).
This work was supported by the National Endowment for the Humanities [grant PR-268710-20].


Eberhard, David M., Gary F. Simons and Charles D. Fennig (eds.). (2021). Ethnologue: Languages of the World. Twenty‐fourth edition. Dallas, Texas: SIL International, online edition.

Pandey, Anshuman. (2011). “Proposal to Encode the Jenticha Script.”‐n4028‐jenticha.pdf (accessed 6 April 2022). (Note: The proposal used the script name “Jenticha,” but this has subsequently been changed to “Sunuwar.”)

Phyak, Prem. (2021). “Language education policy in Nepal and the denial of the right to speak in Indigenous Languages.” Melbourne Asia Review, Edition 7, 2021. (accessed 6 April 2022).

SIL International. (n.d.). Language Status. (accessed 6 April 2022).

Sunuwar, Dev Kumar. (2021). “Digitizing the script of Koits Sunuwar Indigenous Peoples.”‐the‐script‐of‐koits‐sunuwar‐indigenous‐peoples (accessed 6 April 2022).

Sunuwar, Dev Kumar. (2022). Hamso magazine. (accessed 6 April 2022).

Weinberg, Miranda. (2013). “Revisiting History in Language Policy: The Case of Medium of Instruction in Nepal.” Working Papers in Educational Linguistics 28 (1): 61-80. (accessed 6 April 2022).


Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO