Bridging the Divide: Supporting Minority and Historic Scripts in Fonts: Problems and Recommendations

Deborah Anderson

Authorship

1. Deborah Anderson

University of California, Berkeley

Work text

Introduction
Today, users of many modern minority and historic scripts in Unicode are not able to reliably send text electronically, because Unicode-enabled fonts and software are not available.

This is especially true for scripts added in Unicode versions 6.0 to 9.0 (2010–2016), over 40% of which have no fonts. (Unicode version 10.0 was released in June 2017, so font support for it would not yet be expected.) The Google Noto project aims to provide fonts for all approved scripts, but its released fonts so far cover only scripts up to Unicode version 6.2, released in 2012.

In addition, some communities have access to Unicode fonts but do not use them, because the fonts lack features deemed necessary, such as positioning of characters (e.g., Egyptian Hieroglyphs [Richmond and Glass, 2016]) or variant glyphs (e.g., Old Italic [Anderson, 2017]). Instead, users resort to images, which are not searchable, or to “hacked” fonts, which require every correspondent to have the same non-standard font in order to exchange text. Keyboards or other input mechanisms are also unavailable for many of these same scripts. As a result, the promise that Unicode will “enable people around the world to use computers in any language” (Unicode Consortium, 2018a) does not yet ring true for some communities.

This short paper will highlight font-related problems with specific examples and will provide suggestions on how to address them.

Problems

Creating a Unicode-enabled font for a language is often not a simple task, especially when the script includes combining marks (which require correct positioning) or has special rendering behavior, such as the consonant clusters found in South Asian scripts (Evans, 2017).
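
To make the combining-mark problem concrete, the following minimal Python sketch lists the characters in a string that the Unicode Character Database classifies as marks, that is, the characters a font must position relative to a base character. The helper name combining_marks and the Devanagari sample string are illustrative assumptions and do not come from the paper; only Python's standard unicodedata module is used.

```python
import unicodedata


def combining_marks(text):
    """Return (character, name, canonical combining class) for every
    character in text whose general category is a mark (Mn, Mc or Me)."""
    return [
        (ch, unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
        if unicodedata.category(ch).startswith("M")
    ]


# Illustrative sample (an assumption, not from the paper):
# DEVANAGARI LETTER KA + SIGN VIRAMA + LETTER SSA + VOWEL SIGN I.
# The virama joins the two consonants into a cluster, and the vowel sign
# is a spacing mark; a font has to shape and place both correctly.
sample = "\u0915\u094D\u0937\u093F"

for ch, name, ccc in combining_marks(sample):
    print(f"U+{ord(ch):04X} {name} (combining class {ccc})")
```

A check like this only identifies which characters need attention; the positioning itself is still the job of the font's layout tables and the rendering software.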

Font creation is made more challenging when typographic details about the script (and language) are not available. Because many recently approved scripts in Unicode are not well known, information on their typography is not readily available, and fine details are often not included in the Unicode proposals for the scripts.

Interaction with the user community is critical in developing a suitable font, but some communities are difficult to contact. In addition, there can be differing views on the preferred shapes of glyphs. For a set of 51 Tamil numbers and fractions, for example, the community took 8 years to come to agreement on the preferred representative shapes. Specific cases will be cited, based on the author’s experience, including discussion of how to connect user communities with font providers.

Technical Issue: Glyph Variants

For some script users, access to glyph variants is important. This is true, for example, of the Old Italic Unicode block, which unified several related alphabets of Italy dating from approximately the 8th to the 1st century BCE. In Old Italic, the glyph used in a particular alphabet may differ from the one shown in the Unicode Standard.

The Old Italic block was encoded with the understanding that different fonts would be used for the different languages and alphabets (Unicode Consortium, 2017). How, then, should the two forms of Faliscan (above) be handled in the same font? How should a pan-Old Italic font handle the different alphabets (which use the same code points)?

This paper will describe the pros and cons of the different options available, including the use of:

Code points in Unicode’s Private Use Area (with the caveat that these code points would not be reliable for general interchange) (Unicode Consortium, 2018c).
A Unicode variation sequence, when a distinction needs to be captured in plain-text (Unicode Consortium, 2018d).
An OpenType font feature, such as character variants, stylistic alternates, stylistic sets, or localized forms (Microsoft Typography, 2018).
Language-specific fonts (i.e., Faliscan1 and Faliscan2 fonts for the two forms above).
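
The first two options above leave a visible trace in the text itself, whereas OpenType features and language-specific fonts do not. As a rough illustration, the Python sketch below labels each code point in a string as private use, variation selector, Old Italic, or other. The function names and the sample string are assumptions made for this example. In particular, the pairing of an Old Italic letter with VARIATION SELECTOR-1 is hypothetical and is not claimed to be a registered variation sequence.

```python
import unicodedata

# Old Italic block: U+10300..U+1032F
OLD_ITALIC = range(0x10300, 0x10330)


def is_private_use(ch):
    """True for code points in the Private Use Areas (general category Co)."""
    return unicodedata.category(ch) == "Co"


def is_variation_selector(ch):
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def classify(ch):
    """Return a coarse label for one character."""
    if is_private_use(ch):
        return "private use"
    if is_variation_selector(ch):
        return "variation selector"
    if ord(ch) in OLD_ITALIC:
        return "Old Italic"
    return "other"


# Hypothetical sample: OLD ITALIC LETTER A followed by VS1, then a
# Private Use Area code point standing in for a locally defined variant.
sample = "\U00010300\uFE00\uE000"

for ch in sample:
    print(f"U+{ord(ch):04X} {classify(ch)}")
```

Text that relies on private use code points is only meaningful to readers who share the same private agreement and font, which is the interchange caveat noted in the first option.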

ADHO / EHD - 2018

"Puentes/Bridges"

Hosted at El Colegio de México, Universidad Nacional Autónoma de México (UNAM) (National Autonomous University of Mexico)

Mexico City, Mexico

June 26, 2018 - June 29, 2018

340 works by 859 authors indexed

Conference website: https://dh2018.adho.org/

Series: ADHO (13), EHD (4)

Organizers: ADHO
