International Institute for Digital Humanities
The Theravada Buddhist texts consist of a huge number of scriptures and commentaries. Most of them have been handed down in a Middle Indo-Aryan language called Pali or Magadhi. So far, they have occupied the most important position in the study of Buddhist literature, since they can be read more rigorously than Chinese translations, Sanskrit manuscripts and so on.
Around the turn of the millennium, the Vipassana Research Institute (India), the Dhammakaya Foundation (Thailand) and Mahidol University (Thailand) released large volumes of digitized Theravada Buddhist texts written in Pali. Now that it is possible to read them on personal computers and to perform full-text searches, most Buddhologists use the digitized texts for their research instead of printed books and concordances.
However, I wonder whether this can be called a “digital shift”. We are only reading the digital texts on personal computers, without digitalizing our research at all. Our research method is still the same as before. In order to renew our philological study of the Theravada Buddhist texts and to achieve a digital shift in the true sense, I feel certain that it is essential to give the digital texts an explicit structure. This is why I started to mark up the structure of the scriptures and their commentaries in accordance with the Text Encoding Initiative P5 Guidelines.
In this poster presentation, I will point out some problems that I encountered when trying to apply the Text Encoding Initiative P5 Guidelines to the complicated annotation system of the Theravada Buddhist texts. To give one example concisely, the <gloss> element may not be suitable for encoding the annotation system. This is because the commentaries of the Theravada Buddhist texts sometimes cite sentences from other sources or include popular verses. In these cases, it is necessary to use the elements that indicate a quotation, a verse, and a group of verses. However, according to the Text Encoding Initiative P5 Guidelines, <gloss> cannot contain these elements.
In the first place, a “gloss” (Gk. glossa, Lat. glossa) is a simple note or comment added to a piece of writing, in the line spacing or margins, to explain a difficult word or phrase, according to the definition in A Dictionary of the English Bible and Its Origins (Gilmore, 2000). While a gloss is dependent on the main text, the commentaries of the Theravada Buddhist texts are independent of the main texts, i.e. the scriptures. In order to overcome these difficulties and encode the annotation system accurately, I decided to adopt a different element with a @type attribute as an alternative to <gloss>. In this presentation, I will show other encoding difficulties of this kind and how to solve them in accordance with the Text Encoding Initiative P5 Guidelines.
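The kind of nesting that a commentary requires can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses <note> with a type attribute as the container, and the element choice, the attribute value "commentary", the function name, and all placeholder text are hypothetical and do not reproduce the actual encoding. The check confirms well-formedness and lists the nested elements; it is not a validation against the TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: the container element, its @type value and the
# placeholder text are assumptions for illustration only.
SAMPLE = f"""
<div xmlns="{TEI_NS}" type="commentary">
  <note type="commentary">
    <p>Placeholder explanation of a scripture passage.</p>
    <quote>Placeholder sentence cited from another source.</quote>
    <lg>
      <l>Placeholder first line of a popular verse</l>
      <l>Placeholder second line of a popular verse</l>
    </lg>
  </note>
</div>
"""


def nested_element_names(xml_text, container):
    """Return the local names of all elements nested in the first <container>."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    node = root.find(f".//{{{TEI_NS}}}{container}")
    if node is None:
        return []
    # iter() yields the container itself first, so skip it.
    return [el.tag.split("}", 1)[1] for el in list(node.iter())[1:]]


print(nested_element_names(SAMPLE, "note"))
```

Listing the nested elements in this way makes it easy to see which commentary passages contain quotations or verse groups, which is the situation that motivates choosing a container with a broader content model.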
Bibliography
Gilmore, Alec. (2000). A Dictionary of the English Bible and Its Origins. Sheffield: Sheffield Academic Press.
TEI Consortium. (2021). TEI P5: Guidelines for Electronic Text Encoding and Interchange 4.3.0, https://tei-c.org/release/doc/tei-p5-doc/en/Guidelines.pdf (accessed 31 March 2022).
Tokyo, Japan
July 25, 2022 - July 29, 2022
361 works by 945 authors indexed
Held in Tokyo and remote (hybrid) on account of COVID-19
Conference website: https://dh2022.adho.org/
Contributors: Scott B. Weingart, James Cummings
Series: ADHO (16)
Organizers: ADHO