Buddhist Murals of Kucha on the Northern Silk Road. An Approach to Semi-Automated Annotation

poster / demo / art installation
  1. 1. Erik Radisch

    Sächsische Akademie der Wissenschaften zu Leipzig (Saxon Academy of Sciences and Humanities in Leipzig)

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

The Buddhist cave complexes in the region of Kucha, located on the northern Silk Road (in the Xinjiang Uyghur Autonomous Region, People’s Republic of China) house impressive wall paintings dating approximately from the 5th to 10th centuries. The first evidence of a past Buddhist culture was discovered at the beginning of the 20th century, after which several countries sent expeditions to the area to investigate the religion that had once been dominant in the region. It was a sensation when various Buddhist cave complexes were discovered, the largest of which included as many as 400 caves. At that time, the first photographs of the actual state of the caves were taken and pieces of the paintings were extracted from the caves and transferred to the respective national museums. Sales and losses due to war led to the fact that nowadays fragments of the murals are spread all over the world, making it very difficult to assign it to the individual caves of origin (Further information: Yaldiz 1987; Popova 2008; Dreyer 2015).
The project presented here has taken on the task of documenting and describing the murals in situ and the individual pieces available worldwide and, with the aid of historical photographs, of virtually reinserting them into their original context.
https://kucha.saw-leipzig.de; https://www.saw-leipzig.de/de/projekte/wissenschaftliche-bearbeitung-der-buddhistischen-hoehlenmalereien-in-der-kucha-region-der-noerdlichen-seidenstrasse/introduction/kucha-murals

The project makes use of modern possibilities of the Digital Humanities in that not only an extensive textual description of individual scenes is carried out, but also the pictorial contents of the repetitive representations are recorded and enriched with digital methods. For this purpose,the digital image annotation tool Annotorious
The Project has its own series «Leipzig Kucha Studies». The fist Book is: Konczak-Nagel/Zin 2020
is used to directly annotate the content with a taxonomy comprising about 1,000 entries.

While annotating objects in the image makes it possible to ensure scientific traceability of identified objects, it is also a very extensive and time-consuming task. Many objects have to be annotated repeatedly because they appear in many images in different contexts. Also, there are sometimes several images of an object from different perspectives available or there are images from the time of the expeditions where detached parts can still be seen in their original context. So there is the necessity to annotate sometimes very similar or same objects several times. However, transferring annotations is difficult. Even if photographs of the same objects are available, different viewpoints and different lenses may cause the photographs to be distorted. It is hardly possible to perform this task automatically using conventional computer vision methods.
For this reason, the project is currently attempting to train region based convolutional neural networks (RCNNs)The project uses for this purpose detectron2

using the annotations already made, in order to be able to perform at least parts of the annotation semi-automatically in the future.

So far, RCNNs have been used in the Digital Humanities mainly to identify, locate and order objects in images (see for example: Howanitz et al. 2019; Arnold/Tilton 2019; Duhaime 2019; Helm et al. 2021). Their use for semi-automated annotation has not been implemented, at least to the knowledge of the author of this poster proposal. It may also be a risky endeavour, since the edges of such detected regions often fray and not the whole object is detected. However, it may be worth a try, if only to test the limits of the application of such systems in the Digital Humanities.
The conditions in this project are very good. Nearly 8,000 polygons already exist, which have been used in a total of nearly 10,000 annotations (a polygon can be linked to several elements of the taxonomy). Some objects have been annotated over 500 times. However, there are also some problems to be considered. For example, there are two fundamentally different types of imagery. On the one hand there are photographs (historical and modern) and on the other hand there are drawings of some paintings. Since these categories of images are very different, they were also separated for the training. It can be assumed that the recognition on photographs should be clearly more difficult, since here paintings are often in very bad condition and are to be identified only with difficulty even by an experienced eye.
A first attempt has been encouraging, because not only some objects were found but for example some of the nimbi around the head were drawn very well. (See figure 1) The poster will present the results (failure or success) of the experiments that will be performed in the next months. It should invite to inform about the project, to share experiences about the use and training of neural networks and to encourage their use.

Example photograph for automated annotation. First number: taxonomy-key, second number: confidence. By far not all possible annotations were detected Nimbi where detected good, rhombi need still further improvement.



. (2019). Distant viewing: analyzing large visual corpora, Digital Scholarship in the Humanities 36(1), DOI:

Dreyer, C. (2015).
Abenteuer Seidenstraße: Die Berliner Turfan-Expeditionen 1902–1914. Leipzig: Seemann.

Duhaime, D. (2019). PixPlot.

Schmideler, S.,
m, C., Mandl, T., Kollmann,
S. and
(2021). Wie sich die Bilder ähneln. Vom Zufallsfund zur systematischen Forschung im Bereich der automatisierten Bildähnlichkeitssuche. In: Burghardt, M., Dieckmann, L., Steyer, T., Trilcke, P., Walkowski, N.-O., Weis, J. and Wuttke, U. (2021). Fabrikation von Erkenntnis: Experimente in den Digital Humanities. DOI: 10.26298/melusina.8f8w-y749-wsdb.

Bermeitinger, B., Radisch, E., Gassner, S., Rehbein, M.
Handschuh, S. (2019, July 11). Deep Watching - Towards New Methods of Analyzing Visual Media in Cultural Studies. Digital Humanities 2019 (DH2019), Utrecht, Netherlands. https://dev.clariah.nl/files/dh2019/boa/0335.html.

Konczak-Nagel, I.
Zin, M. (2020). Essays and Studies in the Art of Kucha (Leipzig Kucha Studies 1). New Delhi: Dev Publishers.

Popova, I. F. (ed.) (2008),
Russian Expeditions to Central Asia at the Turn of the 20th Century: Collected Articles. St Petersburg: Slavia Publishers.

Wu, Y., Kirillov,
Massa, F.,
W. and
Girshick, R. (2019). Detectron2,

M. (1987).
Archäologie und Kunstgeschichte Chinesisch-Zentralasiens (Xinjiang). Leiden: Brill, Handbuch der Orientalistik, Abteilung 7, Kunst und Archäologie, Band 3, Innerasien.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO