Improving Named-Entity Recognition on Inscriptions on "ukiyo-e" prints: Towards a ‘Distant Viewing’ in Art History

paper, specified "short paper"
Authorship
  1. Ewa Machotka

    Stockholm University, Sweden

  2. Marita Chatzipanagiotou

    Athens University of Economics and Business, Greece. Note: All authors have contributed equally.

  3. John Pavlopoulos

    Stockholm University, Sweden; Athens University of Economics and Business, Greece. Note: All authors have contributed equally.

Work text


Japanese early modern woodblock prints, known as ukiyo-e or ‘pictures of the floating world’ and produced between the seventeenth and mid-nineteenth centuries, are among the most widely recognizable visual images today. Landscape prints remain the most popular among them, as evidenced by the iconic “The Great Wave” designed by Katsushika Hokusai (1760-1849) and its global career (Guth 2016). However, the understanding of these images is still shaped by modern Western epistemologies that may not be well suited to the analysis of pre-modern non-Western artefacts (Machotka 2020). The dominant modern Western concept of landscape holds that landscape images function as representations of places (Andrews 1999), even though art is never a mirror of reality. This may not apply to Japanese early modern prints, which are built on poetic traditions and may have functions other than representation (Chino 2003; Machotka 2012; Shirane 2013). Therefore, to understand the relationships between images and places, Japanese early modern prints need to be looked at afresh, and artificial intelligence has the potential to support this goal.
The existing discourse on Japanese landscape prints has mainly taken the form of case studies of, for example, selected themes or artists (Clark 2001; Forrer et al. 2011; Kobayashi 2020). This approach does not allow broad exploration of the geographical distribution of the sites depicted in the prints, or of how the frequency of those sites changes with production context (e.g. time, location, designer). We therefore argue that combining ‘close reading’ of the artefacts through formal and contextual analysis with so-called ‘distant viewing’, the macroanalysis of visual materials (Taylor and Tilton 2019) based on the idea of ‘distant reading’ that Franco Moretti (2000) proposed for literary studies (Gold and Klein 2016), has the potential to yield a more nuanced understanding of Japanese ukiyo-e prints.
In this work we propose that distant viewing can be facilitated by Natural Language Processing technologies such as Named Entity Recognition (NER). NER can be used to extract named locations from any text, including titles and other printed inscriptions on prints. The extracted locations can then support a digital geospatial macroanalysis of the studied prints, which is currently impossible because the artefacts form an exceptionally large and highly divergent corpus. However, although NER has the potential to improve the study of prints, current state-of-the-art NER tools are not successful at identifying artwork titles (Jain and Krestel 2019), mainly because training data are scarce. Even recent cross-domain datasets focus only on domains such as politics and natural sciences (Liu et al. 2021), leaving art history aside. This problem is especially acute for non-Western pre-modern artefacts such as Japanese prints, whose inscriptions are rendered in pre-modern scripts used before the standardization of the language in the late nineteenth and twentieth centuries (Frellesvig 2012). In premodern Japanese, Sino-Japanese characters could be used interchangeably for their phonetic value, so the same word could be written with different characters (Yada 2012). A further problem is the ambiguity inherent in artwork inscriptions, or the lack of data: print inscriptions are not always standardized, and the metadata in different collections record different information. These issues make the proposed analysis challenging.
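The step from extracted locations to macroanalysis can be illustrated with a minimal sketch. It assumes that an NER model has already produced a list of place names for each print. The records, the field names (print_id, designer, places) and the helper functions below are hypothetical and invented for illustration; they are not the project's data or code.

```python
from collections import Counter, defaultdict

# Toy records standing in for NER output on print inscriptions.
# Every value here is invented for illustration only.
records = [
    {"print_id": "p1", "designer": "Designer A", "places": ["Edo", "Fuji"]},
    {"print_id": "p2", "designer": "Designer A", "places": ["Fuji"]},
    {"print_id": "p3", "designer": "Designer B", "places": ["Kyoto", "Fuji"]},
    {"print_id": "p4", "designer": "Designer B", "places": []},  # nothing recognized
]


def place_frequencies(records):
    """Count in how many prints each place name occurs."""
    counts = Counter()
    for rec in records:
        # set() ensures a place is counted once per print, even if the
        # inscription mentions it several times.
        counts.update(set(rec["places"]))
    return counts


def places_by_designer(records):
    """Break the same counts down by designer (one production-context variable)."""
    by_designer = defaultdict(Counter)
    for rec in records:
        by_designer[rec["designer"]].update(set(rec["places"]))
    return by_designer


overall = place_frequencies(records)
per_designer = places_by_designer(records)

print(overall.most_common())
for designer, counts in sorted(per_designer.items()):
    print(designer, dict(counts))
```

A real analysis would also need to map variant spellings of the same place to one canonical entry and attach coordinates before anything is plotted; both steps are outside this sketch.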
Lee et al. (2018) were the first to show that transfer learning can lead to state-of-the-art results in NER for English patient note de-identification, by transferring what was learned on a large labeled dataset to a much smaller one. Following their work, we transferred a generic pre-trained Japanese Convolutional Neural Network NER model (Honnibal and Montani 2017) to the domain of art history, using a very limited training set of 100 labeled examples. In a prior study (Chatzipanagiotou et al. 2021), evaluating on a further 100 unseen labeled examples, we showed that transfer learning can assist NER in the Japanese language and in the field of art history, for the task of place name recognition in inscriptions of landscape prints. We registered an improvement of 28 percentage points in Precision, from 62% to 90%, and more than doubled F1, from 15% to 36%. We argue that the improved NER already allows distant viewing of the data, and we show that there is room for further improvement. Access to the data was facilitated by the database hosted at the Art Research Centre at Ritsumeikan University, Kyoto, one of the leading Digital Humanities hubs in Japan and a collaborative partner of this project (http://www.arc.ritsumei.ac.jp/en/index.html).
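The reported Precision and F1 values can be related through the standard definition F1 = 2PR / (P + R). The sketch below solves that formula for recall R. The recall figures it prints are derived here from the rounded percentages above, assuming the standard F1 definition was used; they are not numbers reported by the authors and should be read as approximate.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for these precision/F1 values")
    return f1 * precision / denominator


# Rounded values as stated in the text (before and after transfer learning).
baseline_recall = recall_from(precision=0.62, f1=0.15)
tuned_recall = recall_from(precision=0.90, f1=0.36)

print(f"implied recall before transfer: {baseline_recall:.3f}")
print(f"implied recall after transfer:  {tuned_recall:.3f}")
```

Under these assumptions the implied recall stays well below the precision in both settings, which is consistent with the statement that there is room for further improvement.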

Bibliography
Andrews, M. (1999). Landscape and Western Art. Oxford University Press.
Chatzipanagiotou, M. and Machotka, E. and Pavlopoulos, J. (2021). Automated Recognition of Geographical Named Entities in Titles of Ukiyo-e prints. Digital Humanities Workshop (DHW 2021). Association for Computing Machinery, New York, USA, pp. 70–77.
Chino, K. (2003). The Emergence and Development of Famous Place Painting as a Genre. Review of Japanese Culture and Society, 15.
Clark, T. (2001). 100 Views of Mount Fuji. Trumbull, CT: Weatherhill.
Forrer, M. and Suzuki, J. and Smith, H. (2011). Hiroshige: Prints and Drawings. Munich: Prestel.
Frellesvig, B. (2012). A History of the Japanese Language. Cambridge University Press.
Gold, M. and Klein, L. (2019). Debates in the Digital Humanities. University of Minnesota Press.
Guth, C. (2016). Hokusai's Great Wave: Biography of a Global Icon. University of Hawai’i Press.
Honnibal, M. and Montani, I. (2017). spaCy 2: Natural Language Understanding with Bloom Embeddings, Convolutional Neural Networks and Incremental Parsing. To appear.
Jain, N. and Krestel, R. (2019). Who is Mona L.? Identifying Mentions of Artworks in Historical Archives. International Conference on Theory and Practice of Digital Libraries. Springer, pp. 115–122.
Kobayashi, F. (2020). 文政期前後の風景画入狂歌本の出版とその改題・再印:―浮世絵風景画流行の前史として― [The publication of kyōka books with landscape illustrations around the Bunsei era and their retitling and reprinting: a prehistory of the vogue for ukiyo-e landscape prints]. 浮世絵芸術 [Ukiyo-e Art], 179: 5-19.
Lee, J. and Dernoncourt, F. and Szolovits, P. (2018). Transfer Learning for Named-Entity Recognition with Neural Networks. The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pp. 4470-4473.
Liu, Z. and Yu, T. and Dai, W. and Ji, Z. and Cahyawijaya, S. and Madotto, A. and Fung, P. (2021). CrossNER: Evaluating Cross-Domain Named Entity Recognition. The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), pp. 13452-13460.
Machotka, E. (2020). 美術史を超えて──ヴァナキュラー・マッピングとしての日本近世風景版画 [Beyond art history: Japanese early modern landscape prints as vernacular mapping]. 造形のポエティカ 日本美術史を巡る新たな地平 [Poetics of Form: New Horizons in Japanese Art History], ed. Hiromi N. et al. Tokyo: Seikansha.
Moretti, F. (2000). The Slaughterhouse of Literature. Modern Language Quarterly, 61:1.
Shirane, H. (2013). Japan and the Culture of the Four Seasons: Nature, Literature, and the Arts. Columbia University Press.
Taylor, A. and Tilton, L. (2019). Distant Viewing: Analyzing Large Visual Corpora. Digital Scholarship in the Humanities, 34:1: i3–i16.
Yada, T. [矢田勉] (2012). 国語文字・表記史の研究 [Studies in the history of Japanese script and orthography]. Tokyo: Kyūko Shoin.


Conference Info

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO