Identifying intertext susceptible episodes in Middle Dutch Arthurian romance

paper
Authorship
  1. Joris Job Van Zundert

    Dutch Institute for Scientific Information Services

Work text

Intertextuality in its broader sense treats all culture and cultural expression as a form of interacting text, where 'text' is understood as a mental construct or representation rather than a physical object such as the characters printed on a certain medium [Allen, 2000]. Although intellectually appealing, such a definition has little practical use in the literary study of medieval Dutch romance, not least because the cognitive conceptions of intertextuality held by medieval contemporaries can no longer be 'measured' and are very hard, if possible at all, to infer or assess. A more limited definition of intertextuality as concrete references from one text to another may, however, serve as a practical means to further the examination and understanding of the interaction between various forms of literature and literary works in the medieval context. For certain Middle Dutch Arthurian romances (in particular Roman van Walewein, Moriaen and Ridder metter mouwen) such an approach has been used to assess aspects of intertextuality qualitatively [Besamusca, 1993]. It would advance such 'traditional' research considerably if computational means could identify episodes in two or more Middle Dutch texts that show strong semantic or conceptual similarities. Provided that computer-assisted methods for pinpointing possible intertextual references are sufficiently reliable, such instrumentation would greatly reduce the time and resources needed to assess large corpora for intertextual references. Moreover, such a solution would rest on explicit inference rules and semantics, enabling philological researchers to approach the assessment of intertextuality in a more quantitative, modelled way.

The purpose of the proposed paper is to describe a solution, implemented as a proof of concept, for identifying places in medieval Dutch Arthurian romance that are highly likely to contain intertextual references to other medieval Dutch texts. Little research and implementation has been done on software for the computer-assisted analysis of intertextual references in Middle Dutch texts, and the research conducted so far is for the most part theoretical in nature. Greco and Shoemaker argue in favour of an a priori rather sophisticated model of intertextuality [Greco, 1993]. The proposed paper will argue that such a model, though theoretically sound, is very hard to implement given the current state of computer-assisted text analysis techniques. Consequently, a more elementary model of intertextuality will be applied in the software solution, so as to provide a bottom-up approach to the problem of modelling intertextuality within the domain of medieval Dutch text. For this purpose, intertextuality will be regarded in its limited form of intertextual reference on a more literal level.

The software solution that will be demonstrated in the paper identifies episodes in different medieval Dutch romances that show considerable semantic similarity, on the assumption that semantic similarity indicates a high probability of referential intertextuality. It will be argued that a vector space modelling approach [Salton, 1975; Manning, 2000] is best suited for this task, because the textual data in question are extremely sparse from a statistical point of view, which makes a probabilistic approach far less suitable. Before any vector space analysis can be applied, however, two problems arising from the specific nature of the text material have to be solved. Firstly, the orthography of analogous concepts tends to vary considerably within and among Middle Dutch texts. Because a vector space approach relies on identical input forms for analogous concepts, this variance has to be adjusted for. Several solutions to this end have been proposed [Braun, 2002]. Both a solution based on n-grams and stemming and a solution based on the longest common sequence (LCS) will be discussed as approaches to automatically and temporarily rewriting texts into a morphologically congruous form suitable for further vector space analysis. Secondly, to enable the comparison of episodes from different texts, all texts to be analysed have to be broken down into smaller fragments corresponding to these episodes. Such a segmentation must of course be based on well-defined structural or narratological heuristics. It will be argued that it is reasonable to assume that intertextual references do not extend beyond certain narratological or discourse boundaries. Although a medievalist researcher may readily identify such boundaries, the form and structure of medieval Dutch texts hold no, or only very few, computer-interpretable indicators for the automated identification of such 'natural' narratological or discourse boundaries. Text-structuring initials present in the manuscripts may provide one of the very few such indicators. The usefulness of these initials as boundary indicators of narratological episodes will therefore be evaluated as part of the proposed solution.
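The two steps described above, spelling normalisation followed by vector space comparison of episodes, can be illustrated with a minimal Python sketch. It is not the implementation discussed in the paper. It assumes that LCS is read as the longest common subsequence, that spelling variants are merged greedily above a hypothetical similarity threshold of 0.8, and that episodes are compared by cosine similarity over raw term counts without any term weighting. The three short 'episodes' are invented toy lines, not quotations from the romances, and all function names are illustrative.

```python
import math
from collections import Counter


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]: twice the LCS length divided by the summed word lengths."""
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def build_canon(tokens, threshold=0.8):
    """Map every token to the first earlier-seen form it resembles closely enough.

    The threshold is an assumed value chosen for this toy example only.
    """
    canon, forms = {}, []
    for tok in tokens:
        if tok in canon:
            continue
        match = next((f for f in forms if lcs_ratio(tok, f) >= threshold), None)
        if match is None:
            forms.append(tok)  # no close form yet: the token becomes a canonical form
            match = tok
        canon[tok] = match
    return canon


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


# Invented toy episodes with deliberately varying spelling.
episodes = [
    "die coninc sprac ten riddere",
    "die koninc sprac toten ridder",
    "dat scaecbort vloech dore die venstre",
]

# Normalise spelling across all episodes, in order of first appearance.
canon = build_canon(tok for ep in episodes for tok in ep.split())

# One term-count vector per episode, built from the normalised forms.
vecs = [Counter(canon[tok] for tok in ep.split()) for ep in episodes]

print(canon["koninc"], canon["ridder"])
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In this sketch the first two episodes score higher than the first and third only because 'koninc' and 'ridder' are first rewritten to the forms 'coninc' and 'riddere'. Without that step the spelling variants would count as unrelated terms, which is the problem the normalisation is meant to address.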

The latter part of the proposed paper will report on the results of applying the implemented solution to two medieval Dutch Arthurian romances, the Roman van Walewein and Walewein ende Keye. It has been argued that the latter text holds echoes of the story told in the former. The paper will report on the fragments that the application identifies as possible episodes containing intertextual references. The computer-assisted analysis of the two texts will be evaluated by comparison with a manual assessment of intertextual references in Walewein ende Keye. This evaluation will form the basis of a short discussion proposing further refinements of the methodology for identifying episodes containing intertextual references in Middle Dutch texts. One of the points to be argued is the necessity of extending the heuristics of the method with a form of conceptual modelling, since intertextual references between two texts will always be only partly similar on a more literal level.
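One way to express such a comparison with a manual assessment is sketched below in Python. The abstract does not state which measures are used, so precision and recall are an assumption here, and the episode identifiers are hypothetical placeholders rather than results.

```python
def precision_recall(flagged: set, manual: set) -> tuple:
    """Compare automatically flagged episode pairs with manually assessed ones."""
    hits = flagged & manual
    precision = len(hits) / len(flagged) if flagged else 0.0
    recall = len(hits) / len(manual) if manual else 0.0
    return precision, recall


# Hypothetical identifiers: (episode in Roman van Walewein, episode in Walewein ende Keye).
flagged = {("W3", "K1"), ("W7", "K2"), ("W9", "K5")}
manual = {("W3", "K1"), ("W9", "K5"), ("W12", "K6"), ("W15", "K8")}

p, r = precision_recall(flagged, manual)
print(p, r)
```

Low recall in such a comparison would point to references that are not similar on a literal level, the case for which the conceptual modelling mentioned above is proposed.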

References

1. G. Allen, Intertextuality (The New Critical Idiom). London, 2000.
2. B. Besamusca, Walewein, Moriaen en de Ridder metter mouwen. Intertekstualiteit in drie Middelnederlandse Arturromans. Hilversum, 1993.
3. L. Braun, F. Wiesman, I.G. Sprinkhuizen-Kuyper: "Information retrieval from historical corpora." In: R. de Busser, D. Hiemstra, W. Kraaij, M.F. Moens (eds.), Proceedings of the 3rd Dutch-Belgian Information Retrieval Workshop (DIR). Leuven, 2002, pp. 106-112.
4. G.L. Greco, P. Shoemaker: "Intertextuality and large corpora: a medievalist approach." In: Computers and the Humanities 27 (1993), pp. 349-355.
5. D.F. Johnson, G.H.M. Claassens (eds., transl.), Dutch Romances I: Roman van Walewein. Cambridge, 2000.
6. C.D. Manning, H. Schütze, Foundations of statistical natural language processing. London (UK), 2000.
7. G. Salton, A. Wong, C. Yang: "A vector space model for automatic indexing." In: Communications of the ACM, 18 (1975), pp. 613-620.


Conference Info

Complete

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2004

Hosted at Göteborg University (Gothenburg)

Gothenburg, Sweden

June 11, 2004 - June 16, 2004

105 works by 152 authors indexed

Series: ACH/ICCH (24), ALLC/EADH (31), ACH/ALLC (16)

Organizers: ACH, ALLC

Tags
  • Keywords: None
  • Language: English
  • Topics: None