This project proposes new approaches to cultural heritage by developing new methods of working with digital texts and by defining appropriate research questions. Our goal is to find ways of turning literature, especially prose fiction, into a site of dynamic research in the humanities and social sciences, rather than merely a passive digital repository.
Our point of departure is the view of cultural heritage as largely intended, or willed, to convey a specific collective memory and identity. This perspective in turn strongly affects the construction of individual identity. From this the project draws two main conclusions: 1) In order to fully understand our cultural heritage, it is essential to analyse it against the self-understanding of the cultures that produced it, evading or bypassing the structures of literary canon formation. 2) Focussing on the issue of identity is an efficient way of developing methodology and performing analysis.
The project is designed to:
1) benefit from a corpus of specially prepared material, in which questions of canon formation can be explored through marginalized and forgotten literary works;
2) develop new methods of working with the specific forms of cultural heritage embodied in electronic text databases;
3) develop new perspectives and methods through interdisciplinary exchange and cooperation on these text databases.
Although the primary material of the pilot project is Swedish, all parts of the project are designed to be generalizable, scalable and relevant to other literary traditions. The main material of the investigations consists of three corpora: the literary works of August Strindberg (based on recently finalized scholarly editions), the literary works of Selma Lagerlöf (all first editions, established and proofread in collaboration with the scholarly edition), and all original Swedish prose fiction first published in the years 1800, 1820, 1840, 1860, 1880 and 1900.
These three corpora offer an apposite opportunity to compare and collate results. Strindberg and Lagerlöf are both canonized and roughly contemporary, but they are entirely different authors: one male and intensely occupied with the societal issues of his day, the other female and developing her own kind of “saga” style, interested in social issues but in a more indirect way. While Strindberg and Lagerlöf are among Sweden’s most renowned and internationally famous authors, the Swedish Prose Fiction database has been constructed in order to evade canonical selection. Comprising all publications that match the criteria, it offers ways into both mainstream works and those that have been entirely marginalized.
As this project arises from the view of culture as an issue of identity, and of cultural heritage as the performative expression of collective memory and identity, the research questions focus on identity, both collective and individual. Since fiction’s main means of portraying problems and ideas is the individual character, the studies start from the individual in order to reach conclusions about collective identity as well. The research questions address identity in relation to ethnicity, society, gender and consumption patterns.
The project thus explores and develops different materials, techniques, methods and forms of cooperation, which are to result in new combinations of quantitative and qualitative analysis. In particular, we aim to refine methods of “distant reading”, as once proposed by Franco Moretti (Moretti, 2005, 2006), into new approaches that focus on content and context (cf. Jockers, 2013). We use the new tool for sub-corpus topic modeling (STM) designed by Peter Leonard (Leonard and Tangherlini, 2013), which makes it possible to extract topics from a particular work and run them against larger bodies of material. We also plan to enhance topic modeling further by adding Named Entity Recognition (NER) and sentiment analysis (cf. Liu, 2010; Maas et al., 2011) to existing systems. NER has been refined, adapted and extended in connection with this project (Kokkinakis and Oelke, 2012; Oelke et al., 2012; Kokkinakis and Malm, 2011, 2013; cf. Yang et al., 2011).
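The workflow of deriving a vocabulary from one work and applying it to a larger collection can be illustrated with a minimal Python sketch. This is not the STM tool, which relies on probabilistic topic modeling. As a much simpler stand-in, the sketch takes the words most over-represented in a sub-corpus as a single "topic" and measures how large a share of each text in a larger collection consists of those words. The function names (`tokenize`, `distinctive_terms`, `term_share`) and the toy English sentences are hypothetical and serve only as illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def distinctive_terms(sub_corpus, reference, top_n=3):
    """Return the top_n words most over-represented in sub_corpus.

    Assumption: a smoothed frequency ratio stands in for a trained topic.
    """
    sub = Counter(t for doc in sub_corpus for t in tokenize(doc))
    ref = Counter(t for doc in reference for t in tokenize(doc))
    vocab_size = len(set(sub) | set(ref))
    sub_total = sum(sub.values()) + vocab_size
    ref_total = sum(ref.values()) + vocab_size

    def ratio(word):
        # Add-one smoothing avoids division by zero for words
        # that never occur in the reference texts.
        return ((sub[word] + 1) / sub_total) / ((ref[word] + 1) / ref_total)

    # Highest ratio first; ties are broken alphabetically.
    return sorted(sub, key=lambda w: (-ratio(w), w))[:top_n]


def term_share(text, terms):
    """Return the fraction of tokens in text that belong to terms."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    wanted = set(terms)
    return sum(t in wanted for t in tokens) / len(tokens)


# Toy data (hypothetical): a "particular work" and some reference texts.
sub_corpus = ["the sea and the storm", "storm over the sea"]
reference = [
    "the city and the market",
    "the sea trade in the city",
    "a letter from the city",
]
# Texts from a larger collection to be scored against the extracted terms.
larger = {"A": "a storm at sea", "B": "the market in the city"}

terms = distinctive_terms(sub_corpus, reference)
shares = {name: term_share(text, terms) for name, text in larger.items()}
```

In this toy setting, text A receives a higher share than text B, so a reader would be directed to A first. A probabilistic topic model would replace the single word list with several weighted topics, but the two-step structure of extracting from a sub-corpus and then applying to a larger collection stays the same.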
At the poster presentation, we will demonstrate materials and techniques on laptops.
