Remediating 20th-Century Magazines of the Arts: Approaches, Methods, Possibilities

Natalia Ermolaev; Clifford E. Wulfman; Hanno Biber; Thomas Crombez

Authorship

1. Natalia Ermolaev

Princeton University
2. Clifford E. Wulfman

Princeton University
3. Hanno Biber

OEAW Österreichische Akademie der Wissenschaften / Austrian Academy of Sciences
4. Thomas Crombez

Royal Academy of Fine Arts Antwerp

Work text

This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

1. Panel Proposal
The remediation of historical books and newspapers into the digital environment receives ample attention in today’s academic and library communities, and has become a core feature of major digital humanities projects that have enabled innovative methods of inquiry and new scholarly discoveries. However, despite the establishment of pioneering digital resources such as the Modernist Journals Project and the Modernist Magazines Project, the design and research potential of electronic collections of modernist historical magazines remains an understudied and under-theorized topic.

This panel aims to fill this gap by presenting a series of approaches, methodologies and possibilities inherent in the creation of robust digital collections of 20th-century magazines of the arts. We bring together designers of three collections – the Blue Mountain Project (Princeton University), the AAC-Austrian Academy Corpus (Austrian Academy of Sciences), and the Digital Archive of Belgian Neo-Avant-garde Periodicals (or DABNAP, at the Royal Academy of Fine Arts Antwerp) – for a conversation addressing both the conceptual underpinnings and the practical applications of their work. This international panel will continue a lively discussion started at a conference at Princeton University in October 2013 on remediating avant-garde magazines. Our goal at DH 2014 will be to illustrate a variety of avenues available for digital curation of historical magazine collections, and to move toward a set of shared guidelines for representing this rich material in the electronic environment and facilitating advanced research.

A discussion of the theoretical and practical issues involved in remediating modernist and avant-garde periodicals is both important and timely. In recent years, as the field of Periodical Studies has cohered as a subset of print culture scholarship, researchers have started looking at historical periodicals in new ways and discovering that they fundamentally challenge our conventional understandings of modern culture. The recent publication of the Oxford Critical and Cultural History of Modernist Magazines, a hefty three-volume set of essays that discusses over 500 magazines from Europe and the Americas, underscores the relevance of periodicals for today’s research on modern and modernist culture. If, as Sean Latham and Robert Scholes assert in “The Rise of Periodical Studies,” the flourishing of this field has been enabled and invigorated by the affordances of the digital environment – what is the response of the Digital Humanities community to the continued evolution of Periodical Studies? What are our obligations as designers of collections of digital historical periodicals? How must we build resources that embody, allow, and promote the sophisticated new avenues of scholarly engagement made possible by electronic tools and platforms?

The panelists’ answers to these questions will reflect the scope, methodology, and focus of their projects: Crombez’s DABNAP is concerned with the magazine as a performance and art text, Biber’s paper illustrates the corpus linguistics approach, and Wulfman and Ermolaev maintain a Periodical Studies perspective. Each will touch upon the intellectual and technological insights that emerge when we remediate the arts magazine, such as: representation of aesthetic, material, and social features, questions regarding materiality (format, typography, paper, binding, etc.), tracing printing and distribution history, data modeling, linguistic analysis, semantic enrichment and tagging, entity and name recognition, interface design, and application and tool-building.

This panel promises to stimulate a coherent discussion that will be informative to a broad range of digital humanists. By presenting a cross-section of types of collections, the panel will demonstrate the current state of innovative projects on 20th-century arts magazines, and help forge the way for future work.

All panelists have agreed to participate.

2. Paper 1: Magazines of Magazines. Corpus Research Applications for New Digital Editions of Historical Magazines (Hanno Biber)
A magazine can be regarded as a specific container of texts. A corpus can be regarded as a structured and complex collection of texts. The seemingly simple questions of how to structure these complex texts in text corpora, how to integrate magazines into corpora, and how to make them accessible for research purposes are particularly challenging. This paper will present corpus research applications for innovative digital editions of magazines. To that end, three research aspects have to be considered. First, the methodological parameters of corpus research have to be determined and the applicability of corpus linguistics for the question of designing and building digital text editions of historical magazines has to be examined. Second, the research potential of scholarly digital editions of magazines in the context of a digital humanities framework has to be described. And third, a practical corpus-based methodological approach to magazine studies has to be given by investigating the specific textual qualities of the magazines and the specific language use in the magazines’ texts, which have been annotated and made accessible with the help of corpus linguistics as well as innovative interface design and graphic design principles. The study will present several aspects of these research questions illustrated by text examples and discuss the methodological implications of such corpus-based investigations into the use of language in texts. Literary magazines in particular have specific properties that can be recognized and registered by means of a corpus-based study of language. Therefore the literary magazines subsection of the “AAC-Austrian Academy Corpus” will be used as an example for this type of research. This resource is an ideal basis for a corpus linguistic exploration into the study of literary magazines.
Corpus-based text studies can also be regarded as instruments of textual critique, whereby the corpus-based approach allows various ways of philological research and text analysis.

The framework of the “AAC-Austrian Academy Corpus” offers a research platform for corpus linguistics in a very broad sense as well as for corpus-based magazine studies in particular. So far, two model digital editions of magazines have been developed within the AAC for the purpose of analysing large amounts of literary texts with a methodological approach offered by corpus linguistics. The “AAC-FACKEL”[6] and the “Brenner online”[7] editions are two examples of how corpus research initiatives can help to develop new applications in the field of digital humanities, in particular for language studies in the context of literature research as well as for historical studies and for broader issues related to cultural studies. The “AAC-Austrian Academy Corpus” has been established as a corpus research initiative concerned with exploring electronic text corpora and conducting research in the fields of corpus linguistics, text analysis and digital text corpora. More than 500 million running words of text have been scanned, converted into machine-readable text and carefully annotated with structural mark-up. The texts that have been integrated into the AAC are German language texts of historical and cultural significance. The historical period covered ranges from 1848 to 1989, a period of historical changes with remarkable influences on the language and its use. In the context of the subset of the AAC digital editions, the language of satire written by Karl Kraus and his use of language in his magazine are of particular interest for a study of language use based upon the principles of corpus linguistics. Apart from the digital editions of the “Brenner online” and the “AAC-FACKEL”, which has been annotated to a large extent, the AAC magazines corpus holdings provide a great number of reliable resources and interesting corpus-based approaches for investigations into the properties of these texts.
Several other magazines have been digitized making use of XML-related standards. Both the “AAC-FACKEL” and “Brenner online” offer fully searchable online editions of the journals with various indexes and search tools in a web interface, where all pages of the original are available as digital texts and as facsimile images. The “AAC-FACKEL” and “Brenner online” allow new methods of scholarly research and philological analysis of texts that are of crucial importance for the history and study of not only the language but also of the status and the properties of the magazine. The edition interface of the AAC digital editions has several sophisticated search mechanisms and indexes as well as five individual frames synchronized within one single window. The philological principles of scholarly digital editions within the “AAC-Austrian Academy Corpus” are determined by the conviction that the methods of corpus research enable valuable text resources and research tools for linguists and literary scholars. The resource is an interesting text basis for corpus linguistic explorations whereby a corpus-based approach allows new ways of philological research and analysis of magazines.
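
As a hedged illustration of the kind of query a corpus-based edition can support, the following minimal sketch builds a keyword-in-context (KWIC) concordance over a short in-memory text. The function name kwic, the sample sentence, and the window size are assumptions made for illustration only; they do not describe the actual AAC search tools.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, match, right context) for each hit of keyword."""
    # Simple word tokenization; a real corpus would use annotated tokens.
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Invented sample sentence, used only to exercise the function.
sample = "The magazine is a container of texts, and the magazine is also a text."

for left, match, right in kwic(sample, "magazine", window=2):
    print(f"{left:>15} [{match}] {right}")
```

A concordance of this kind lines up every occurrence of a word with its neighbours, which is one simple way of making language use in a magazine visible for philological comparison.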

3. Paper 2: The Blue Mountain Project and the Language of Avant-Garde Magazines (Wulfman and Ermolaev)
Introduction
This paper initiates a dialog with Hanno Biber’s contribution to this panel by suggesting that a magazine is not only a container of texts but a text itself. As Carolyn Ulrich, one of the authors of the influential book, The Little Magazine: A History and a Bibliography, wrote to her co-author Frederick Hoffman, “a magazine is a tricky individual”: tricky in its identity and tricky in its individuality. What is a magazine? Is it a particular issue (a manifestation at a particular point in time – a copy, an issuance, etc.), or is it a title with some sort of identity over time (expressed through editorial consistency, name continuity, etc.)?

To ask these questions is to engage Biber’s research approach to the magazine – specifically, “the applicability of corpus linguistics [to] the question of designing and building digital text editions of historical magazines.” While the methods of corpus linguistics are, without a doubt, of crucial importance to the study of historical magazines, other disciplinary methodologies can complement lexical analysis by highlighting additional key aspects of this diverse and rich material.

In their 2010 book, Modernism in the Magazines: An Introduction, Wulfman and Scholes critique the conventional distinctions that have, historically, been applied to magazines, especially generic labels such as “little”, “mass”, “literary”, “avant-garde”, etc. They suggest employing a more analytic approach instead, one which entails identifying a set of characteristics that contribute to our understanding of magazines and then clustering them in different ways, much as linguists characterize speech sounds by clustering features into phonemes. The Blue Mountain Project at Princeton University is continuing this line of inquiry, and in this paper we describe our development of a language of magazines, and of a technological infrastructure and set of applications to represent this complex language in a robust digital resource.

An Ontology of Historical Magazines
The Blue Mountain Project, based in the Princeton University Library and funded by the National Endowment for the Humanities, was launched in 2012 as a freely available electronic resource for art, music, and literary periodicals published in Europe and the United States between 1850 and 1923. In the first 2-year grant cycle, Blue Mountain is making 34 titles (approximately 95,000 pages) available in French, German, English, Italian, Spanish, Czech, Russian, Polish, Finnish, and Danish.

In the first part of this paper, we present our research on an ontology of historical magazines – that is, a set of concepts with which knowledge of magazines is represented, expressed in a vocabulary that denotes the types, properties and interrelationships of those concepts – that has become the core intellectual infrastructure of the Blue Mountain Project. This ontology differs from the terms of descriptive bibliography developed in the library community such as MARC (MAchine-Readable Cataloging), and from the critical terminology traditionally used by scholars and critics. The purpose of our ontology is to provide a framework on which researchers and scholars may encode and describe historical magazines.

To be sure, some of the concepts do come from the language of descriptive bibliography: title, editor, contributor, and so forth; but others are more nuanced than descriptive bibliography usually expresses. Issuance, for example, is often a complex phenomenon, especially for magazines published internationally, making the concept of an original copy from which a digital edition is derived a vexed one and requiring scholars to rethink narratives of publication. Other concepts, like circulation and price, are tied to data that are difficult to obtain; still others, like readership, are complex concepts whose sub-concepts must be teased apart and identified before they may be used meaningfully.

Other terms in the Blue Mountain ontology come from the languages of typography, book history, and graphic design, and are inspired by Jerome McGann’s notion of the “bibliographic code.” In order to discuss the relationship of content elements in a magazine, for example – such as the relationship of advertisements to articles – one must have a common language for expressing layout. Simple text transcription of a magazine's contents is insufficient for many kinds of research; thus our ontology is based on an understanding of the historical language of page composition (columns, paragraphs, various forms of headings, publication metadata, and so on) that is vital to the useful encoding of magazine structure and the analysis of a magazine's meaning.
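
To make the idea of an ontology more concrete, the following sketch models a few of the concepts named above (title, editor, contributor, issuance, price, and layout zones) as Python dataclasses. All class names, field names, and sample values are hypothetical and chosen for illustration; they are not the Blue Mountain Project's actual vocabulary or schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Zone:
    kind: str  # layout role, e.g. "heading", "column", "paragraph"
    text: str

@dataclass
class Constituent:
    genre: str  # e.g. "article" or "advertisement"
    contributors: List[str] = field(default_factory=list)
    zones: List[Zone] = field(default_factory=list)

@dataclass
class Issue:
    label: str
    place_of_issuance: Optional[str] = None  # often uncertain or multiple
    price: Optional[str] = None              # frequently hard to obtain
    constituents: List[Constituent] = field(default_factory=list)

@dataclass
class Title:
    name: str
    editors: List[str] = field(default_factory=list)
    issues: List[Issue] = field(default_factory=list)

def constituents_by_genre(title, genre):
    """Collect all constituents of a given genre across a title's issues."""
    return [c for issue in title.issues
            for c in issue.constituents if c.genre == genre]

# Placeholder data; "Example Review" is not a real title in the collection.
review = Title(
    name="Example Review",
    editors=["Editor A"],
    issues=[Issue(
        label="Vol. 1, No. 1",
        constituents=[
            Constituent("article", ["Author B"],
                        [Zone("heading", "On Form"), Zone("paragraph", "Body text.")]),
            Constituent("advertisement", [],
                        [Zone("paragraph", "Pianos for sale")]),
        ],
    )],
)

ads = constituents_by_genre(review, "advertisement")
print(len(ads))
```

Keeping genre and layout zone as explicit properties is what allows a question such as the relationship of advertisements to articles to be asked of the data at all.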

Advanced Applications: Blue Mountaineer and Blue Mountain Springs
In the second part of our paper, we present an experimental architecture for encoding and expressing the language of magazines. The Blue Mountain Project has been designed to support a variety of research uses, beyond the now-standard modes of full-text searching and page browsing. To support those uses, we create high-resolution digital images and provide robust library-standard metadata including title-, issue-, and constituent-level MODS (Metadata Object Description Schema) and METS/ALTO (Metadata Encoding and Transmission Standard/Analyzed Layout and Text Object) records. In our next phase of work, however, we plan to expose this highly structured data for mining and analysis by means of two new modules: Blue Mountaineer and Blue Mountain Springs.
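
The METS/ALTO records mentioned above store recognized text by layout zone rather than as one undifferentiated stream. The sketch below reads a tiny hand-written ALTO fragment and gathers the words of each TextBlock. The sample content, the block IDs, and the helper names are invented for illustration, and the namespace shown is that of ALTO version 2, which may differ from the version a given project uses. Linking blocks to articles or advertisements would normally go through the accompanying METS file, which this sketch omits.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; not taken from any Blue Mountain record.
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="TB1">
      <TextLine><String CONTENT="New"/><String CONTENT="poems"/></TextLine>
    </TextBlock>
    <TextBlock ID="TB2">
      <TextLine><String CONTENT="Fine"/><String CONTENT="pianos"/></TextLine>
      <TextLine><String CONTENT="for"/><String CONTENT="sale"/></TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

def local(tag):
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]

def text_by_block(alto_xml):
    """Map each TextBlock ID to the space-joined words it contains."""
    root = ET.fromstring(alto_xml)
    blocks = {}
    for elem in root.iter():
        if local(elem.tag) == "TextBlock":
            words = [s.get("CONTENT", "")
                     for s in elem.iter() if local(s.tag) == "String"]
            blocks[elem.get("ID")] = " ".join(words)
    return blocks

print(text_by_block(ALTO_SAMPLE))
```

Text grouped this way can be passed zone by zone to external analysis tools instead of page by page.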

Blue Mountaineer
The Blue Mountaineer is a set of web applications we will design for exploring Blue Mountain content through visualizations, topic modeling, and other forms of data mining and retrieval. It is intended to showcase the power and utility of a rich XML-encoded database while providing researchers and students with tools they can use in their own investigations. Examples of the type of research Blue Mountaineer will enable include the following:

Social Network Analysis: Users will be able to explore the complex publication webs with tools that enable them to plot graphs of relations among titles, authors, artists, languages, and nationalities using Blue Mountain's rich metadata. They will, for example, be able to see all the issues in which Tristan Tzara and Francis Picabia appeared together.

Data-Driven Timelines and Clusters: Blue Mountain will employ off-the-shelf natural-language processing software such as Apache OpenNLP and the Stanford Named Entity Recognizer, as well as Princeton's own WordNet, to perform first-order named-entity detection on its corpus of magazines. The results of these analyses will be encoded in Blue Mountain's TEI transcriptions, which will make it possible for researchers to discover and visualize sequences and relationships buried deep in the textual data itself. They will be able to compare how two authors use a particular term, for example, or see how the work of a particular artist relates to particular advertisers.

Topic Modeling: The aggregation and clustering of content zones encoded in METS/ALTO make it possible for topic modeling algorithms to perform much more fine-grained analysis of magazines and newspapers than they can perform using unzoned, "dirty-OCR'ed" texts. Researchers will be able to study and compare the abstract topics occurring in the work of two authors, for example, or to compare the topics in a magazine's articles with those in its advertising.
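
The social network example in the list above, finding the issues in which two contributors appear together, can be sketched in a few lines once issue-level contributor metadata is available. The issue identifiers and the contributor lists below are invented placeholders, not data from the collection, and the function name co_appearances is an assumption for illustration.

```python
from collections import defaultdict
from itertools import combinations

def co_appearances(issue_contributors):
    """Map each unordered pair of names to the issues where both appear."""
    pairs = defaultdict(list)
    for issue_id, names in issue_contributors.items():
        # Sort so that each pair has one canonical (alphabetical) key.
        for a, b in combinations(sorted(set(names)), 2):
            pairs[(a, b)].append(issue_id)
    return pairs

# Placeholder metadata: which contributors are listed in which issue.
issues = {
    "issue-1": ["Tristan Tzara", "Francis Picabia"],
    "issue-2": ["Tristan Tzara"],
    "issue-3": ["Francis Picabia", "Tristan Tzara", "Contributor X"],
}

shared = co_appearances(issues)[("Francis Picabia", "Tristan Tzara")]
print(shared)
```

The resulting pair-to-issues mapping is also an edge list, so it could be handed to a graph library for plotting; that step is left out here.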

Blue Mountain Springs
Information scientists and digital humanities researchers often want to bypass reader-oriented interfaces and access full-text data directly and programmatically for use with external analytic tools. The Blue Mountain Springs module will make Blue Mountain an abundant source of clean data by providing an application programming interface (API) to Blue Mountain's metadata and machine-readable full-text transcriptions. Blue Mountain Springs will support traditional metadata harvesting via OAI-PMH[14], as well as the more elaborate aggregations supported by OAI-ORE[15]. It will enable software clients to access the plain text of Blue Mountain's materials through web-addressable text streams that can be piped directly into visualization and analysis applications.
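
For readers unfamiliar with OAI-PMH, a harvesting request is an ordinary HTTP GET with a verb and a few arguments. The sketch below only builds such a request URL and does not contact any server. The base URL is a placeholder, since the abstract does not give an endpoint for Blue Mountain Springs, and the function name is hypothetical; the verb ListRecords, the arguments metadataPrefix and set, and the oai_dc format come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint, assumed for illustration only.
url = list_records_url("https://example.org/oai")
print(url)
```

A client would fetch this URL, parse the XML response, and follow any resumption token to page through the full set of records.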

4. Paper 3: The Document as Event. Analyzing Artist Networks through a Digital Archive of Avant-garde Periodicals (Crombez)
Introduction
In this panel presentation, I would like to highlight the ideas, problems and outcomes of DABNAP, which stands for the ‘Digital Archive of Belgian Neo-Avant-garde Periodicals.’ A considerable collection of forty artist periodicals from the 1950s to the 1980s is being digitized, in order to examine the underlying network of artists and artist groups.

My main interest will be the issue of semantic enrichment of the source documents. To various degrees, semantic enrichment is already common to many scholarly digitization projects. Think, for instance, of marking up personal names and locations in a TEI-encoded document. But for large-scale projects, manual semantic enrichment is often unfeasible. Can automatic procedures, such as named entity recognition (NER), help to mine art periodicals? Furthermore, can we imagine and develop software that detects not merely names, but also artistic information in vast collections of text?

In order to answer these ambitious questions, I will first develop a new conceptual model (centered on ‘the document as event’), and then highlight the actual context in which this model will be put in practice.
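
The manual enrichment mentioned above, marking up personal names and locations in TEI, yields data that is straightforward to query once it exists. The following sketch extracts persName and placeName elements from a small hand-written TEI fragment. The sentence and the names in it are invented placeholders, and the helper function is hypothetical; only the TEI namespace and the two element names come from the TEI Guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample paragraph; the names are placeholders, not archive data.
TEI_SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <persName>Artist A</persName> exhibited new work in
  <placeName>City B</placeName> together with <persName>Artist C</persName>.
</p>"""

def tagged_names(tei_xml, element):
    """Return the text of every occurrence of a TEI element, in order."""
    root = ET.fromstring(tei_xml)
    return ["".join(e.itertext()).strip()
            for e in root.findall(f".//tei:{element}", TEI_NS)]

print(tagged_names(TEI_SAMPLE, "persName"))
print(tagged_names(TEI_SAMPLE, "placeName"))
```

The open question for large collections is how much of this markup can be produced automatically rather than by hand.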

The Document as Event
The archive of back issues from LIFE Magazine (spanning five decades) was a showcase project for Google Books when the initiative started in 2004. Apart from access through browsing or through the ubiquitous search box, the internet company uses basic text analysis to help users navigate the archive. The interface presents a cloud of words and expressions that are characteristic of the issue that the user is currently browsing. For the February 1937 issue, the web archive is happy to inform the user that “Leon Trotsky,” “Reichstag,” “Studebaker,” and “Kleenex” are among the common terms and phrases. But it is obviously blind to the question of whether these names belong to people, organizations, places, or brands.

Current digital archives, then, create at least as many problems as they solve. Users prefer to direct their questions to a simple search box. But the limitations of this model for search are obvious to academic users, and have recently led Google itself to introduce the ‘Knowledge Graph’, enriching search results with semantic metadata. The Knowledge Graph or, more generically, semantic enrichment, shows the future direction of digital text collections.

In order to deal with this new conceptual reality behind current and future digitization projects, let me introduce a new conceptual model to think about such document collections.

I would like to conceptualize the document as event, in order to transfer something of the dynamics of the event onto the document, which is commonly conceived of rather statically. A periodical is occasioned by artistic events, such as the publication of new literary works, the exhibition of visual art, or the presentation of a new theatrical performance. However, as a document, it can also be considered to be an ‘event’ in itself. The text functions as a linguistic meeting space for a wide diversity of named entities. This includes names of artists, writers, dramatists, performers, directors and critics; names of museums, galleries, theatres, companies and schools; and titles of books, art works and theatrical productions. In other words, the concept of the document as event serves as a bridge between the literary and art-historical approach to sources as autonomous documents (and hence all too easily viewed as unconnected), and the linguistic approach to sources as text (which all too easily flattens the document).

The DABNAP Collection
The artistic renewal in Belgium since the 1950s, sustained by small groups of artists, led to a first generation of postwar artist periodicals. Titles such as Le surréalisme révolutionnaire, Cobra, De Tafelronde, Het Cahier and Gard Sivik proved decisive for the formation of the Belgian neo-avant-garde in literature and the visual arts.

During the 1960s and the 1970s, happening and socially engaged art (inspired particularly by the Provo movement) took over and gave a new orientation to artist periodicals. Examples include Happening News, Revo, Anar, Milky Way, Total’s, and, on the side of literature, Labris, Yang, Bok, MEP, Heibel, Boemerang, and many others. Finally, the 1970s and 1980s saw the rise of punk-inspired zines, including Force Mental.

The challenges and difficulties of this project lie in dealing with non-standard formats, types of paper, typography, and non-paper inserts. Paper sizes range from the ludicrously large (A2) to the very small (half of A5). Printing techniques include offset, mimeograph, screen printing and photocopy – resulting in extremely diverse kinds of lettering and typography, which often confuses the OCR software that is used to extract text from the scanned pages.

Apart from the technical difficulties of scanning and extracting text from the heterogeneous source documents, further difficulties of the DABNAP project include interface design and handling copyright issues, which will briefly be discussed in the presentation. The main and final difficulty (or rather, ambition) of DABNAP is the process of extracting complex information about artistic events from the text. This will first require expanding common procedures for named entity recognition with techniques for recognizing titles and events. In other words, which means are appropriate, and which linguistic tools have to be developed, for the task of recognizing meaningful relationships between names (e.g., that a certain director is the author of a theatre production)?
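
One very simple baseline for the relationship question raised above is to treat a person and a work as a candidate pair whenever both are named in the same sentence. The sketch below implements only that baseline, assuming that names of people and titles of works have already been recognized. It is a toy illustration rather than DABNAP's method, and the sample text, names, and function name are invented.

```python
import re
from itertools import product

def candidate_relations(text, persons, works):
    """Pair every known person with every known work named in the same sentence."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    found = []
    for sentence in sentences:
        present_persons = [p for p in persons if p in sentence]
        present_works = [w for w in works if w in sentence]
        found.extend(product(present_persons, present_works))
    return found

# Invented sample text with placeholder names.
text = "Director A staged Production X last season. Critic B reviewed it."
pairs = candidate_relations(text, ["Director A", "Critic B"], ["Production X"])
print(pairs)
```

Co-occurrence only proposes candidates: it cannot say whether the person directed, reviewed, or merely mentioned the work, and the second sentence shows how a pronoun hides a relation altogether. Those gaps are what more specialized linguistic tools would need to address.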

References
Both large-scale digitization of books (such as Google Books and the Open Content Alliance) as well as small, curated collections (such as ArchBook http://archbook.ischool.utoronto.ca) are transforming reading and research practices in literary studies and book history. Ryan Cordell’s research using data from the Library of Congress “Chronicling America” project is a prime example of new scholarly engagement with historical newspapers (see Cordell’s “Uncovering Reprinting Networks in Nineteenth-Century American Newspapers,” http://www.viraltexts.org).

Modernist Journals Project: http://modjourn.org; Modernist Magazines Project: http://www.modernistmagazines.com.

The 19th-century periodical has been more heavily studied; see especially the Nineteenth-Century Serials Edition project (http://www.ncse.ac.uk) and the essays: James Mussell and Suzanne Paylor, “Mapping the ‘Mighty Maze’: The Nineteenth-Century Serials Edition,” Nineteen: Interdisciplinary Studies in Nineteenth-Century Studies, 1 (2005) and Mussell, Paylor, M. Deegan and K. Sutherland, “Editions and Archives: Textual Editing and the Nineteenth-Century Serials Edition (ncse),” in Text Editing, Print, and the Digital World (Farnham: Ashgate, 2009), 137-158.

See Sean Latham and Robert Scholes, “The Rise of Periodical Studies,” PMLA, Vol. 121, No. 2 (Mar., 2006), pp. 517-531. In this article, Latham and Scholes discuss the context of North American academia.

The Oxford Critical and Cultural History of Modernist Magazines: Volume I: Britain and Ireland 1880-1955 (ed. Peter Brooker and Andrew Thacker, Oxford University Press, 2009); Volume II: North America 1894-1960 (ed. Brooker and Thacker, 2012); Volume III: Europe 1880 - 1940 (ed. Peter Brooker, Sascha Bru, Andrew Thacker, Christian Weikop, 2013).

AAC-Austrian Academy Corpus: AAC-FACKEL. Online Version: »Die Fackel. Herausgeber: Karl Kraus, Wien 1899-1936«. AAC Digital Edition No 1, (Editors-in-chief: Hanno Biber, Evelyn Breiteneder, Heinrich Kabas, Karlheinz Mörth; Graphic Design: Anne Burdick) http://www.aac.ac.at/fackel.

AAC-Austrian Academy Corpus und Brenner-Archiv: BRENNER ONLINE. Online Version: »Der Brenner. Herausgeber: Ludwig Ficker, Innsbruck 1910-1954«. AAC Digital Edition No 2, (Editors-in-chief: Hanno Biber, Evelyn Breiteneder, Heinrich Kabas, Karlheinz Mörth; Graphic Design: Anne Burdick) http://www.aac.ac.at/brenner.

Robert Scholes and Clifford Wulfman, Modernism in the Magazines: An Introduction (New Haven, CT: Yale University Press, 2010)

Modernist Journals Project: http://modjourn.org; Modernist Magazines Project: http://www.modernistmagazines.com.

Jerome McGann, The Textual Condition (Princeton, NJ: Princeton University Press, 1991)

http://opennlp.apache.org

stanford.edu/software/CRF-NER.shtml

http://wordnet.princeton.edu

http://www.openarchives.org/pmh

http://www.openarchives.org/ore

Full text license: This text is republished here with permission from the original rights holder.


Conference Info

Complete

ADHO - 2014

"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

XML available from https://github.com/elliewix/DHAnalysis (needs to replace plaintext)

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO
