NewsEye: A digital investigator for historical newspapers

poster / demo / art installation
Authorship
  1. Antoine Doucet

    Université de La Rochelle

  2. Martin Gasteiner

    Universität Wien (University of Vienna)

  3. Mark Granroth-Wilding

    University of Helsinki

  4. Max Kaiser

    Österreichische Nationalbibliothek (Austrian National Library)

  5. Minna Kaukonen

    University of Helsinki

  6. Roger Labahn

    Universität Rostock (University of Rostock)

  7. Jean-Philippe Moreux

    Bibliothèque nationale de France (BnF) (National Library of France)

  8. Guenter Muehlberger

    Universität Innsbruck

  9. Eva Pfanzelter

    Universität Innsbruck

  10. Marie-Eve Therenty

    Université Paul-Valéry Montpellier

  11. Hannu Toivonen

    University of Helsinki

  12. Mikko Tolonen

    University of Helsinki

Work text

The NewsEye H2020 project, running from May 2018 until April 2021, is an interdisciplinary undertaking that involves 3 European national libraries, 4 humanities and social science research groups and 4 computer science research groups. The core concept of NewsEye is a seamlessly integrated armory of tools and methods that will improve users’ capability to access, analyze and use the content in digital libraries of historical newspapers.

Figure 1: Beta version of the NewsEye demonstrator (June 2020)

Specifically, in the context of historical newspapers written in German, Finnish, Swedish and French, with a focus on the period 1850-1950, the project aims to develop a toolbox consisting of two main layers. It also aims to produce novel research results on several topics and in several fields of digital humanities, based on documents in different languages, to demonstrate the toolbox's potential as a catalyst for novel research.

In detail, the first layer of the NewsEye toolbox focuses on tools to improve and enrich historical newspapers, with improved text recognition and article segmentation, followed by semantic enrichment through the recognition and linking of named entities, stance detection, and novelty detection. This step yields a language-independent set of higher-quality data, which already allows an enriched experience of, and access to, the newspaper collections. The second layer of the toolbox provides ways to benefit from this enriched dataset, through dynamic text analysis tools that respond to user activities: contextualized topic modeling, viewpoint and comparative analysis, etc. In addition, an innovative personal research assistant is able to design strategies (plans) for finding something interesting and to revise them on the fly when needed. It consists of an investigator (dynamically finding and suggesting novel ideas), a reporter (summarizing the grounds for all suggestions) and an explainer (allowing the user to understand the suggestions by herself, and to return to the original data to confirm or refute them).

Within the project, several digital humanities case studies are conducted, both to guide the development of adequate tools and to demonstrate their potential for novel research in digital humanities. In the case studies, groups of humanities scholars carry out investigations on representative research issues, such as “gender”, “migration”, “nationalism and revolutions”, and “media”. Since there is plenty of existing qualitative research on these topics, the project strives to make an impact in the fields of historical research and digital humanities by combining knowledge from qualitative analyses with new findings from big data analyses provided by the new tools in this project.

Figure 2: Short description of the case study on return migration

It is essential to understand that the NewsEye research topics and datasets are showcases, and that the seamless inclusion of additional research questions is a key ambition. With this in mind, all the tools developed are language-independent, so that further datasets can be integrated seamlessly. In fact, NewsEye is open both to further research cases to be studied using its tools and to the integration of additional datasets and tools through the status of associated partner.

Acknowledgements: This work has been supported by the European Union Horizon 2020 research and innovation programme under grant 770299 (NewsEye).


Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO