Ecole Normale Supérieure de Lyon
From KWIC Concordance to Video excerpt or Folio facsimile: Demonstration of Multimodal and Multimedia corpora in TXM
ENS de Lyon, France
Paul Arthur, University of Western Sydney
Locked Bag 1797
Penrith NSW 2751
interface and user experience design
software design and development
concording and indexing
The open-source, GPL-licensed TXM software provides a classical toolbox for text analysis and mining: a versatile and efficient full-text search engine, text reading and browsing, video playback, sub-corpus and partition building, co-occurrence analysis, factorial analysis, and clustering. It is available as a desktop application for Windows, Mac, or Linux, and as web portal software that runs on a server and is accessed through a web browser (Heiden, 2010). The TXM platform can be downloaded for free at http://sf.net/projects/txm.
A distinctive feature of TXM is its ability to apply analytic tools to a wide range of encoding formats, from XML-TEI encoded sources to raw text, through a high-level graphical user interface, either in the desktop software or in a web portal.
This poster will introduce the recent integration of processing capabilities for two very different modalities of textual data into TXM:
• Written texts associated with their facsimile-scanned images or media files through pagination.
• Speech transcriptions associated with their recordings—video or audio media files—through synchronization.
Both textual modalities are managed through a single pivot source format, XML-TEI TXM, designed for TXM.
The poster will include a live demonstration of:
• Navigation between KWIC concordances and the synoptic display of text editions (facsimile alongside critical edition) containing the pivot occurrences, in the online TXM portal version (see Figure 1).
• Navigation between KWIC concordances and playback of the video excerpts corresponding to the pivot occurrences, in the desktop version (see Figure 2).
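The linkage behind both kinds of navigation can be illustrated with a minimal Python sketch. It is not TXM's implementation and does not use TXM's query engine or source format. It assumes a hypothetical `Token` record that carries either a page identifier (written text) or a media offset in seconds (speech transcription), and a simplified "word followed by a verb" match on part-of-speech tags. All names, tags, timings, and sample tokens are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    form: str                      # word form as transcribed or edited
    pos: str                       # part-of-speech tag (hypothetical tagset)
    page: Optional[str] = None     # page/folio id, for paginated written texts
    start: Optional[float] = None  # media offset in seconds, for synchronized speech


@dataclass
class Hit:
    left: str      # left context
    pivot: str     # matched word plus the following verb
    right: str     # right context
    locator: str   # where to jump: a page or a media offset


def locator(tok: Token) -> str:
    """Return the jump target for a token: media time if present, else page."""
    if tok.start is not None:
        return f"t={tok.start:.1f}s"
    return f"page={tok.page}"


def kwic(tokens: List[Token], word: str, next_pos: str = "VERB", width: int = 3) -> List[Hit]:
    """Find `word` immediately followed by a token tagged `next_pos`."""
    hits = []
    for i in range(len(tokens) - 1):
        if tokens[i].form.lower() == word.lower() and tokens[i + 1].pos == next_pos:
            left = " ".join(t.form for t in tokens[max(0, i - width):i])
            pivot = f"{tokens[i].form} {tokens[i + 1].form}"
            right = " ".join(t.form for t in tokens[i + 2:i + 2 + width])
            hits.append(Hit(left, pivot, right, locator(tokens[i])))
    return hits


# Invented speech sample: each token is synchronized with a media offset.
speech = [
    Token("la", "DET", start=12.0),
    Token("lumière", "NOUN", start=12.4),
    Token("traverse", "VERB", start=12.9),
    Token("le", "DET", start=13.3),
    Token("verre", "NOUN", start=13.6),
    Token("et", "CCONJ", start=14.0),
    Token("la", "DET", start=14.2),
    Token("lumière", "NOUN", start=14.5),
    Token("blanche", "ADJ", start=15.0),
]

hits = kwic(speech, "lumière")
for h in hits:
    print(f"{h.left:>20} | {h.pivot} | {h.right:<20} -> {h.locator}")
```

Each hit keeps the locator of its pivot token, so a concordance line can open the matching facsimile page or seek the media player to the matching offset. Only the kind of anchor differs between the two modalities.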
Figure 1. Browsing the ‘Queste del saint Graal’ manuscript edition online in a TXM portal. Lower part: A KWIC concordance of the ‘“Lancelot” word followed by a verb’ pattern. Upper part: A synoptic view of the edition of the ‘Queste del saint Graal’ manuscript, composed from left to right of the facsimile image and of three different levels of diplomatic editions, with the sixth concordance hit highlighted in pink in each diplomatic level. The browser used in this screenshot is Firefox. The ‘Queste del saint Graal’ edition can be accessed in a TXM portal at http://txm.bfm-corpus.org/?command=documentation&path=/GRAAL.
Figure 2. Browsing the video and the transcription of dialogs of a physics course in college in the TXM desktop version. Lower part: A KWIC concordance of the ‘“lumière” word followed by a verb’ pattern. Upper part: A synoptic view of the edition of the transcription of dialogs of a physics course in college (Tiberghien et al., 2012) on the right and the video recording of the course on the left, with the third hit of the concordance highlighted in pink. The desktop TXM used in this scenario is the Linux version.
The demonstration will be based on the desktop version of TXM and on the portal version of TXM.
Heiden, S. (2010). The TXM Platform: Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme. In Otoguro, K. I. R. (ed.), 24th Pacific Asia Conference on Language, Information and Computation, Institute for Digital Enhancement of Cognitive Development, Waseda University, 4–7 November 2010, pp. 389–98.
Tiberghien, A., et al. (2012). Partager un corpus vidéo dans la recherche en éducation: Analyses et regards pluriels dans le cadre du projet ViSA. Education & Didactique, 6(March): 9–17, www.cairn.info/revue-education-et-didactique-2012-3-page-9.htm
Hosted at Western Sydney University
June 29, 2015 - July 3, 2015
Conference website: https://web.archive.org/web/20190121165412/http://dh2015.org/
Series: ADHO (10)