On the Road to Freedom: Network models of interviews of Czechoslovak respondents in the optics of audio-emotional analysis and computational linguistics.

Anatoly Vladimirovich Iashchenko

Authorship

1. Anatoly Vladimirovich Iashchenko

Universita, Roma, Itali

Work text

This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

This Today, relations between Russia, on the one hand, and the Czech Republic and Slovakia, on the other, are rather problematic both politically and socio-culturally. Most of the disagreements arise against a backdrop of reinterpreted history, where any interpretation of domestic political events becomes an occasion for international scandals on both sides (I.M. Savelyeva, 2004; Hradilek, 2010).
The source for the study was an archive of video interviews from the Czech Institute for the Study of Totalitarian Regimes "Memory and History of Totalitarian Regimes".
In order to build network models and conduct comparative analysis between models derived by mixed methods of computational linguistics and emotional analysis of audio recordings, 45 transcribed video interviews were selected from the source database.
The transcribed interviews were analysed in the MAXQDA 2020 PRO software. In the first step of the analysis of the interviews, codes were generated to break down the text. It is the cluster approach of breaking down the text into valences, meaning groups, that allows the linguistic analysis to be carried out so quickly and with such a high quality. From the first minutes of working with dissidents' memories, it becomes possible to study the field of historical consciousness.
Once the coding of the corpus of texts according to the results of the frequency analysis was completed, it became possible to create matrices of the frequencies of the coded keys in the text.
The next step in the interview process was to analyse the texts using the 'interactive word tree' method. This approach allowed us to identify "noise" and cleanse the matrix from unnecessary word combinations. Further analysis of the texts was carried out using content analysis tools: document portrait, document comparison table and code layout.
The resulting frequency tables and matrices for each interview were converted to CSV format and exported to the Gephi database, where network models of keywords in the form of graphs were constructed.
Speech cues are a natural way of human communication, involving both direct linguistic content (e.g., texts) and implicit paralinguistic information (e.g., speaker emotions). Although visual emotion identification is more advanced in modern science, emotional audio analysis is gaining popularity among researchers due to ever-improving algorithms and increasing accuracy.
The construction of network models of Czechoslovak dissidents' audio interviews based on emotional analysis allows not only a new perspective on computer analysis and visualisation methods as a researcher's tool, but also a wider disclosure of information potential of the sources under study. An important feature of this approach is the exclusion of the influence of emotional information of words.
A frequency classification table based on a two-dimensional vector model of emotion was created to conduct emotional analysis of audio interviews and subsequent construction of network models. Pyhton script was written and Librosa audio analysis package was used to investigate the acoustic features of the interviews. The audio files were analysed: fundamental frequencies (f0), spectral characteristics, chromaticity, amplitude, harmonic characteristics, MFCC, logarithmic spectra, etc. Also, to extract markers of vocal emotion from speech (Ma, Y., et al., 2019), logarithmic spectrum conversion to Mel scale followed by discrete cosine transformation was performed (Scherer, K.R., 2003).
The data were classified based on the audio recording spectra on a logarithmic scale for nine key emotions: sadness, fear, anger, frustration, excitement, disgust, happiness, surprise and neutral calm.
In the study, the first time the matrices of emotional models were constructed, an accuracy of 42% was achieved. In further analysis, difficulties were found in interpreting the emotions sadness and fear in female interviewees due to the lack of an extensive sound base of emotions in Czech. The algorithm was modified to incorporate expert judgement, and the accuracy of determination increased by more than 15%.
The tables of emotion analysis and emotion frequency were also translated into CSV files and uploaded to the Gephi software database to build network models in the form of graphs.
Results of comparative analysis of online models of audio-interviews and transcribed texts gave different representations of the same events, allowing the researcher, on the one hand, to expand the information potential of the source, and on the other hand, to interpret personal experiences and experiences of interviewees within a constructive group memory about the fall of the communist regime in Czechoslovakia.
The proposed methodology of analysis, based on the comparison of network models, allows not only a broader disclosure of the information potential of historical sources, but also provides a basis for modelling and studying the mechanisms of transmission of group memory.

Bibliography
I.M. Savelyeva, A.V. Poletaev. Social perceptions of the past: the types and mechanisms of formation. Preprint WP6/2004/07. - Moscow: State University of Higher School of Economics, 2004. 56 с.
Hradilek, Adam (ed.): Za vaši a naši svobodu. Torst, ÚSTR, Praha 2010.
Ma, Y.; Hao, Y.; Chen, M.; Chen, J.; Lu, P.; Košir, A. Audio-Visual Emotion Fusion (AVEF): A Deep Efficient Weighted Approach. Inf. Fusion 2019, 46, 184–192.
Scherer, K.R. Vocal Communication of Emotion: A Review of Research Paradigms. Speech Commun. 2003, 40, 227–256.

Full text license: This text is republished here with permission from the original rights holder.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

In review

ADHO - 2022

"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO

On the Road to Freedom: Network models of interviews of Czechoslovak respondents in the optics of audio-emotional analysis and computational linguistics.

1. Anatoly Vladimirovich Iashchenko

ADHO - 2022

"Responding to Asian Diversity"