VedaWeb 2.0: Towards a Collaborative Workspace for Indo-Aryan Texts

paper, specified "short paper"
  1. Claes Neuefeind

    Universität zu Köln (University of Cologne)

  2. Daniel Kölligan

    Julius-Maximilians-Universität Würzburg (Julius Maximilian University of Würzburg)

  3. Uta Reinöhl

    Albert-Ludwigs-Universität Freiburg (University of Freiburg)

  4. Patrick Sahle

    Bergische Universität Wuppertal

VedaWeb, developed in a project funded by the German Research Foundation (DFG) from 2017 to 2021, is a web-based platform for linguistic and philological work with Old Indo-Aryan texts. In this contribution, we report on the key features of the VedaWeb platform and give an outlook on the recently started follow-up project VedaWeb 2.0 (2022-2025), which aims to transform VedaWeb from a locally developed and curated platform into a collaborative workspace shaped by the needs of the community of Indo-Aryanist researchers as a whole.

The VedaWeb project
The VedaWeb platform was developed with the Rigveda (about 160,000 words) as the pilot text, supplemented by multiple translations and linguistic annotations, lemma-based as well as word-form-based links to dictionary entries, and a tailored search engine based on Elasticsearch.
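To illustrate how a search engine of this kind can combine lexical, morphological and translation criteria in one request, the following minimal sketch builds an Elasticsearch-style query body as a plain Python dictionary. It is not the VedaWeb implementation: the index layout and all field names (`tokens`, `tokens.lemma`, `tokens.case`, `translations.text`) are assumptions made for the example, and only standard query types (`bool`, `nested`, `term`, `match`) are used.

```python
import json


def build_combined_query(lemma, morph_filters, translation_text=None):
    """Build an Elasticsearch-style bool query body as a plain dict.

    All field names are hypothetical and chosen for illustration only.
    """
    # Conditions that must hold for one and the same token:
    # the lemma plus any number of morphological tags.
    token_conditions = [{"term": {"tokens.lemma": lemma}}]
    for feature, value in sorted(morph_filters.items()):
        token_conditions.append({"term": {f"tokens.{feature}": value}})

    must = [
        {
            # A nested query keeps the lemma and its tags tied to a single token
            # instead of matching them anywhere in the stanza.
            "nested": {
                "path": "tokens",
                "query": {"bool": {"must": token_conditions}},
            }
        }
    ]
    if translation_text:
        # Optional full-text condition on a (hypothetical) translation layer.
        must.append({"match": {"translations.text": translation_text}})

    return {"query": {"bool": {"must": must}}}


query = build_combined_query("agní", {"case": "NOM", "number": "SG"}, "fire")
print(json.dumps(query, ensure_ascii=False, indent=2))
```

The resulting dictionary could be sent as the request body of a search call. Whether the token annotations are stored as nested documents is a design assumption of this sketch.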

The Rigveda was annotated by researchers at the University of Zurich with rich, albeit partially incomplete, morphological glosses. The VedaWeb project team filled gaps in these annotations and streamlined the existing ones. Moreover, novel information was added (e.g. morphological annotations of verb classes, verbal stem formations such as desideratives, nominal forms such as comparatives and superlatives, word classes such as local particles, etc.). Additional translations, metrical information, and access to APIs for Sanskrit dictionaries (Mondaca et al. 2019a/2019b) were integrated. All data layers have been modeled and encoded in TEI-P5, and a state-of-the-art web application has been developed for accessing the data offered via APIs.
See also Kiss et al. (2019) for reflections on collaboration within and beyond the project team.
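As a rough illustration of what a token-level annotation layer in TEI-P5 can look like, the sketch below builds a verse line (`<l>`) with one `<w>` element per word, carrying the TEI attributes `lemma` and `msd`. The identifier scheme, the tag values and the helper name `encode_verse_line` are assumptions for the example and do not reproduce the actual VedaWeb data model.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize TEI elements without a prefix (default namespace).
ET.register_namespace("", TEI_NS)


def encode_verse_line(line_id, tokens):
    """Return a TEI <l> element with one <w> per annotated token.

    `tokens` is a list of (form, lemma, msd) triples.
    """
    line = ET.Element(f"{{{TEI_NS}}}l", {f"{{{XML_NS}}}id": line_id})
    for form, lemma, msd in tokens:
        w = ET.SubElement(line, f"{{{TEI_NS}}}w", {"lemma": lemma, "msd": msd})
        w.text = form
    return line


# Illustrative values only: the id and the morphological tags are hypothetical.
line = encode_verse_line(
    "rv_01_001_01_a",
    [
        ("agním", "agní-", "ACC.SG.M"),
        ("īḷe", "īḍ-", "1.SG.PRS.MED"),
    ],
)
print(ET.tostring(line, encoding="unicode"))
```

Keeping each annotation as an attribute on `<w>` is only one option. Stand-off layers that point to token identifiers would be an equally plausible design.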

There are several web portals making ancient Indo-Aryan texts available, such as TITUS (Thesaurus Indogermanischer Text- und Sprachmaterialien, University of Frankfurt), GRETIL (Göttingen Register of Electronic Texts in Indian Languages, University of Göttingen), DCS (Digital Corpus of Sanskrit, University of Heidelberg) and SARIT (Search and Retrieval of Indic Texts). By providing a combination of several text versions, translations, linguistic analyses and combined searchability across morphological, syntactic, lexical and metrical features, VedaWeb by far exceeds the scope of existing platforms and has already significantly improved the digital presence of Old Indo-Aryan texts.

VedaWeb 2.0
The primary goal of the follow-up project is to develop the VedaWeb platform into a collaborative workspace, opening it up to more texts and to further enrichment through annotations and analyses. Particularly in view of the widespread traditional working practices in this field that hinder local and international collaboration (e.g., Excel files, paper, or similar approaches), the further development of VedaWeb promises substantial advances for anyone working with Indo-Aryan texts, including linguists, philologists, cultural anthropologists, and researchers in philosophical and religious studies.
Furthermore, we will add several new textual and lexical resources. These include the Śaunaka version of the Atharvaveda, the second oldest Vedic text, and the early Vedic prose text Aitareya Brahmana. These texts and their nearly complete morphological annotation will be provided by our project partners at the University of Zurich (Paul Widmer and Oliver Hellwig). Our cooperation partner J.S. Kim (Würzburg) will provide an index verborum to the Atharvaveda.

In addition, parts of the Maitrayani Samhita, Jaiminiya Brahmana, and the Śatapatha Brahmana will be added. As for lexical resources, we will integrate several additional dictionaries that are part of the Cologne Digital Sanskrit Dictionaries.
These have been modeled in TEI-P5 and OntoLex-Lemon (Mondaca/Rau 2020), and are available via the C-SALT APIs for Sanskrit Dictionaries.

Additionally, we will include audio recordings of recitations of the Rigveda and Atharvaveda, made available through the National Library of Denmark.

These recitations stand in the tradition of the uninterrupted oral transmission of the early sacred texts of Hinduism since their conception roughly three millennia ago. Supported by the Language Archive Cologne, the recordings will be segmented and made available for online listening, aligned with the corresponding hymns, stanzas and verses.
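One simple way to think about such an alignment is as a sorted list of time intervals, each tied to a stanza identifier, so that a playback position can be mapped back to the text. The sketch below assumes this representation. The `Segment` class, the identifiers and the timings are invented for illustration and say nothing about how the recordings will actually be segmented.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    stanza_id: str
    start: float  # seconds from the beginning of the recording
    end: float    # exclusive end, in seconds


def find_stanza(segments, t):
    """Return the stanza id whose interval [start, end) contains t, else None.

    `segments` must be sorted by start time and must not overlap.
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1  # last segment starting at or before t
    if i >= 0 and t < segments[i].end:
        return segments[i].stanza_id
    return None


# Hypothetical timings; the gap between 24.0 and 26.0 models a pause.
segments = [
    Segment("1.1.1", 0.0, 12.5),
    Segment("1.1.2", 12.5, 24.0),
    Segment("1.1.3", 26.0, 37.5),
]
print(find_stanza(segments, 13.0))
```

The same table read in the other direction (stanza identifier to start time) would let a reader jump from a stanza in the text to the matching position in the recording.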

VedaWeb is already a well-established platform for linguistic and philological work with Old Indo-Aryan texts. VedaWeb 2.0 will further boost research on Old Indo-Aryan languages in linguistics, philology, and beyond. As a collaborative research platform, VedaWeb 2.0 will integrate several additional Indo-Aryan texts which can be enriched by users with diverse types of annotations, multiple translations, and links to dictionary entries. Data layers in all possible combinations and in a variety of formats will be exportable. In addition, a powerful search engine will allow combined queries across all data layers, including similarity searches. Above and beyond the already sizable number of users of VedaWeb 1.0, we expect a further increase in participants working and networking on the platform.
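Exporting data layers in arbitrary combinations amounts to selecting a subset of layers per text unit and serializing it. The following minimal sketch shows this for a single stanza and JSON output. The record structure, the layer names and the function `export_layers` are assumptions for the example, and the translation and metre values are placeholders rather than real data.

```python
import json


def export_layers(stanza, layer_names):
    """Serialize the selected data layers of one stanza as a JSON string."""
    missing = [name for name in layer_names if name not in stanza["layers"]]
    if missing:
        raise KeyError(f"unknown layers: {missing}")
    # Keep the layers in the order in which they were requested.
    selected = {name: stanza["layers"][name] for name in layer_names}
    return json.dumps(
        {"id": stanza["id"], "layers": selected},
        ensure_ascii=False,
        indent=2,
    )


# Hypothetical record; "text" holds only the opening words of the stanza.
stanza = {
    "id": "RV 1.1.1",
    "layers": {
        "text": "agním īḷe puróhitaṃ",
        "translation": "(placeholder translation)",
        "metre": "(placeholder metrical analysis)",
    },
}
print(export_layers(stanza, ["text", "metre"]))
```

Other target formats would only need a different serializer in place of `json.dumps`, while the layer selection step stays the same.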


Kiss, B., Kölligan, D., Mondaca, F., Neuefeind, C., Reinöhl, U., Sahle, P. (2019):
It Takes a Village: Co-developing VedaWeb, a Digital Research Platform for Old Indo-Aryan Texts. In: Steven Krauwer and Darja Fišer (eds), TwinTalks at DHN 2019 – Understanding Collaboration in Digital Humanities. Copenhagen, 2019.

Mondaca, F., Rau, F., Neuefeind, C., Kiss, B., Kölligan, D., Reinöhl, U., Sahle, P. (2019a):
C-SALT APIs - Connecting and Exposing Heterogeneous Language Resources. In: Book of Abstracts of the Digital Humanities Conference 2019 (DH2019) 09.07-12.07.2019. Utrecht, Netherlands.

Mondaca, F., Schildkamp, P., Rau, F. (2019b):
Introducing Kosh, a Framework for Creating and Maintaining APIs for Lexical Data. In: Electronic Lexicography in the 21st Century. Proceedings of the eLex 2019 Conference, Sintra, Portugal. Brno: Lexical Computing CZ, s.r.o., pp. 907–921.

Mondaca, F. and Rau, F. (2020):
Transforming the Cologne Digital Sanskrit Dictionaries into Ontolex-Lemon. Proceedings of the 7th Workshop on Linked Data in Linguistics: Building Tools and Infrastructure at LREC 2020. 11–16 May 2020. Marseille, France, pp. 11-14.

Reinöhl, U., Kölligan, D., Kiss, B., Mondaca, F., Neuefeind, C., Sahle, P. (2018):
VedaWeb – eine webbasierte Plattform für die Erforschung altindischer Texte. In: Book of Abstracts der 5. Jahrestagung der Digital Humanities im deutschsprachigen Raum (DHd 2018), Köln 26.2.–2.3.2018, pp. 485–486.

Conference Info

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

Held in Tokyo and remote (hybrid) on account of COVID-19

Series: ADHO (16)

Organizers: ADHO