gabmap - A Web Application for Measuring and Visualizing Distances Between Language Varieties

workshop / tutorial
Work text
A frequent question in linguistics, especially in dialectology and comparative linguistics, is how similar linguistic varieties are to one another, which effectively asks how similar linguistic culture is from one site to another. We operationalize the question more specifically by asking, e.g., how similar the vocabulary of one variety is to that of another or, more interestingly, how similar the pronunciations of a set of varieties are, as sampled via the pronunciations of the same set of at least 30 words at a range of sites. Since there may be thousands of words and hundreds of sites, the questions must be addressed computationally. The techniques embodied in the web application have been used in dozens of scholarly papers on dialectology (see references).

At the University of Groningen, the gabmap application has been developed to measure differences in linguistic samples, in particular sets of phonetic (or phonemic) transcriptions, and to present the results graphically by projecting them onto maps. Gabmap offers a graphical user interface and implements not only the comparison of vocabulary or other categorical data (essentially as percentage overlap or percentage difference) but also the comparison of pronunciations via edit distance. Because the software is implemented as a web application, users need neither download it nor keep it up to date by following releases. It is fairly user-friendly and easily accessible, and it therefore enables experimentation with different techniques popular among linguists from various fields, especially dialectology and variationist linguistics.
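To make the two kinds of comparison concrete, the following is a minimal sketch in Python and not gabmap's own implementation. It assumes unit costs for insertions, deletions and substitutions, treats every character of a transcription as one segment, and normalises each word distance by the length of the longer transcription. All of these are simplifying assumptions, and the site data and function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance with unit costs (an assumed cost scheme)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, seg_a in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # delete seg_a
                            curr[j - 1] + 1,      # insert seg_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def categorical_difference(items_a, items_b):
    """Share (0..1) of shared items for which two sites give different answers."""
    shared = [key for key in items_a if key in items_b]
    if not shared:
        raise ValueError("the two sites have no items in common")
    differing = sum(1 for key in shared if items_a[key] != items_b[key])
    return differing / len(shared)


def pronunciation_distance(words_a, words_b):
    """Mean edit distance over shared words, each normalised by the longer form."""
    shared = [word for word in words_a if word in words_b]
    if not shared:
        raise ValueError("the two sites have no words in common")
    total = 0.0
    for word in shared:
        x, y = words_a[word], words_b[word]
        longest = max(len(x), len(y))
        total += edit_distance(x, y) / longest if longest else 0.0
    return total / len(shared)


# Hypothetical transcriptions for two sites (illustrative only).
site_a = {"milk": "mɛlk", "house": "hus", "water": "ʋatər"}
site_b = {"milk": "mɛlək", "house": "hœys", "water": "ʋatər"}

print("pronunciation distance:", pronunciation_distance(site_a, site_b))
print("categorical difference:", categorical_difference(site_a, site_b))
```

Computing such a value for every pair of sites gives a site-by-site distance matrix, which is the kind of input that mapping and statistical analysis work from. Transcriptions with diacritics would first need to be split into segments, which this sketch does not attempt.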

During the workshop we will give some theoretical background on dialectometry, followed by a tutorial in which the theory is put into practice with exercises showing how to use the web application. We have given similar courses in dialectology previously, for example during the Linguistic Society of America Linguistics Institute in 2005 at MIT and at the special meeting of the Forum Sprachvariation of the Internationale Gesellschaft für deutsche Dialektologie in Erlangen in Oct. 2010. The workshop proposed here will be like the second in that it will include hands-on sessions.

The workshop will be structured as follows:

Introduction to dialectometry
Data entry: uploading dialect data, creating and uploading maps
Data inspection: data distribution and error detection
Measuring linguistic distances
Graphical presentations of linguistic distances: dialect maps
Statistical analyses: multidimensional scaling and clustering
Data mining: identifying influential individual variables (words, pronunciation variants)

We have named the gabmap collaborators as co-authors of the tutorial, but only Nerbonne and at most one other will offer the tutorial. We can accommodate up to 20 participants.

We add a note to potential participants from non-linguistic fields. In theory one might ask the same questions of non-linguistic culture that we ask of linguistic culture, namely to what degree e.g. the material culture of one settlement is similar to that of another. We suspect that one might attack the non-linguistic question using techniques similar to the ones we will demonstrate during this tutorial, but the point is purely theoretical so far, although we would welcome the chance to examine the question in a data-intensive way. If such studies are carried out, we suspect that at least the mapping facilities we demonstrate in this tutorial will be useful.

