Participatory language documentation around the globe for research in comparative linguistics



What language features are found where, when, and why?

Research in comparative linguistics requires large amounts of first-hand data from a varied sample of languages around the world. The ideal way to collect natural data is by involving the speech communities in as many ways as possible.

Comparative linguists study the distribution of linguistic features across the world's languages. Researchers strive to explain why certain features are found where they are, how these features change over time, and what they tell us about the evolution and development of human language.

The raw material of typological linguistic research is data collected in a large number of different languages from all corners of the world. Of the roughly 7000 languages spoken today, some 50% are in danger of becoming extinct in the next 100 years. This makes the task of collecting language data an urgent priority in linguistic fieldwork. As language is always both part of and a medium for a specific culture, it is important to record as much of the cultural, social, and environmental setting of the speech community as possible. This makes the active participation of native speakers crucial. Besides collaborating as language consultants, members of the speech community are actively involved in collecting and recording narratives, conversations, rituals, activities of daily life, songs, species, and so on. They assist in transcribing and analyzing the recordings, as well as in adding complementary information such as illustrations, missing words and definitions, and sociological background data that an outside researcher would not be able to access.

The linguistic researcher can in turn assist the speech community in preserving their native language in many ways. The collected data can be made available in formats that are useful to the speech community, for example as dictionaries, grammatical descriptions, and collections of stories, which can be used in local education to pass on the language to the next generation. In many cases this involves the development of a suitable orthography for a formerly unwritten language. Online resources such as illustrated encyclopedias can be compiled and published easily and at low cost, with the active involvement of native speakers. Working on their language together with outside researchers in many cases leads speakers to a greater appreciation and understanding of their mother tongue, which helps in the long-term preservation of endangered languages.

Collecting natural linguistic data necessarily involves the active collaboration of the speech community in both data collection and analysis. Grammatical analyses can typically succeed only thanks to repeated discussions with speakers about which expressions and sentences are well formed, and native speaker intuitions are an important guide in forming hypotheses for specific analyses. Speakers also play a key role in preparing the material for use in the community. This often leads to a greater appreciation of the value of their native tongue.

Example finding:
Standard theory predicts that words have a fixed internal order (e.g. 'chang-ed' cannot be reordered as 'ed-change') except when a change in order is tied to a difference in meaning ('sub-inter-national' vs. 'inter-sub-national'). During fieldwork on Chintang, a native speaker insisted that reordering is possible without any difference in meaning. This intuition was met with great skepticism, but subsequent statistical analyses proved it right and led to a revision of the theory.


Project members:

Department of Comparative Linguistics, UZH

Balthasar Bickel, Rik van Gijn, Steven Moran, Manuel Widmer, Mathias Jenny, Sebastian Sauppe, André Müller, Rachel Weymuth


Graphic artists:

Lisa Senn and Michael Koller