21st century tools for indigenous languages: 2013-2016

Using Plains Cree as the spearhead language, this project will produce tools such as intelligent electronic dictionaries, spell-checkers, language teaching and learning software, and text-to-speech synthesizers to facilitate the use of minority languages in all spheres of life by community members.

These technologies are available for the world’s majority languages (e.g. English and French), but have so far been created for only a few minority languages, such as the Saami languages spoken in Northern Scandinavia.

This project combines the creation of applications with research on language structure and acquisition, for example by monitoring the behavior of learners using the applications. The project also involves the collection and digitization of textual resources into corpora, both as a means of tool-testing and as a research objective of its own.

With the compilation of text collections, the creation of linguistic tools for their analysis, and the amassing of new experimental evidence on a number of diverse indigenous languages, we will be able to apply the most recent advancements in statistical and computational methods to hitherto understudied material. The study of such extensive new data representing multiple sources of linguistic behavior has the potential to substantially alter how we understand language, by testing the general validity of current linguistic theories and quite possibly revising them.

This project fits into a wider research program seeking to (1) develop computational language technology tools for a wider selection of indigenous languages, starting with Plains Cree, which will benefit both the language communities in question and linguistic research in general; (2) research and record multiple modalities and forms of linguistic behavior in these languages using modern psycholinguistic methods; and (3) use the latest computational models to extract the most information out of such data. Consequently, by collecting substantial amounts of multiple distinct types of data for each indigenous language, this project will add to the overall diversity of empirical evidence available for quantitative linguistic analysis, which will allow us to rigorously test contemporary linguistic and psycholinguistic theory.

The project is run by the Alberta Language Technology Laboratory (ALTLab) and carried out in collaboration with the Giellatekno and Divvun research and development groups at the University of Tromsø, Norway, the Cree Literacy Network, Cree-speaking communities in Alberta, and a number of Cree scholars.

Importantly, our project is fortunate to be able to draw and build upon three Plains Cree dictionaries as invaluable linguistic sources: (1) nêhiyawêwin : itwêwina / Cree: Words by Arok Wolvengrey (First Nations University of Canada), (2) the Alberta Elders’ Cree Dictionary (edited by Earle Waugh based on the contributions of numerous Cree Elders), and (3) the Maskwacîs Cree Dictionary (Miyo Wahkohtowin Education).

Past Event: Distinguished Visitor

October 17-25, 2013

The Alberta Language Technology Lab, in conjunction with CILLDI, is pleased to welcome Dr. Trond Trosterud as a Distinguished Visitor to the University of Alberta, October 17-25. A long-time advocate for Indigenous language rights, Dr. Trosterud earned his PhD from the University of Tromsø in Norway in 2004, following a successful private-sector career in language engineering. Today, Dr. Trosterud is the Director of Giellatekno, the centre for Saami language technology at the University of Tromsø, and is one of the world’s foremost experts on developing cutting-edge language technology for Indigenous languages. Beyond his innovative work with the indigenous Saami languages of Scandinavia, Dr. Trosterud also sits on the boards of the Norwegian Language Council, the National Centre for Nynorsk Education, and the Kvensk Institutt, and has served as the vice president of the Northern European Association for Language Technology.

Dr. Trosterud will be giving several presentations on campus during his time here, as well as offering a hands-on workshop.  Details are provided below.

Schedule of Events

Friday, October 18  –  Theoretical implications of developing working analyzers for inflectional languages. 3:00-4:00 pm, CSC B2.

Monday, October 21 – Friday, October 25  –  Computational morphological models for indigenous languages.  Centre for Comparative Psycholinguistics, Lab 103 (basement of Old Arts Building)

Tuesday, October 22  –  Why do rule-based formalisms work well for commonly-used language technological tools?  2:00-3:00 pm. Athabasca Hall 3-32.

Thursday, October 24  –  Reversing language extinction. How language technology can contribute in the revitalization of the linguistic heritage of the Americas. 5:00-7:00 pm. Telus Centre, Meeting Room 134 and Atrium. Town & Gown – lecture open to the general public and academia.

Friday, October 25  –  “There are no indigenous people here; we all have electric light”: Two centuries of assimilationist policy in the Nordic countries.  2:00-4:00 pm. Senate Chamber, Old Arts Building.

For further information on any of these events, see below, or contact Antti Arppe (arppe@ualberta.ca, 780-492-1935) or the CILLDI office (cilldi@ualberta.ca, 780-248-1179).