21st century tools for indigenous languages

Using Plains Cree as the spearhead language, this project will produce tools such as intelligent electronic dictionaries, spell-checkers, language teaching and learning software, and text-to-speech synthesizers to facilitate the use of minority languages in all spheres of life by community members.

These technologies are available for the world’s majority languages (e.g. English and French), but have so far been created for only a few minority languages, such as the Saami languages spoken in Northern Scandinavia.

This project combines the creation of applications with research on language structure and acquisition through, for example, monitoring the behavior of learners using the applications. The project also involves the collection and digitization of textual resources into corpora, both as a means of tool-testing and as a research objective of its own.

With the compilation of text collections, the creation of linguistic tools for their analysis, and the amassing of new experimental evidence on a number of diverse indigenous languages, we will be able to apply the most recent advancements in statistical and computational methods to hitherto understudied material. The study of such extensive new data representing multiple sources of linguistic behavior has the potential to substantially alter how we understand language, by testing the general validity of current linguistic theories and quite possibly revising them.

This project fits into a wider research program seeking to (1) develop computational language technology tools for a wider selection of indigenous languages, starting with Plains Cree, which will benefit both the language communities in question and linguistic research in general; (2) research and record multiple modalities and forms of linguistic behavior in these languages using modern psycholinguistic methods; and (3) use the latest computational models to extract the most information out of such data. Consequently, by collecting substantial amounts of multiple distinct types of data for each indigenous language, this project will add to the overall diversity of empirical evidence available for quantitative linguistic analysis, which will allow us to rigorously test contemporary linguistic and psycholinguistic theory.

The project is run by the Alberta Language Technology Laboratory (ALTLab) and carried out in collaboration with the Cree Literacy Network; Cree-speaking communities in Alberta, in particular Miyo Wahkohtowin Education (Ermineskin First Nation, Maskwacîs); the Giellatekno and Divvun research and development groups at the University of Tromsø, Norway; and a number of Cree scholars.

Importantly, our project is fortunate to be able to make use of and build upon the following three Plains Cree dictionaries as invaluable linguistic sources: (1) nêhiyawêwin : itwêwina / Cree: Words by Arok Wolvengrey (First Nations University of Canada), (2) the Alberta Elders’ Cree Dictionary (edited by Earle Waugh based on the contributions of numerous Cree Elders), and (3) the Maskwacîs Cree Dictionary (Miyo Wahkohtowin Education).