Corpas na Gàidhlig aims to provide a comprehensive electronic corpus of Scottish Gaelic texts for students and researchers of Scottish Gaelic language, literature and culture.
Corpas na Gàidhlig is a constituent project of the Digital Archive of Scottish Gaelic (DASG). It was founded in 2008 with the following aims:
- to create a comprehensive electronic corpus of Scottish Gaelic texts for students and researchers of Scottish Gaelic language, literature and culture
- to provide the textual basis for the future historical dictionary being prepared by the interuniversity project Faclair na Gàidhlig (‘Dictionary of the Scottish Gaelic Language’)
- to provide a resource which will facilitate corpus planning and the development of corpus technology for Gaelic
The first phase of Corpas na Gàidhlig aims to digitise 340 texts from all periods of Gaelic literature, covering a wide variety of genres, including poetry, prose, song and folklore. These texts have been prioritised in order to provide part of the textual basis for the interuniversity dictionary project, Faclair na Gàidhlig. As Corpas na Gàidhlig progresses, it is envisaged that a broad range of other texts will be added and that, in time, speech will also be represented by text and sound files. In the long term, the Corpus will be used to update the dictionary. To date, over 30 million words, mostly Gaelic, have been captured.