Corpas na Gàidhlig

Corpas na Gàidhlig aims to provide a comprehensive electronic corpus of Scottish Gaelic texts for students and researchers of Scottish Gaelic language, literature and culture.

The 'standard query page' of the Corpas na Gàidhlig

Corpas na Gàidhlig is a constituent project of DASG. It was founded in 2008 with the following aims:

  • to create a comprehensive electronic corpus of Scottish Gaelic texts for students and researchers of Scottish Gaelic language, literature and culture
  • to provide the textual basis on which the future historical dictionary of the interuniversity project Faclair na Gàidhlig (‘Dictionary of the Scottish Gaelic Language’) will be based
  • to provide a resource which will facilitate corpus planning and corpus development technology for Gaelic

The first phase of Corpas na Gàidhlig aims to digitise 340 texts from all periods of Gaelic literature, covering a wide variety of genres, including poetry, prose, song, and folklore. These texts have been prioritised in order to provide part of the textual basis for the interuniversity dictionary project, Faclair na Gàidhlig. As Corpas na Gàidhlig progresses, it is envisaged that a broad range of other texts will be added and that, in time, speech will also be represented by text and sound files. In the long term, the Corpus will be used to update the dictionary. To date, over 30 million words, mostly Gaelic, have been captured.

Project website: http://dasg.ac.uk/corpus/


Main contact: Roibeard Ó Maolalaigh

Developer: Stephen Barrett

Start year: 2008

Funded by: AHRC, Bòrd na Gàidhlig, British Academy, ESRC, Scottish Funding Council

Subject area: Celtic and Gaelic

Keywords: Corpus Linguistics, Gaelic

Record last updated 2019-12-16