Keyword: Corpus Linguistics
There are 6 projects listed:
CMSW: Corpus of Modern Scottish Writing (1700-1945)
An electronic corpus of written and printed texts from the period 1700-1945, featuring over 350 documents and containing approximately 5.5 million words of text overall.
Corpas na Gàidhlig
Corpas na Gàidhlig aims to provide a comprehensive electronic corpus of Scottish Gaelic texts for students and researchers of Scottish Gaelic language, literature and culture.
DASG: Digital Archive of Scottish Gaelic / Dachaigh airson Stòras na Gàidhlig
The Digital Archive of Scottish Gaelic is the University of Glasgow's online repository of digitised texts, lexical resources and audio recordings for Scottish Gaelic.
The Linguistic DNA of Modern Western Thought
The Linguistic DNA of Modern Western Thought uses digital methods and resources to analyse more than 5 million pages of printed texts. This data represents works printed in English, or in England, Ireland, Scotland, and Wales from 1473 to 1800.
Scots Syntax Atlas
The Scots Syntax Atlas presents the results of more than 100,000 acceptability judgments from over 500 speakers on over 250 morphosyntactic phenomena. The Atlas also contains a text-to-sound aligned corpus of spoken data totalling 275 hours and over 3 million words.
SCOTS: Scottish Corpus Of Texts and Speech
The Scottish Corpora project has created large electronic corpora of written and spoken texts for the languages of Scotland, featuring nearly 4.6 million words of text, with audio recordings to accompany many of the spoken texts.