The Corpus of Contemporary Irish is a monolingual collection of Irish-language texts in digital format. It consists of edited texts published from the beginning of the 21st century onwards. The corpus currently includes texts from ainm.ie, Beo!, the Cló Iar-Chonnacht archive, the Cois Life archive, Comhar, COMHARÓg, COMHARTaighde, the Éabhlóid archive, Feasta, Irisleabhar Mhaigh Nuad, The Irish Times, the Leabhar Breac archive, the LeabhairCOMHAR archive, Léachtaí Cholm Cille, Léann Teanga: An Reiviú, the Cló Mhaigh Eo archive, Meon Eile, NÓS, Nuacht RTÉ, Scáthán, Seachtain, Studia Hibernica, TEANGA: The Journal of the Irish Association for Applied Linguistics, Tuairisc.ie, and An tUltach. It contains c. 29.7 million words.
The corpus was used for some time as an internal terminological resource by Gaois, Fiontar & Scoil na Gaeilge, and was made freely available to the public in 2016. Gaois is very grateful to the publishers and copyright holders who have given permission to use their material.
The search interface is very simple. It offers two kinds of search: a specific search (‘This phrase as is’) and a broad search. Results can be filtered by collection using the bar on the right.
We intend to expand the content and improve the functionality of the corpus over time. We greatly appreciate feedback from our users and we are particularly interested in hearing from copyright holders who have digital material in Irish that would be suitable for the scope of this corpus. We can be contacted at firstname.lastname@example.org.