Information regarding data sharing

The Gaois research group began collecting texts for the Corpus of Contemporary Irish in 2015. In order to help compile the corpus and carry out research on contemporary Irish, several publishers shared Irish texts with the research team in Fiontar & Scoil na Gaeilge, DCU.

It was agreed at that time that Gaois would have permission to give the public access to the content of these texts, but that this access would be limited to 100 words of text. The corpus is designed to comply with this condition.

The authors or publishers hold the copyright to each text in the Corpus of Contemporary Irish. Gaois does not have permission to share any text in its entirety with any other person or party.

Several data sets created using the corpus (e.g. frequency lists and language models) will be made available in the future. Gaois carried out considerable work on these new products, which differ from the original texts. Because they are new products created by Gaois, Gaois has permission to share them with the public.

Each data set Gaois shares with the public will be licensed under CC BY 4.0 (Creative Commons Attribution 4.0 International). More information regarding the conditions of this licence can be found on the Creative Commons website.