
Label | Field | Value | Language
Depositor | dc.contributor | Lamb, William |
Funder | dc.contributor.other | Carnegie Trust for the Universities of Scotland | en_UK
Funder | dc.contributor.other | Bòrd na Gàidhlig | en_UK
Spatial Coverage | dc.coverage.spatial | Outer Hebrides | en_UK
Spatial Coverage | dc.coverage.spatial | Highlands and Islands | en_UK
Spatial Coverage | dc.coverage.spatial | UK | en
Spatial Coverage | dc.coverage.spatial | UNITED KINGDOM | en
Time Period | dc.coverage.temporal | start=1997; end=2016; scheme=W3C-DTF | en
Data Creator | dc.creator | Lamb, William |
Data Creator | dc.creator | Arbuthnot, Sharon |
Data Creator | dc.creator | Naismith, Susanna |
Data Creator | dc.creator | Danso, Samuel |
Date Accessioned | dc.date.accessioned | 2016-05-25T15:49:34Z |
Date Available | dc.date.available | 2016-05-25T15:49:34Z |
Citation | dc.identifier.citation | Lamb, William; Arbuthnot, Sharon; Naismith, Susanna; Danso, Samuel. (2016). Annotated Reference Corpus of Scottish Gaelic (ARCOSG), 1997-2016 [dataset]. University of Edinburgh. School of Literatures, Languages and Cultures. Celtic and Scottish Studies. https://doi.org/10.7488/ds/1411. | en
Persistent Identifier | dc.identifier.uri | http://hdl.handle.net/10283/2011 |
Persistent Identifier | dc.identifier.uri | https://doi.org/10.7488/ds/1411 |
Dataset Description (abstract) | dc.description.abstract | A representative, tagged corpus of Scottish Gaelic, divided into 8 registers (4 spoken, 4 written) of approximately 10k words each. The corpus is presented as individual txt files. The corpus was hand-tagged by Lamb, Arbuthnot and Naismith and separately verified by them. It uses the Brown format tag separators ('/': e.g. 'agus/Cc') and an annotation scheme derived from the Irish PAROLE tagset (see Uí Dhonnchadha, E. and van Genabith, J. 2006. A Part-of-Speech tagger for Irish using finite state morphology and constraint grammar disambiguation. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), 2241-2244.). The annotation scheme is described in a PDF included with the data: Lamb, W. and Naismith, S (2014) Scottish Gaelic Part-of-Speech Annotation Guidelines. This work was funded by Bòrd na Gàidhlig and Carnegie Trust for the Universities of Scotland. | en_UK
Dataset Description (TOC) | dc.description.tableofcontents | The file 'categories.prn' describes the genre of each text file. | en_UK
Language | dc.language.iso | gla | en_UK
Publisher | dc.publisher | University of Edinburgh. School of Literatures, Languages and Cultures. Celtic and Scottish Studies | en_UK
Relation (Is Referenced By) | dc.relation.isreferencedby | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.672.813&rep=rep1&type=pdf#page=11 | en_UK
Rights | dc.rights | Creative Commons Attribution 4.0 International Public License | en
Subject | dc.subject | corpus linguistics | en_UK
Subject | dc.subject | Scottish Gaelic | en_UK
Subject | dc.subject | natural language processing | en_UK
Subject | dc.subject | NLP | en_UK
Subject | dc.subject | part-of-speech tagger | en_UK
Subject Classification | dc.subject.classification | European Languages Literature and related subjects | en_UK
Title | dc.title | Annotated Reference Corpus of Scottish Gaelic (ARCOSG) | en_UK
Type | dc.type | dataset | en_UK
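
The abstract states that the corpus files use Brown-format tag separators ('/', e.g. 'agus/Cc'). As a rough illustration of reading such text, the sketch below splits whitespace-separated tokens at their last '/' into (word, tag) pairs. It assumes one token per whitespace-delimited chunk, which the record does not confirm. Apart from 'agus/Cc', the sample tokens and tags (X1, X2) are placeholders, not part of the ARCOSG tagset, and the function name parse_brown is hypothetical.

```python
from collections import Counter


def parse_brown(text):
    """Split whitespace-separated 'word/TAG' tokens into (word, tag) pairs.

    The split is made at the LAST '/', so a word that itself contains a
    slash keeps it. A token with no '/' is returned with tag None.
    """
    pairs = []
    for token in text.split():
        word, sep, tag = token.rpartition("/")
        if not sep:
            # No separator found: treat the whole token as an untagged word.
            word, tag = token, None
        pairs.append((word, tag))
    return pairs


# 'agus/Cc' comes from the abstract; the other tokens are placeholders.
sample = "agus/Cc foo/X1 1/2/X2 bar"
pairs = parse_brown(sample)

# Count how often each tag occurs, ignoring untagged tokens.
tag_counts = Counter(tag for _, tag in pairs if tag is not None)
```

Splitting at the last separator rather than the first is a design choice for this sketch; the annotation guidelines PDF shipped with the data is the authority on how tokens are actually delimited.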

Download All
zip file MD5 Checksum: e2944be1f6686186e28502b441621016
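
One way to check a downloaded archive against the MD5 checksum listed above is sketched here with Python's standard hashlib module. The local file name in the usage note is an assumption, since the record does not name the zip file, and the helper names are hypothetical.

```python
import hashlib

# Checksum as published on the record page.
EXPECTED_MD5 = "e2944be1f6686186e28502b441621016"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_published_checksum("arcosg.zip") would return True only if a file saved under that (assumed) name is byte-identical to the archive the checksum was computed from.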

