by Thierry POIBEAU - published on

Two CNRS laboratories specialized in linguistics (MODYCO and LATTICE) organized a summer school on textual data annotation in September 2011. This website gives access to most of the material.

Linguistics and other social sciences increasingly rely on corpora (or at least on relevant written or oral documents) for many kinds of research. Despite this diversity, common needs exist, especially when it comes to enriching data with annotation (be it syntactic, semantic or pragmatic). These practices have largely been acquired "on the job". This is why it seems important to have a forum in which to share good practices, gather past experience, bring a community together and benefit from each other's feedback.

In the past, thematic summer schools (or similar events) have been organized on the notion of corpus (e.g. Rastier and F. Ballabriga, 2006) or on the statistical analysis of corpora (cf. the summer school on "methods and computer statistical analysis of texts", held in Besançon in 2009, in Nice in 2010 and again in Besançon in 2011).

This thematic summer school will focus more specifically on the annotation of linguistic corpus data. It thus complements the events mentioned above: annotation presupposes the existence of a corpus, and an annotated corpus can then be analysed by statistical methods. What distinguishes this summer school is its specific focus on annotation, which requires its own set of skills and practices.