Laboratoire Lattice - UMR 8094
1 rue Maurice Arnoux, 92120 Montrouge
Director of Research
CNRS, Deputy Director of the laboratory
I am a CNRS Director of Research and the deputy head of the LATTICE laboratory (Langues, Textes, Traitements informatiques et Cognition). I am also an Affiliated Lecturer at the Department of Theoretical and Applied Linguistics (DTAL) of the University of Cambridge and a Rutherford Visiting Fellow at the Turing Institute (London / Cambridge) for the period 2018–2019. I was the head of LATTICE from 2012 to 2018.
From 2003 to 2009, I worked as a CNRS Research Fellow at Laboratoire d’Informatique de Paris-Nord. In 2002-2003, I was an associate professor at the Centre de Recherche en Ingénierie Multilingue (CRIM) within the Institut National des Langues et Civilisations Orientales (INaLCO) and before that a research engineer at Thales Recherche et Technologie (1998-2002).
I mainly work on Natural Language Processing (NLP), especially on the following topics: Information Extraction, Question Answering, Semantic Zoning, Knowledge Acquisition from text and Named Entity tagging. Apart from NLP, my main interests include Language Acquisition, Cognitive Science, Epistemology and the History of Linguistics.
More recently I have been active in two other domains of research.
Digital Humanities is a growing field at the intersection of computational methods and the Humanities. I have recently developed a wide range of activities around this theme at LATTICE and we now have a number of running projects and experiments with various academic partners (esp. the Institut des Systèmes Complexes in Paris, the Médialab at Sciences Po, the Centre for Digital Humanities at UCL in London, etc.). I am also involved in the new Master in Digital Humanities at PSL. See here for more information.
Last but not least, I am especially interested in Finnic (i.e. Finnish and closely related languages) and more generally Uralic languages. We have recently developed multilingual parsing models that have been applied successfully to under-resourced languages like Finnish, Saami and Komi (joint work with KyungTae Lim and Niko Partanen). See here for more information.
A recent CV is available here.
The dream of a universal translation device goes back many decades, long before Douglas Adams’s fictional Babel fish provided this service in The Hitchhiker’s Guide to the Galaxy. Since the advent of computers, research has focused on the design of digital machine translation tools—computer programs capable of automatically translating a text from a source language to a target language. This has become one of the most fundamental tasks of artificial intelligence. This volume in the MIT Press Essential Knowledge series offers a concise, nontechnical overview of the development of machine translation, including the different approaches, evaluation issues, and market potential. The main approaches are presented from a largely historical perspective and in an intuitive manner, allowing the reader to understand the main principles without knowing the mathematical details.
The book begins by discussing problems that must be solved during the development of a machine translation system and offering a brief overview of the evolution of the field. It then takes up the history of machine translation in more detail, describing its pre-digital beginnings, rule-based approaches, the 1966 ALPAC (Automatic Language Processing Advisory Committee) report and its consequences, the advent of parallel corpora, the example-based paradigm, the statistical paradigm, the segment-based approach, the introduction of more linguistic knowledge into the systems, and the latest approaches based on deep learning. Finally, it considers evaluation challenges and the commercial status of the field, including activities by such major players as Google and Systran.
Teaching and lecturing
I lecture regularly at various institutions:
- Natural Language Processing for Digital Humanities, Master in Digital Humanities, at Paris Sciences et Lettres (PSL)
- Information extraction, at INALCO
- Computational and corpus linguistics at the University of Cambridge
From 2012 to 2014, I was part of a joint team with ITEM exploring linguistic methods for the analysis of the genesis of literary works.
Since 2015, I have been part of a new graduate-level programme in Digital Humanities held jointly at the Ecole normale supérieure and at Paris Sciences et Lettres.
I am the PI of two funded projects:
- a European project called ATLANTIS (Artificial Language understanding In Robots). ATLANTIS attempts to understand and model the very first stages in grounded language learning, and will propose models and implementations for a robotic environment
- a project called LAKME (Linguistically Annotated Corpora Using Machine Learning Techniques) funded by PSL. LAKME will explore new techniques for the annotation of textual corpora of morphologically rich languages (Medieval French, Rabbinic Hebrew and diverse Finno-Ugric languages)
I also participate in:
- a project called DEMOCRAT funded by ANR (Agence Nationale de la Recherche). The project will explore techniques for describing and modelling reference chains, including diachronic and comparative language studies made possible by automatic annotation techniques. The PI of this project is Frédéric Landragin
I am also the PI of a small-scale research collaboration with the Centre for Digital Humanities at University College London (UCL). I collaborate with the Language Technology Lab (LTL) of the University of Cambridge on these topics (I coordinate Digital Humanities at LTL).
Past PhD students