Eric Villemonte de la Clergerie
Researcher, member of new Team Alpage
Former Scientific leader of Project-team Atoll
Phone: +33 1 39 63 54 10 (from January 2016: +33 1 80 49 42 68)
Fax: +33 1 39 63 53 30
Warning: As you may see, this personal
page is largely outdated (more or less since 2007)! I promise that
someday I will find the time and the energy to update it, but in
the meantime please check
the FRMG Wiki.
My main area of research deals with the use of tabulation techniques in
Computational Linguistics and Logic Programming, relying on the
notions of Push-Down Automata and
Dynamic Programming. I wish to
show that many linguistic formalisms may benefit from well-defined and
well-understood tabulation mechanisms, for instance in the case of Tree Adjoining Grammars (TAG) and, more
generally, of Mildly Context-Sensitive (MCS) formalisms using
Thread Automata. This work is completed by the development of
Since 2003, I have become increasingly interested in the notion of
Meta-Grammar, which provides an abstract layer of syntactic
description based on hierarchies of classes grouping elementary
constraints and providing/requiring resources. These classes may be
``compiled'' to get a grammar, in some target formalism, such as TAGs
or LFGs. This interest has led to the development of an MG compiler,
MGCOMP, and of FRMG, a French MG that
uses factorization operators provided by DyALog to produce a very
compact TAG. More information
about FRMG may be found
on the FRMG Wiki.
I am also interested in developing infrastructures to help natural
language processing (NLP). One way is to use nice representation
formats based on XML to encode linguistic resources (grammars,
meta-grammars, shared derivation forests, shared dependency forests,
morpho-syntactic annotations). In connection with this line of work, I
became involved in an international effort driven by ISO TC37 SC4 to
standardize linguistic resources.
I am also involved in information extraction and knowledge
acquisition applications, starting from parsing output. These
topics include the exploration of error mining and correction
techniques and the processing of botanical corpora.
- Development of the DyALog system
- Development of MGCOMP, a Meta-Grammar compiler
and of FRMG, a French Meta-Grammar used to produce a
factorized TAG/TIG grammar that has been tested during the EASy
parsing evaluation campaign
- Development of several scripts in the context of ATOLL's NLP
- Coordination of ATOLL's development of various NLP tools.
- Research projects
- Leader of ANR MDCA Passage
(2007--2009) about parsing very large corpora, using the results to
run evaluations and lexical knowledge acquisition tasks.
- INRIA ARC
MOSAIQUE (2006 -- 2007) about the design and use of high level
syntactic formalisms, such as Meta-Grammars.
- French Action LexSynt (2005 -- 2006) on the design and constitution of a
French syntactic lexicon
- French Technolangue action EASy (2003 --
2005) on a French parsing evaluation campaign, with the
participation of two parsers developed by ATOLL, one of them built
using the DyALog system
- French Technolangue Project Normalangue (2003 -- 2005) on
the standardization of linguistic resources, in coordination with
ISO sub-committee TC37 SC4 on ``Language Resources Management''. In
particular, I am project leader for the proposed ``Morpho-syntactic Annotation Framework'' (MAF) and
an expert for the proposed ``Feature Structure Representation'' (FSR).
- ACI Biotim: text and linguistic processing of a botanical corpus.
- INRIA Action GENI (2002 -- 2003) on Generation and Inference
- Leader of ARC RLT (2001 --
2002) about the acquisition of linguistic resources for TAGs.
- TermIT (closed in
1999): a European-funded feasibility study on the notion of a
multilingual thesaurus in the cultural domain.
- Member of the editorial board of the French journal T.A.L.
- Guest editor in 2004 of a special issue of the journal T.A.L. on
``Evolutions in Parsing''
- Organizing Committees
- Program Committees
- Reviewer for ACL 2007, Prague, Czech Republic, June 23-30 2007.
- Reviewer for EMNLP-CoNLL-2007, Prague, Czech Republic, June 28-30, 2007.
- IWPT'07, Prague, Czech Republic
- ICLP'07, Porto, Portugal, September 8-13, 2007
- Lexis and Grammar Conference 2007, Bonifacio,
Corsica, October 2-6, 2007
- CSLP'07, Roskilde University, Denmark, August 20-24, 2007
- TALN'07, Toulouse, June 5-8, 2007
- TALN'07 workshop on "High Level Syntactic Formalisms".
(New York, USA, June 2006)
(Leuven, Belgium, April 2006)
(Sydney, Australia, July 2006)
(Sydney, Australia, July 2006)
- Journée ATALA on Interface lexique-grammaire et lexiques syntaxiques
et sémantiques (Paris, March 2005).
- EPIA'05 Workshop on Text Mining and Applications (TEMA),
Covilhã (Portugal, December 2005)
- IWPT'2005 (Vancouver, October 2005).
- LREC'04 workshop on a registry of Linguistic Data Categories within an
Integrated Language Resource Repository Area (Lisbon, May
- IWPT'2003 (Nancy).
- IWPT'2001 (Beijing).
- TAG+5 (Paris)
- TAPD'2000 (Vigo, Spain) and TAPD'98 (Paris)
- Invited Speaker at SEPLN'2000 (Vigo,
- Working groups
- TAGML on Tree Adjoining Grammars [TAG] and their representation using XML.
- A3CTE about applications mixing knowledge
acquisition and Natural Language Processing