Benoît Sagot

Inria Researcher
in Computational Linguistics and Natural Language Processing
Head of the ALMAnaCH research team (Inria/ÉPHÉ)


Current areas of interest and research domains
  • “Classical” and computational etymology (Indo-European…)
  • Computational historical linguistics (Indo-European, French…)
  • Development of lexical resources (morphological, syntactic, semantic, etymological), for French and other languages
  • Morphosyntactic analysis (part-of-speech tagging) and parsing (statistical, neural and hybrid approaches)
  • Computational and quantitative morphology
  • Raw corpus processing, especially noisy user-generated corpora found on the web
  • Formal grammars
  • Applications of NLP (opinion mining, chatbots…)

Tools and Resources

Lefff

Morphological and syntactic lexicon for French

Alexina

Morphological (and sometimes syntactic) lexicons other than Lefff

UDLexicons

Morphological lexicons in the CoNLL-UL format

Etymology

EtymDB

Etymological database extracted from Wiktionary

WOLF

Free Wordnet for French

MElt

Part-of-speech tagger

SxPipe

Shallow language processing chain

Publications

Only the most recent publications are listed below.

Projects

Ongoing projects

  • ANR ParSiTi (2016-2021): PI: D. Seddah. Other participants: LIMSI, LIPN. Topic: parsing and machine translation for user-generated content using contextual information.
  • ANR SoSweet (2015-2019): PI: J.-P. Magué. Resp. for ALMAnaCH: D. Seddah. Other participants: ICAR (ENS Lyon, CNRS), Dante (Inria). Topic: studying sociolinguistic variation on Twitter by comparing linguistic/NLP and graph-based approaches.
  • ANR Profiterole (2016-2020): PI: Sophie Prévost (LATTICE). Topic: modelling and analysis of Medieval French and its evolution.

Past projects

  • ANR EDyLex (project PI) — Dynamic extension of lexical resources. Other participants: LIF (Marseilles), LIMSI, AFP, Vecsys Research, Syllabs
  • ANR Séquoïa (in charge for Alpage). PI: A. Nasr. Topic: probabilistic parsers for French. Main participant besides Alpage: LIF (Marseilles)
  • ANR PerGram. PI: Pollet Samvelian. Topic: linguistic description and HPSG implementation of Persian syntax.
  • SCRIBO (“pôle de compétitivité” System@tic). Topic: Semi-automatic and Collaborative Retrieval of Information Based on Ontologies
  • ANR Passage. PI: É. de La Clergerie. Topic: automatic construction of a very large syntactically annotated corpus by merging the annotations produced by several parsers; linguistic information extraction from this corpus. Other participants: LIMSI, CEA, ELDA/ELRA
  • ANR Rhapsodie
  • ARC INRIA Mosaïque. Topic: high-level syntactic formalisms.
  • ILF LexSynt project. PI: S. Kahane. Topic: syntactic lexicons.
  • EASy Technolangue project. Topic: evaluation of parsers for French.

Curriculum Vitæ

A reasonably recent version of my CV, from which personal information has been removed, can be downloaded here.

Contact

Postal address
Inria Paris (équipe ALMAnaCH)
2 rue Simone Iff
CS 42112
75589 Paris Cedex 12
FRANCE

  +33 1 80 49 43 14
  benoit.sagot.at.inria.fr