A Lexical Database of Portuguese Multiword Expressions
Author(s) -
Sandra Antunes,
María Fernanda Bacelar do Nascimento,
João Miguel Casteleiro,
Amália Mendes,
Luísa Pereira,
Tiago da Mota Veiga Moreira de Sá
Publication year - 2006
Publication title -
Lecture Notes in Computer Science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.249
H-Index - 400
eISSN - 1611-3349
pISSN - 0302-9743
ISBN - 3-540-34045-9
DOI - 10.1007/11751984_30
Subject(s) - computer science , lexical database , natural language processing , portuguese , artificial intelligence , word (group theory) , lexical item , brazilian portuguese , lexicon , database , information retrieval , linguistics , wordnet , philosophy
This presentation focuses on an ongoing project that aims to create a large lexical database of Portuguese multiword (MW) units. The units are automatically extracted from a balanced 50-million-word corpus, statistically interpreted with lexical association measures, and validated by hand. The database covers different types of MW units, such as named entities, and lexical associations ranging from sets of favoured co-occurring forms to strongly lexicalized expressions. This new resource has a twofold objective: to serve as a research tool that supports the development of typologies of MW units, and to help in developing and evaluating language processing tools capable of dealing with MW expressions.
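The abstract does not say which lexical association measures the project uses. As a rough illustration of the general idea, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one commonly used association measure. The function name `bigram_pmi`, the `min_count` threshold, and the toy sentence are all assumptions made for this example and are not taken from the project.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Illustrative only: the source does not state that PMI is the measure
    used, and min_count is an arbitrary frequency cut-off.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first: these are the MW unit candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy input; a real run would use a tokenized corpus.
toy = "o primeiro ministro falou e o primeiro ministro saiu e o jogo acabou".split()
candidates = bigram_pmi(toy)
for pair, score in candidates:
    print(" ".join(pair), round(score, 2))
```

In a pipeline like the one described, a ranked list of this kind would only supply candidates; the abstract states that the extracted units are also validated by hand.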