Statistical Deep Parsing for Spanish: Abridged Version
Author(s) - Luis Chiruzzo
Publication year - 2022
Publication title - CLEI Electronic Journal
Language(s) - English
Resource type - Journals
ISSN - 0717-5000
DOI - 10.19153/cleiej.25.1.2
Subject(s) - parsing , computer science , natural language processing , head driven phrase structure grammar , artificial intelligence , formalism (music) , grammar , rule based machine translation , dependency (uml) , linguistics , generative grammar , art , musical , philosophy , visual arts
This document presents the development of a statistical HPSG parser for Spanish. HPSG (Head-driven Phrase Structure Grammar) is a deep linguistic formalism that combines syntactic and semantic information in a single representation and can elegantly model many linguistic phenomena. We describe the HPSG grammar we designed for Spanish and the construction of our corpus. We then present the parsing algorithms we implemented for this corpus and grammar: a bottom-up strategy, a CKY approach with a supertagger, and a top-down LSTM approach. Finally, we report experimental results comparing our parsers with one another and with external Spanish parsers, both on global metrics and on particular phenomena we wanted to test. The top-down LSTM approach obtained the best results on most metrics, among our parsers and the external parsers alike, including constituency metrics (87.57 unlabeled F1, 82.06 labeled F1), dependency metrics (91.32 UAS, 88.96 LAS), and SRL (87.68 unlabeled, 80.66 labeled), as well as most of the phenomenon-specific metrics, such as clitic reduplication, relative referent detection, and coordination chain identification.
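For readers unfamiliar with the dependency metrics quoted above, the following is a minimal sketch of how UAS (unlabeled attachment score) and LAS (labeled attachment score) are conventionally computed. It is not the authors' evaluation code: the function name `attachment_scores`, the `(head, label)` token representation, and the toy sentence are assumptions made for illustration, and details such as punctuation handling vary between evaluation setups.

```python
from typing import List, Tuple

# Hypothetical representation: one (head index, dependency label) pair per token.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over the tokens of one analysis."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # UAS counts tokens whose predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS additionally requires the dependency label to match.
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * full_hits / n


# Toy four-token example (invented data, head index 0 marks the root).
gold = [(2, "det"), (3, "nsubj"), (0, "root"), (3, "obj")]
pred = [(2, "det"), (3, "obj"), (0, "root"), (2, "obj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In this toy case three of four heads are correct and two of four head-label pairs are correct, so LAS can never exceed UAS, which matches the ordering of the figures reported in the abstract.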