Open Access
Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations
Author(s) - László Drienkó
Publication year - 2020
Publication title - lingbaw
Language(s) - English
Resource type - Journals
ISSN - 2450-5188
DOI - 10.31743/lingbaw.11831
Subject(s) - text segmentation, computer science, natural language processing, word (group theory), agreement, utterance, linguistics, segmentation, artificial intelligence, sequence (biology), speech recognition, philosophy, biology, genetics
The present study reports results from a series of computer experiments seeking to combine word-based Largest Chunk (LCh) segmentation and Agreement Groups (AG) sequence processing. The AG model is based on groups of similar utterances that enable combinatorial mapping of novel utterances. LCh segmentation is concerned with cognitive text segmentation, i.e. with detecting word boundaries in a sequence of linguistic symbols. Our observations are based on the text of Le petit prince (The Little Prince) by Antoine de Saint-Exupéry in three languages: French, English, and Hungarian. The data suggest that word-based LCh segmentation is not very efficient at detecting utterance boundaries; however, it can provide useful word combinations for AG processing. Typological differences between the languages are also reflected in the results.
