Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations

Open Access

Author(s) - László Drienkó

Publication year - 2020

Publication title - Linguistics Beyond and Within (LingBaW)

Language(s) - English

Resource type - Journals

ISSN - 2450-5188

DOI - 10.31743/lingbaw.11831

Subject(s) - text segmentation, computer science, natural language processing, word (group theory), agreement, utterance, linguistics, segmentation, artificial intelligence, sequence (biology), speech recognition, philosophy, biology, genetics

This study reports results from a series of computer experiments that combine word-based Largest Chunk (LCh) segmentation with Agreement Groups (AG) sequence processing. The AG model is based on groups of similar utterances that enable the combinatorial mapping of novel utterances. LCh segmentation is concerned with cognitive text segmentation, i.e. with detecting word boundaries in a sequence of linguistic symbols. Our observations are based on the text of Le Petit Prince (The Little Prince) by Antoine de Saint-Exupéry in three languages: French, English, and Hungarian. The data suggest that word-based LCh segmentation is not very efficient at detecting utterance boundaries; however, it can provide useful word combinations for AG processing. Typological differences between the languages are also reflected in the results.
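The abstract does not spell out how agreement groups are built, so the following is only a minimal sketch under stated assumptions, not the author's implementation. It assumes that a group consists of a base utterance plus every utterance of the same length that differs from it in exactly one word, and that a novel utterance can be "mapped" onto a group when each of its words occurs at the same position in some group member. The function names (`differs_in_one_word`, `build_groups`, `can_map`) and the toy utterances are hypothetical.

```python
def differs_in_one_word(a, b):
    """True if two tokenized utterances have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_groups(utterances):
    """Assumed grouping rule: each utterance is a base; its group holds the base
    plus every utterance that differs from it in exactly one word."""
    tokenized = [tuple(u.split()) for u in utterances]
    groups = {}
    for base in tokenized:
        members = [base] + [u for u in tokenized if differs_in_one_word(base, u)]
        if len(members) > 1:  # keep only bases that have at least one neighbour
            groups[base] = members
    return groups


def can_map(novel, group):
    """Assumed mapping rule: every word of the novel utterance must occur
    at the same position in at least one member of the group."""
    words = tuple(novel.split())
    if not words:
        return False
    same_length = [m for m in group if len(m) == len(words)]
    return all(any(m[i] == w for m in same_length) for i, w in enumerate(words))


# Toy data (hypothetical, not taken from the study's corpus).
utterances = ["the dog sleeps", "the cat sleeps", "the dog runs", "a bird sings"]
groups = build_groups(utterances)
base = ("the", "dog", "sleeps")

print(can_map("the cat runs", groups[base]))   # novel combination of attested words
print(can_map("the bird runs", groups[base]))  # "bird" never occurs in position 2 of this group
```

In this toy setting, "the cat runs" never occurs in the input, yet it can be assembled position by position from members of the group around "the dog sleeps". This is one way to read the combinatorial mapping of novel utterances mentioned in the abstract.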
