Heaps’ Law and Heaps functions in tagged texts: evidences of their linguistic relevance
Author(s) - Andrés Chacoma, Damián H. Zanette
Publication year - 2020
Publication title - Royal Society Open Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.84
H-Index - 51
ISSN - 2054-5703
DOI - 10.1098/rsos.200008
Subject(s) - noun, vocabulary, relevance (law), linguistics, feature (linguistics), computer science, point (geometry), relation (database), natural language processing, mathematics, artificial intelligence, law, philosophy, political science, geometry, database
We study the relationship between vocabulary size and text length in a corpus of 75 literary works in English, authored by six writers, distinguishing between the contributions of three grammatical classes (or ‘tags’, namely nouns, verbs and others), and analyse the progressive appearance of new words of each tag along each individual text. We find that, as prescribed by Heaps’ Law, vocabulary sizes and text lengths follow a well-defined power-law relation. Meanwhile, the appearance of new words in each text does not obey a power law, and is on the whole well described by the average of random shufflings of the text. Deviations from this average, however, are statistically significant and show systematic trends across the corpus. Specifically, we find that the appearance of new words along each text is predominantly retarded with respect to the average of random shufflings. Moreover, different tags add systematically distinct contributions to this tendency, with verbs and others being respectively more and less retarded than the mean trend, and nouns following instead the overall mean. These statistical systematicities are likely to point to the existence of linguistically relevant information stored in the different variants of Heaps’ Law, a feature that is still in need of extensive assessment.
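
The comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`heaps_function`, `shuffled_average`), the whitespace tokenization and the toy sentence are all assumptions made for illustration. The first function counts the distinct words seen after each token (the vocabulary growth curve along a text), and the second averages that curve over random shufflings of the same tokens.

```python
import random


def heaps_function(tokens):
    """Return V(n) for n = 1..N: distinct words among the first n tokens."""
    seen = set()
    curve = []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve


def shuffled_average(tokens, n_shuffles=100, seed=0):
    """Average the vocabulary growth curve over random shufflings of the text."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    work = list(tokens)
    totals = [0.0] * len(work)
    for _ in range(n_shuffles):
        rng.shuffle(work)
        for i, v in enumerate(heaps_function(work)):
            totals[i] += v
    return [t / n_shuffles for t in totals]


# Toy text (an assumption; the paper uses full literary works).
tokens = "the cat saw the dog and the dog saw the cat".split()
actual = heaps_function(tokens)
average = shuffled_average(tokens, n_shuffles=50)

# Negative values mean new words appear later than in the shuffled average.
deviation = [a - m for a, m in zip(actual, average)]
```

In this sketch, a negative entry in `deviation` corresponds to what the abstract calls a retarded appearance of new words. Restricting the count to words of a single tag would require part-of-speech tagged input, which is left out here.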
