Zipf’s and Heaps’ laws for the natural and some related random texts
Author(s) -
O. S. Kushnir,
Volodymyr Buryi,
Serhiy Grydzhan,
Lubomyr Ivanitskyi,
Serhiy Rykhlyuk
Publication year - 2018
Publication title -
Electronics and Information Technologies
Language(s) - English
Resource type - Journals
eISSN - 2224-0888
pISSN - 2224-087X
DOI - 10.30970/eli.9.94
Subject(s) - zipf's law , natural (archaeology) , law , mathematics , geography , statistics , political science , archaeology
We have generated randomized Chomsky’s texts and Miller’s monkey random texts (RTs) based on a source natural text (NT), and clarified their rank–frequency dependences, Pareto distributions, word-frequency probability distributions, and vocabularies as functions of text length. Here the Chomsky’s RT is an NT randomized so that its ‘words’ are arbitrary sequences of letters and blanks located between the nearest occurrences of some preset letter (e.g., the letter i). We have compared the exponents appearing in the different power laws that describe the word statistics of the NTs and RTs, and have analyzed how well the theoretical relationships among those exponents are fulfilled in practice. We have shown empirically that, for the Chomsky’s RTs, the exponent α of the Zipf’s law and the exponent β of the word probability distribution satisfy the inequalities α < 1 and β > 1, while the Heaps’ exponent is η ≈ 1. We have also compared our results with those obtained for the monkey texts and have shown that the vocabulary of the Chomsky’s texts is richer than that of the monkey texts. The Heaps’ law holds to an extraordinarily good approximation for the Chomsky’s RTs, as it does for the RTs generated by the intermittent silence process, and unlike sufficiently long NTs, which reveal slightly convex vocabulary versus text length dependences when plotted on the double logarithmic scale.
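The constructions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: a Chomsky's RT is obtained simply by cutting the lowercased source text at every occurrence of the preset letter, with empty pieces dropped; a monkey text is produced by drawing letters uniformly and inserting a blank with a fixed probability; and the exponents are estimated by an ordinary least-squares fit on the double logarithmic scale. All function names and parameters (`chomsky_words`, `monkey_words`, `p_space`, `loglog_slope`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter


def chomsky_words(text, delimiter="i"):
    """Cut the text at every occurrence of `delimiter`; the pieces in between
    (letters and blanks alike) are treated as 'words'. Empty pieces are
    dropped, which is an assumption of this sketch."""
    return [piece for piece in text.lower().split(delimiter) if piece]


def monkey_words(n_chars, alphabet="abcdefghijklmnopqrstuvwxyz",
                 p_space=0.2, seed=0):
    """Miller-style monkey text: each keystroke is a blank with probability
    `p_space`, otherwise a uniformly chosen letter (assumed parameters)."""
    rng = random.Random(seed)
    chars = [" " if rng.random() < p_space else rng.choice(alphabet)
             for _ in range(n_chars)]
    return "".join(chars).split()


def rank_frequency(words):
    """Word frequencies sorted in descending order (rank 1 comes first)."""
    return sorted(Counter(words).values(), reverse=True)


def vocabulary_growth(words):
    """Number of distinct words seen after each token (Heaps' curve)."""
    seen, growth = set(), []
    for word in words:
        seen.add(word)
        growth.append(len(seen))
    return growth


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


if __name__ == "__main__":
    words = monkey_words(200_000)
    freqs = rank_frequency(words)
    growth = vocabulary_growth(words)
    # Zipf's exponent: minus the slope of frequency versus rank.
    alpha = -loglog_slope(range(1, len(freqs) + 1), freqs)
    # Heaps' exponent: slope of vocabulary size versus text length.
    eta = loglog_slope(range(1, len(growth) + 1), growth)
    print(f"alpha ~ {alpha:.2f}, eta ~ {eta:.2f}")
```

Passing the output of `chomsky_words` for a real source text to the same `rank_frequency` and `vocabulary_growth` helpers gives the corresponding curves for a Chomsky's RT. A plain least-squares fit over all ranks is only a rough estimator; it is used here for brevity and is not claimed to be the fitting procedure of the paper.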