Open Access
Metric to determine language complexity using dictionary Method percentage retrieval
Author(s) - Dr Devasish Pal, Dr N V Ganapathi Raju, Mr Gautam Pal
Publication year - 2019
Publication title - International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.i8223.078919
Subject(s) - computer science, unicode, ascii, natural language processing, encryption, sample (material), artificial intelligence, metric (unit), information retrieval, programming language, operating system, chemistry, operations management, chromatography, economics
Communication over computer networks was at first limited to English text encoded in ASCII. Once Unicode was introduced, computer communication became possible for text in all languages, which generated interest in the field of language processing. Various studies have examined language processing and its complexity issues, using metrics such as lexical density, morphological density and semantics to determine language complexity, but the results were not consistent: a language that appears most complex under one metric does not appear so under another. This paper introduces a new metric for the complexity of a language that is consistent and has proven results. It borrows a concept from network security: the percentage of an encrypted text that can be retrieved by the dictionary method is calculated for a given encryption algorithm, a fixed-length key, a fixed corpus size and so on. The lower the percentage retrieval, the greater the security and the greater the language complexity. The results are compared with language complexity results obtained independently, on the basis of morphological and lexical density, by research scholars of Central University, Hyderabad for various Indian languages. The pattern they observed across eight Indian languages is identical to the pattern of percentage retrieval obtained for the same languages in this work, which validates the proposed metric. Hence it can be concluded that, for the sample text data considered, a lower percentage retrieval means higher security and a proportionately higher complexity of the language concerned. The sample data was encrypted using the substitution method.
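
The idea of percentage retrieval can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the cipher details, the key, or how the dictionary method matches words, so everything below is an assumption made for illustration. The sketch assumes a monoalphabetic substitution over lowercase ASCII letters (the paper works with Unicode text in Indian languages), and it assumes that a word counts as retrieved when its letter-repetition pattern matches exactly one dictionary entry and that entry is the original word. The names `make_key`, `encrypt`, `pattern` and `percentage_retrieval` are hypothetical.

```python
import random
import string


def make_key(seed: int) -> dict[str, str]:
    """Build a monoalphabetic substitution key from a seed (hypothetical helper)."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encrypt(text: str, key: dict[str, str]) -> str:
    """Substitute each letter using the key; other characters pass through."""
    return "".join(key.get(ch, ch) for ch in text)


def pattern(word: str) -> tuple[int, ...]:
    """Letter-repetition pattern, which a substitution cipher preserves."""
    seen: dict[str, int] = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)


def percentage_retrieval(plain_words, key, dictionary) -> float:
    """Percentage of words recovered by pattern lookup in the dictionary.

    Assumption: a word is retrieved only if its ciphertext pattern matches
    exactly one dictionary entry and that entry equals the plaintext word.
    """
    if not plain_words:
        raise ValueError("plain_words must not be empty")
    by_pattern: dict[tuple[int, ...], list[str]] = {}
    for entry in sorted(set(dictionary)):
        by_pattern.setdefault(pattern(entry), []).append(entry)
    recovered = 0
    for word in plain_words:
        candidates = by_pattern.get(pattern(encrypt(word, key)), [])
        if candidates == [word]:
            recovered += 1
    return 100.0 * recovered / len(plain_words)


if __name__ == "__main__":
    words = ["hello", "world", "level", "apple"]
    lexicon = words + ["jolly"]  # "jolly" shares its pattern with "hello"
    print(percentage_retrieval(words, make_key(7), lexicon))
```

Under these assumptions, a language whose words share patterns with many other dictionary entries yields a lower percentage retrieval, which the abstract interprets as higher security and higher complexity. Comparing languages this way would require holding the key length and corpus size fixed, as the abstract states.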
