
Development of Saraiki WordNet by Mapping of Word Senses: A Corpus-based Approach
Author(s) - Sarah Gul, Musarrat Azher, Sawaz
Publication year - 2021
Publication title - Linguistics and Literature Review
Language(s) - English
Resource type - Journals
eISSN - 2409-109X
pISSN - 2221-6510
DOI - 10.32350/llr.72/04
Subject(s) - WordNet, Urdu, computer science, newspaper, natural language processing, linguistics, artificial intelligence, philosophy, sociology, media studies
This paper aims to develop a Saraiki WordNet. Saraiki is one of the regional languages spoken in Pakistan and has a distinct history of its own. It closely resembles two other languages, namely Punjabi and Sindhi, and it has several dialects, each representative of the region where it is spoken. The Saraiki WordNet is built on the Urdu WordNet (Zafar, Mahmood, Shams & Hussain, 2014), which was created by UET Lahore and is itself based on the Princeton WordNet (Miller, 1990). Data were extracted from dictionaries (lughats), from literary sources such as poetry and fiction, and from non-literary sources such as Saraiki newspapers. Urdu word senses were then mapped onto Saraiki word senses, following the expansion approach. This study may aid in creating bilingual dictionaries in the future.
Keywords: expansion approach, mapping, Saraiki language, WordNet
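
The expansion approach named in the abstract keeps the synset inventory and relations of an existing WordNet and replaces only the words with their target-language equivalents. The following Python sketch illustrates that idea under stated assumptions; it is not the authors' implementation. The `Synset` class, the `expand_wordnet` function, the synset IDs, and the placeholder lemmas (`ur_*` for Urdu, `sk_*` for Saraiki) are all hypothetical, and the bilingual lexicon stands in for the dictionary and corpus evidence the study drew on.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Synset:
    """One concept: an ID, its member words, and links to broader concepts."""
    synset_id: str
    lemmas: list[str]
    hypernyms: list[str] = field(default_factory=list)


def expand_wordnet(source_synsets, lexicon):
    """Build target-language synsets by translating source-language lemmas.

    Synset IDs and hypernym links are copied unchanged (the core of the
    expansion approach). Source lemmas without a lexicon entry are
    collected so that a lexicographer can review them by hand.
    """
    target = {}
    unmapped = []
    for syn in source_synsets:
        lemmas = []
        for lemma in syn.lemmas:
            translations = lexicon.get(lemma)
            if not translations:
                unmapped.append((syn.synset_id, lemma))
                continue
            for word in translations:
                if word not in lemmas:  # avoid duplicate members
                    lemmas.append(word)
        if lemmas:
            target[syn.synset_id] = Synset(
                syn.synset_id, lemmas, list(syn.hypernyms)
            )
    return target, unmapped


# Hypothetical data: placeholder strings, not real Urdu or Saraiki words.
urdu_synsets = [
    Synset("n001", ["ur_a", "ur_b"]),
    Synset("n002", ["ur_c"], hypernyms=["n001"]),
]
urdu_to_saraiki = {"ur_a": ["sk_a"], "ur_c": ["sk_c1", "sk_c2"]}

saraiki_synsets, needs_review = expand_wordnet(urdu_synsets, urdu_to_saraiki)
```

In this sketch the Saraiki synset `n002` keeps its hypernym link to `n001`, and the Urdu lemma `ur_b`, which has no lexicon entry, is returned in `needs_review` instead of being silently dropped.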