Open Access

Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information

Author(s) - M. Khadiga, Ali Farghaly, Aly Aly

Publication year - 2015

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/ijca2015905709

Subject(s) - anaphora (linguistics), computer science, arabic, natural language processing, resolution (logic), artificial intelligence, linguistics, information retrieval, philosophy

This paper reports on the compilation of a large Arabic corpus of the Holy Qur'an script, annotated with anaphoric relations and other anaphoric information. The corpus provides a multi-dimensional feature vector containing most of the basic anaphoric information needed by statistical anaphora resolution systems. About 24,653 personal pronouns are tagged with their antecedents and with other anaphoric information: the distance between the anaphor and its antecedent measured in verses, words, and segments; gender; number; person; and further attributes that can be used to build the feature vector of a statistical anaphora resolution system. In addition, the paper describes the compilation of a bank of sentence patterns consisting of 481 antecedent patterns; each pattern represents a particular part-of-speech tag corresponding to its antecedent phrase. The aim is to provide a valuable resource that enables future research in Arabic anaphora resolution and supports future work on analyzing the Qur'an script. The corpus can also be used for training, testing, and evaluating anaphora resolution systems.

General Terms - Natural language processing, Computational linguistics, Anaphora resolution, Corpus development.
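To make the idea of the feature vector in the abstract concrete, here is a minimal Python sketch of how one annotated pronoun might be turned into numeric features. It is an illustration only, not the authors' format: the class name `PronounAnnotation`, the field names, the category lists, and the one-hot encoding are all assumptions, and the example values are invented rather than taken from the corpus.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical category inventories; the corpus's actual tag set is not given here.
GENDERS = ["masculine", "feminine"]
NUMBERS = ["singular", "dual", "plural"]
PERSONS = ["first", "second", "third"]


@dataclass
class PronounAnnotation:
    """One annotated pronoun (field names are illustrative, not the corpus schema)."""
    verse_distance: int    # verses between the anaphor and its antecedent
    word_distance: int     # words between the anaphor and its antecedent
    segment_distance: int  # segments between the anaphor and its antecedent
    gender: str
    number: str
    person: str


def one_hot(value: str, categories: List[str]) -> List[float]:
    """Return a one-hot encoding of value over the given categories."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def to_feature_vector(a: PronounAnnotation) -> List[float]:
    """Concatenate the three distances with one-hot agreement features."""
    return (
        [float(a.verse_distance), float(a.word_distance), float(a.segment_distance)]
        + one_hot(a.gender, GENDERS)
        + one_hot(a.number, NUMBERS)
        + one_hot(a.person, PERSONS)
    )


# Made-up values for illustration only; not taken from the corpus.
example = PronounAnnotation(1, 7, 9, "masculine", "plural", "third")
vector = to_feature_vector(example)
```

Under these assumptions each pronoun yields an 11-dimensional vector (three distances plus two, three, and three one-hot slots). A real system would likely add further features, such as the antecedent patterns mentioned in the abstract.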
