Relevant XML Documents - Approach Based on Vectors and Weight Calculation of Terms

Open Access

Author(s) - Abdeslem Dennai, Mohammed Yacine DENNAI, Sidi Mohamed Benslimane

Publication year - 2016

Publication title - International Journal of Information Technology and Computer Science

Language(s) - English

Resource type - Journals

eISSN - 2074-9015

pISSN - 2074-9007

DOI - 10.5815/ijitcs.2016.11.03

Subject(s) - computer science, information retrieval, xml, relevance (law), document structure description, set (abstract data type), representation (politics), tf–idf, exploit, term (time), world wide web, programming language, physics, computer security, quantum mechanics, politics, political science, law

Three classes of documents, distinguished by how their data is organized, circulate on the web: unstructured documents (.doc, .html, .pdf, ...), semi-structured documents (.xml, .owl, ...), and structured documents (database tables, for example). A semi-structured document is organized around tags that are either predefined or defined by its author. However, many studies classify documents by their textual content alone and underestimate their structure. In this paper we propose a representation of these semi-structured web documents based on weighted vectors, which allows their content to be exploited for subsequent processing. Term weights are calculated using the normal frequency for a single document, and TF-IDF (Term Frequency-Inverse Document Frequency) and logical (Boolean) frequency for a set of documents. To assess and demonstrate the relevance of the proposed approach, we carry out several experiments on different corpora.
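
The three weighting schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tokenizer, the function names, the two-document toy corpus, and the choice of the unsmoothed idf = log(N / df) are all assumptions made for illustration, and the sketch weights only the text content, ignoring the tag structure that the paper sets out to take into account.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter


def extract_terms(xml_text):
    """Return lowercase word tokens from all text nodes of an XML document.

    Hypothetical helper: tag names and attributes are ignored here.
    """
    root = ET.fromstring(xml_text)
    text = " ".join(root.itertext())
    return re.findall(r"[a-z0-9]+", text.lower())


def raw_frequency(terms):
    """Normal (raw) frequency: how many times each term occurs in one document."""
    return dict(Counter(terms))


def boolean_weights(terms, vocabulary):
    """Boolean frequency: 1 if the term occurs in the document, else 0."""
    present = set(terms)
    return {t: int(t in present) for t in vocabulary}


def tf_idf_weights(docs_terms):
    """TF-IDF per document, assuming idf = log(N / df) with no smoothing."""
    n_docs = len(docs_terms)
    df = Counter()
    for terms in docs_terms:
        df.update(set(terms))  # count each term once per document
    vectors = []
    for terms in docs_terms:
        tf = Counter(terms)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


# Toy corpus, invented for this example.
corpus = [
    "<article><title>XML retrieval</title><body>XML documents and tags</body></article>",
    "<article><title>Database tables</title><body>structured documents</body></article>",
]
docs_terms = [extract_terms(x) for x in corpus]
vocabulary = sorted({t for terms in docs_terms for t in terms})

freq_vector = raw_frequency(docs_terms[0])
bool_vector = boolean_weights(docs_terms[1], vocabulary)
tfidf_vectors = tf_idf_weights(docs_terms)
```

With this idf variant, a term that occurs in every document of the set (here "documents") receives a TF-IDF weight of zero, which is why the Boolean and raw-frequency vectors can still be useful alongside it.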
