Open Access
An Optimistic Approach for Clustering Multi-version XML Documents Using Compressed Delta
Author(s) -
Vijay R. Sonawane,
D. Srinivasa Rao
Publication year - 2015
Publication title -
International Journal of Electrical and Computer Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.277
H-Index - 22
ISSN - 2088-8708
DOI - 10.11591/ijece.v5i6.pp1472-1479
Subject(s) - computer science , efficient xml interchange , xml validation , streaming xml , document structure description , xml schema editor , xml encryption , xml database , information retrieval , xml framework , xml signature , xml , xml schema (w3c) , database , cluster analysis , simple api for xml , data mining , world wide web , artificial intelligence
With the standardization of XML as an information exchange format on the web, a huge amount of information is now formatted as XML documents. XML documents are large: the amount of information that has to be transmitted, processed, stored, and queried is often larger than that of other data formats. In real-world applications, XML documents are also dynamic in nature. The versatile applicability of XML documents in different fields of information maintenance and management is increasing the demand to store different versions of XML documents over time. However, storing all versions of an XML document may introduce redundancy. The self-describing nature of XML makes it verbose, and as a result documents are large. This paper proposes an optimistic approach to re-clustering multi-version XML documents that change over time: the distance between them is reassessed using knowledge from the initial clustering solution and the changes stored in a compressed delta. The growing size of an XML document is reduced by applying homomorphic compression, which retains the document's original structure, before clustering. The compressed delta stores the changes responsible for document versions, without decompressing them. Test results show that our approach performs much better than full pair-wise document comparison.
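
The idea of reassessing a distance from a delta, rather than comparing two full documents again, can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes, purely for illustration, that each document is summarized as a multiset of root-to-node tag paths, that distance is one minus the multiset Jaccard overlap, and that a delta is a mapping from path to a signed count change. The names `overlap`, `distance`, and `apply_delta` are hypothetical, and neither homomorphic compression nor the compressed delta format is modeled.

```python
from collections import Counter


def overlap(a: Counter, b: Counter) -> int:
    """Size of the multiset intersection of two path profiles."""
    return sum((a & b).values())


def distance(size_a: int, size_b: int, inter: int) -> float:
    """One minus multiset Jaccard similarity (an assumed measure)."""
    union = size_a + size_b - inter
    return 0.0 if union == 0 else 1.0 - inter / union


def apply_delta(profile_a: Counter, profile_b: Counter,
                inter: int, delta: dict) -> int:
    """Apply a delta to profile_a in place and return the updated overlap.

    Only the paths named in the delta are visited, so the cost depends on
    the size of the change, not on the size of either document.
    """
    for path, change in delta.items():
        old = profile_a[path]
        new = old + change
        if new < 0:
            raise ValueError(f"delta removes more of {path!r} than exist")
        other = profile_b[path]
        # Adjust the overlap by this path's contribution before and after.
        inter += min(new, other) - min(old, other)
        if new:
            profile_a[path] = new
        else:
            del profile_a[path]
    return inter
```

In this sketch the overlap from the initial comparison plays the role of the knowledge kept from the initial clustering solution: after a new version arrives, calling `apply_delta` and then `distance` with the updated profile sizes gives the same value as a full recomputation with `overlap`.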
