Open Access
Discovering XSD Keys from XML Data
Author(s) -
Marcelo Arenas,
Jonny Daenen,
Frank Neven,
Martín Ugarte,
Jan Van den Bussche,
Stijn Vansummeren
Publication year - 2014
Publication title - ACM Transactions on Database Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.988
H-Index - 84
eISSN - 1557-4644
pISSN - 0362-5915
DOI - 10.1145/2638547
Subject(s) - computer science, XML validation, document structure description, XPath, efficient XML interchange, XML encryption, XML schema editor, XML Schema (W3C), XML database, information retrieval, streaming XML, XML, World Wide Web
A great deal of research into the learning of schemas from XML data has been conducted in recent years to enable the automatic discovery of XML Schemas from XML documents when no schema, or only a low-quality one, is available. Unfortunately, and in strong contrast to, for instance, the relational model, the automatic discovery of even the simplest of XML constraints, namely XML keys, has been left largely unexplored in this context. A major obstacle here is the unavailability of a theory on reasoning about XML keys in the presence of XML schemas, which is needed to validate the quality of candidate keys. The present paper embarks on a fundamental study of such a theory and classifies the complexity of several crucial properties concerning XML keys in the presence of an XSD, such as testing for consistency, boundedness, satisfiability, universality, and equivalence. Of independent interest, novel results are obtained related to cardinality estimation of XPath result sets. A mining algorithm is then developed within the framework of levelwise search. The algorithm leverages known discovery algorithms for functional dependencies in the relational model, but incorporates the above-mentioned properties to assess and refine the quality of derived keys. An experimental study evaluating the effectiveness of the proposed algorithm on an extensive body of real-world XML data is provided.
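
As a rough illustration of the levelwise search framework mentioned in the abstract, the following sketch discovers minimal keys in a small relational table. It is not the paper's algorithm: it ignores XML, XSDs, and the quality properties (consistency, boundedness, and so on) entirely, and the names `is_key`, `minimal_keys`, and the sample `rows` are hypothetical, chosen only for this example.

```python
from itertools import combinations


def is_key(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def minimal_keys(rows, attributes):
    """Levelwise search: test candidates of size 1, then 2, and so on.

    A candidate that contains an already discovered key is skipped,
    because it cannot be minimal.
    """
    keys = []
    for size in range(1, len(attributes) + 1):
        for candidate in combinations(attributes, size):
            if any(set(k) <= set(candidate) for k in keys):
                continue  # prune supersets of known keys
            if is_key(rows, candidate):
                keys.append(candidate)
    return keys


# Hypothetical sample data, used only to exercise the sketch.
rows = [
    {"isbn": "1", "title": "A", "year": 2001},
    {"isbn": "2", "title": "A", "year": 2002},
    {"isbn": "3", "title": "B", "year": 2002},
]
print(minimal_keys(rows, ["isbn", "title", "year"]))
```

The pruning step is what makes the search levelwise rather than exhaustive: once `("isbn",)` is found at level 1, no larger candidate containing `isbn` is tested.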
