Light-weight XPath processing of XML stream with deterministic automata
Author(s) -
Makoto Onizuka
Publication year - 2003
Publication title -
CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Conference proceedings
ISBN - 1-58113-723-0
DOI - 10.1145/956863.956928
Subject(s) - XPath, computer science, deterministic finite automaton, automaton, regular expression, theoretical computer science, XML, path expression, stream processing, finite state machine, streaming XML, programming language, state (computer science), parallel computing, XML database, operating system
Several applications based on XML stream processing have recently emerged, such as those for air traffic control and the selective dissemination of information (SDI). Their common need is to process a large number of XPath expressions over continuous XML streams at high throughput. This paper proposes four techniques for XPath expression processing based on Deterministic Finite Automata (DFA), with two purposes: to improve the memory efficiency of the automata and to support the processing of branching XPath expressions. The first technique, called n-DFA, partitions the given XPath expressions into n clusters to reduce the number of DFA states. The second, called the shared NFA state table, lets Non-Deterministic Finite Automata (NFA) state sets be shared among the DFA states. Our experiments show that, with the shared NFA state table, memory usage in an 8-DFA can be reduced to 1/40th that of the original 1-DFA. The remaining two techniques, optimized NFA conversion and a general XPath expression processing algorithm, enable efficient processing of branching XPath expressions; overall performance is better than that of earlier approaches.
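The abstract does not give the paper's data structures, so the following Python sketch is only an illustration of the general idea, not the author's implementation. It assumes linear XPath expressions (child `/`, descendant `//`, and `*` only), builds a DFA lazily from the combined NFA of a group of expressions, and stores each distinct NFA state set once in a lookup table. The names `parse`, `LazyDFA`, `cluster`, and `run` are hypothetical, and the round-robin split in `cluster` is an assumed stand-in for whatever clustering criterion the paper actually uses.

```python
def parse(xpath):
    """Parse a linear XPath such as '/a//b/*' into [(is_descendant, name), ...]."""
    steps, i = [], 0
    while i < len(xpath):
        descendant = xpath.startswith("//", i)
        i += 2 if descendant else 1
        j = xpath.find("/", i)
        j = len(xpath) if j == -1 else j
        steps.append((descendant, xpath[i:j]))
        i = j
    return steps


class LazyDFA:
    """DFA built on demand from the combined NFA of a group of expressions."""

    def __init__(self, xpaths):
        self.xpaths = list(xpaths)
        self.steps = [parse(x) for x in self.xpaths]
        self.ids = {}    # frozenset of NFA states -> DFA state id (each set stored once)
        self.sets = []   # DFA state id -> frozenset of NFA states
        self.delta = {}  # (DFA state id, element name) -> DFA state id
        # An NFA state is (expression index, number of steps matched so far).
        self.start = self._intern(frozenset((q, 0) for q in range(len(self.steps))))

    def _intern(self, nfa_set):
        if nfa_set not in self.ids:
            self.ids[nfa_set] = len(self.sets)
            self.sets.append(nfa_set)
        return self.ids[nfa_set]

    def step(self, state, name):
        """Return the DFA state reached on a start-element event, creating it if new."""
        key = (state, name)
        if key not in self.delta:
            nxt = set()
            for q, p in self.sets[state]:
                if p < len(self.steps[q]):
                    descendant, test = self.steps[q][p]
                    if test in ("*", name):
                        nxt.add((q, p + 1))   # this step matched the element
                    if descendant:
                        nxt.add((q, p))       # '//' may skip over this element
            self.delta[key] = self._intern(frozenset(nxt))
        return self.delta[key]

    def matches(self, state):
        return {self.xpaths[q] for q, p in self.sets[state] if p == len(self.steps[q])}


def cluster(xpaths, n):
    """Assumed clustering rule: a plain round-robin split into n groups."""
    return [xpaths[i::n] for i in range(n)]


def run(dfas, events):
    """Feed SAX-like events to every DFA; return [(element path, matched xpaths)]."""
    stacks = [[d.start] for d in dfas]
    path, hits = [], []
    for kind, name in events:
        if kind == "start":
            path.append(name)
            matched = set()
            for d, stack in zip(dfas, stacks):
                stack.append(d.step(stack[-1], name))
                matched |= d.matches(stack[-1])
            if matched:
                hits.append(("/" + "/".join(path), sorted(matched)))
        else:  # "end": return to the state of the parent element
            path.pop()
            for stack in stacks:
                stack.pop()
    return hits


XPATHS = ["/a/b", "//b", "/a//c", "//c/d"]
# Events for the document <a><b><c><d/></c></b><c/></a>
EVENTS = [("start", "a"), ("start", "b"), ("start", "c"), ("start", "d"),
          ("end", "d"), ("end", "c"), ("end", "b"),
          ("start", "c"), ("end", "c"), ("end", "a")]

one_dfa = [LazyDFA(XPATHS)]                                  # a single "1-DFA"
two_dfas = [LazyDFA(group) for group in cluster(XPATHS, 2)]  # a "2-DFA"
```

Both configurations report the same matches for the sample stream; they differ only in how many DFA states each automaton materializes (`len(dfa.sets)`). On an input this small the counts say nothing about the memory savings reported in the abstract, which come from the paper's own experiments.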