z-logo
open-access-imgOpen Access
NETASPNO: Approximate Strict Pattern Matching Under Nonoverlapping Condition
Author(s) -
Youxi Wu,
Shasha Li,
Jingyu Liu,
Lei Guo,
Xindong Wu
Publication year - 2018
Publication title -
ieee access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2832209
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
In pattern matching, a gap constraint is a more flexible wildcard than traditional wildcards “?” and “*”. Pattern matching with gap constraints is more difficult to handle and fulfills user’s enquiries more easily. Pattern matching with gap constraints has therefore been carried out in numerous research works, such as music information retrieval, searching protein sites, and sequence pattern mining. Strict pattern matching under a nonoverlapping condition, as a type of pattern matching with gap constraints, is a key issue of sequence pattern mining with gap constraints since it can be used to compute the frequency of a pattern. Exact matching limits the flexibility of the match to some extent since it requires each character to be matched exactly. We therefore address approximate strict pattern matching under the nonoverlapping constraints (ASPNO) and propose an effective algorithm, named NETtree for ASPNO (NETASPNO), which first transforms the problem into a Nettree data structure, an extensive tree structure. To find the nonoverlapping occurrences effectively, we propose the concept of number of roots paths with distance constraints (NRPDC) which indicates the number of path from a node to the roots with distance $d$ and can be used to delete useless parent-child relationships and useless nodes. We iteratively recalculate the NRPDCs of each node on the subnettree with the rightmost root. Then we can get a path from the rightmost leaf to its rightmost root without using the backtracking strategy. NETASPNO therefore iteratively gets the rightmost root-leaf-path and prunes the path on the Nettree. Extensive experimental results demonstrate that NETASPNO has better performance than the other competitive algorithms.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom