An Effective Pattern Pruning and Summarization Method Retaining High Quality Patterns With High Area Coverage in Relational Datasets
Author(s) -
Pei-Yuan Zhou,
Gary C. L. Li,
Andrew K. C. Wong
Publication year - 2016
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2016.2624418
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Pattern mining is widely used to uncover interesting patterns in data. One of its main problems, however, is that it produces too many patterns, many of which are redundant. Delta-closed pattern pruning was introduced to reduce the number of redundant patterns while retaining overlapping ones, yet it can only prune subpatterns that are covered by superpatterns. It cannot prune superpatterns that are statistically induced by strong subpatterns, and such superpatterns also need to be pruned. Furthermore, pattern summarization has been proposed to improve the management and interpretation of patterns: it renders a small number of patterns that retain the most crucial information. The RuleCover algorithm is one such method. However, it tends to produce overly trivial patterns, whereas more interesting and revealing ones may be pruned. To overcome these problems, this paper presents a new algorithm that integrates the delta-closed and RuleCover methods with two new algorithms of our own: 1) statistically induced pattern pruning, which prunes superpatterns that are statistically induced by strong subpatterns, and 2) the AreaCover algorithm, which prunes overlapping patterns while retaining higher-order, high-quality patterns with large coverage of the data “area.” Experimental results show that the proposed algorithms produce very compact yet comprehensive knowledge from the patterns discovered in relational data sets.
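The idea of pruning a subpattern that is covered by a superpattern can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that a pattern is a set of (attribute, value) pairs, that a pattern covers a row when the row contains all of its pairs, and that a subpattern counts as redundant when some proper superpattern covers at least a (1 - delta) fraction of the rows the subpattern covers. The names `covered_rows`, `prune_covered_subpatterns`, and `delta`, as well as this exact threshold rule, are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Item = Tuple[str, str]       # (attribute, value), e.g. ("a", "1")
Pattern = FrozenSet[Item]    # a pattern is a set of attribute-value pairs
Row = Dict[str, str]         # one record of a relational data set


def covered_rows(pattern: Pattern, rows: List[Row]) -> Set[int]:
    """Return the indices of rows that contain every pair in the pattern."""
    return {
        i for i, row in enumerate(rows)
        if all(row.get(attr) == val for attr, val in pattern)
    }


def prune_covered_subpatterns(
    patterns: List[Pattern], rows: List[Row], delta: float
) -> List[Pattern]:
    """Drop a pattern when a proper superpattern covers at least
    (1 - delta) of the rows that the pattern itself covers.

    The threshold rule is an assumption made for this sketch.
    """
    cover = {p: covered_rows(p, rows) for p in patterns}
    kept = []
    for p in patterns:
        redundant = any(
            p < q  # q is a proper superpattern of p
            and cover[p]
            and len(cover[q]) >= (1 - delta) * len(cover[p])
            for q in patterns
        )
        if not redundant:
            kept.append(p)
    return kept
```

With a larger `delta`, more subpatterns are absorbed by their superpatterns. Note that a rule of this kind only ever removes subpatterns, which is the limitation the abstract points out: it has no way to remove a superpattern that exists only because a strong subpattern induces it.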