Practical Approaches for Mining Frequent Patterns in Molecular Datasets
Author(s) - Stefan Naulaerts, Sandy Moens, Kristof Engelen, Wim Vanden Berghe, Bart Goethals, Kris Laukens, Pieter Meysman
Publication year - 2016
Publication title - Bioinformatics and Biology Insights
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.556
H-Index - 23
ISSN - 1177-9322
DOI - 10.4137/bbi.s38419
Subject(s) - computer science, popularity, data science, implementation, task (project management), data mining, software, biological data, value (mathematics), machine learning, bioinformatics, software engineering, engineering, psychology, social psychology, systems engineering, biology, programming language
Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still keep promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of the tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations to common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should choose their software based on their research question, programming proficiency, and the added value of extra features.
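For readers unfamiliar with the technique, the following is a minimal sketch of level-wise (Apriori-style) frequent itemset mining. It is not one of the three software implementations evaluated in the article. The function name `frequent_itemsets`, the `min_support` parameter, and the toy gene-annotation data are assumptions made purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs
    in at least `min_support` transactions (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidate generation: unions of two frequent (k-1)-itemsets of size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count support of the surviving candidates.
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                current[c] = support
        result.update(current)
        k += 1
    return result


# Hypothetical toy data: each "transaction" is the annotation set of one gene.
genes = [
    {"kinase", "membrane", "upregulated"},
    {"kinase", "membrane"},
    {"kinase", "upregulated"},
    {"membrane", "upregulated"},
    {"kinase", "membrane", "upregulated"},
]
patterns = frequent_itemsets(genes, min_support=3)
```

Even on this small input, six itemsets pass the threshold (three single annotations and three pairs), which hints at why the output of such tools grows quickly and becomes hard to interpret on real molecular datasets.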