A new modeling method in feature construction for the HSQC spectra screening problem
Author(s) - Hiromi Arai, Satoru Watanabe, T. Kigawa, Masayuki Yamamura
Publication year - 2008
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btn345
Subject(s) - overfitting, heteronuclear single quantum coherence spectroscopy, computer science, feature (linguistics), artificial intelligence, machine learning, pattern recognition (psychology), data mining, chemistry, artificial neural network, two dimensional nuclear magnetic resonance spectroscopy, linguistics, philosophy, stereochemistry
Large-scale biological analyses produce huge amounts of data, so the data analysis process needs to be automated. Sample screening in high-throughput NMR protein structure analysis is a typical example. In particular, screening by protein ¹H-¹⁵N heteronuclear single quantum coherence (HSQC) spectra must be done quantitatively by a human expert. One popular solution to this problem is data mining. Machine learning methods can automatically extract rules and achieve high prediction accuracy when a good-quality training dataset is available. However, they tend to behave as black boxes, and the learned models risk overfitting the training dataset.