Marginalized kernels for RNA sequence data analysis.
Author(s) -
Taishin Kin,
Koji Tsuda,
Kiyoshi Asai
Publication year - 2002
Publication title -
genome informatics. international conference on genome informatics
Language(s) - English
DOI - 10.11234/gi1990.13.112
We present novel kernels that measure similarity of two RNA sequences, taking account of their secondary structures. Two types of kernels are presented. One is for RNA sequences with known secondary structures, the other for those without known secondary structures. The latter employs stochastic context-free grammar (SCFG) for estimating the secondary structure. We call the latter the marginalized count kernel (MCK). We show computational experiments for MCK using 74 sets of human tRNA sequence data: (i) kernel principal component analysis (PCA) for visualizing tRNA similarities, (ii) supervised classification with support vector machines (SVMs). Both types of experiment show promising results for MCKs.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom