FastSK: fast sequence analysis with gapped string kernels
Author(s) - Derrick Blakely, Eamon Collins, Ritambhara Singh, A. J. Norton, Jack Lanchantin, Yanjun Qi
Publication year - 2020
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa817
Subject(s) - computer science, string kernel, algorithm, kernel (algebra), support vector machine, scalability, computation, source code, classifier (uml), artificial intelligence, kernel method, mathematics, polynomial kernel, combinatorics, database, operating system
Gapped k-mer kernels with support vector machines (gkm-SVMs) have achieved strong predictive performance on regulatory DNA sequences with modestly sized training sets. However, existing gkm-SVM algorithms suffer from slow kernel computation because their running time grows exponentially with the sub-sequence feature length, the number of mismatch positions, and the task's alphabet size.
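To make the notion of a gapped k-mer kernel concrete, here is a minimal brute-force sketch in Python. It is not the FastSK algorithm, nor any of the existing gkm-SVM algorithms the abstract refers to. It only illustrates one common definition of the feature space: every window of length g contributes one feature for each choice of k informative positions, and the kernel value is the dot product of the resulting count vectors. The function names and the parameters `g` and `k` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, g, k):
    """Count gapped k-mer features of a sequence (illustrative, brute force).

    For every window of length g, keep the characters at each subset of
    k positions; the remaining g - k positions act as gaps (mismatches).
    """
    counts = Counter()
    masks = list(combinations(range(g), k))  # C(g, k) position subsets
    for start in range(len(seq) - g + 1):
        window = seq[start:start + g]
        for mask in masks:
            # A feature is the pair (kept positions, characters at them).
            counts[(mask, "".join(window[i] for i in mask))] += 1
    return counts


def gapped_kmer_kernel(x, y, g, k):
    """Kernel value: dot product of the two gapped k-mer count vectors."""
    cx = gapped_kmer_counts(x, g, k)
    cy = gapped_kmer_counts(y, g, k)
    return sum(v * cy[key] for key, v in cx.items())


if __name__ == "__main__":
    print(gapped_kmer_kernel("ACGT", "ACCT", g=3, k=2))
```

In this sketch the number of position subsets per window is C(g, k), and the number of possible features is C(g, k) times the alphabet size raised to the power k, which gives a rough sense of why the feature length, the number of mismatch positions, and the alphabet size all drive the cost of naive approaches.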