Identification of transcription factor binding sites using Gaussian mixture models
Author(s) -
Karabulut Mustafa,
Ibrikci Turgay
Publication year - 2014
Publication title -
Expert Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.365
H-Index - 38
eISSN - 1468-0394
pISSN - 0266-4720
DOI - 10.1111/exsy.12004
Subject(s) - computer science, dna binding site, motif (music), maximization, identification (biology), fuzzy logic, expectation–maximization algorithm, data mining, artificial intelligence, transcription factor, computational biology, pattern recognition (psychology), maximum likelihood, genetics, biology, mathematical optimization, mathematics, promoter, gene, statistics, gene expression, physics, botany, acoustics
Identifying transcription factor binding sites remains a challenging problem, even though many computational tools have been proposed in the literature for this task. This study proposes a method for discovering such DNA subsequences, that is, motifs. The method uses Gaussian mixture models fitted with the expectation-maximization algorithm. To show the potential of the proposed method, experiments are conducted on data sets extracted from the DNA sequences of various organisms. The proposed method is also compared with four other methods: MEME, MDScan, SOMBRERO and the fuzzy C-means based motif finder. The results indicate that the proposed method is a promising tool for identifying over-represented DNA motifs.
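The abstract names the ingredients (Gaussian mixture models fitted with expectation-maximization) but does not say how DNA subsequences are turned into numeric vectors, which covariance structure is used, or how the mixture is initialised. The sketch below is therefore only an illustration of the general technique under assumptions that are not taken from the paper: a one-hot encoding of fixed-length k-mers, diagonal covariances with a variance floor, and a deterministic farthest-point initialisation. The names `one_hot`, `fit_gmm_em` and `consensus`, and the toy k-mers, are all hypothetical.

```python
import numpy as np

BASES = "ACGT"

def one_hot(kmer):
    """Encode a DNA k-mer as a flat 4*k vector (assumed encoding, not from the paper)."""
    vec = np.zeros((len(kmer), 4))
    for i, base in enumerate(kmer):
        vec[i, BASES.index(base)] = 1.0
    return vec.ravel()

def fit_gmm_em(X, n_components, n_iter=30, var_floor=1e-2):
    """Fit a diagonal-covariance Gaussian mixture to the rows of X with plain EM."""
    n, d = X.shape
    # Deterministic farthest-point initialisation (an assumption for reproducibility).
    means = [X[0]]
    for _ in range(1, n_components):
        dist = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(dist)])
    means = np.array(means)
    variances = np.ones((n_components, d))
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: posterior probability (responsibility) of each component per point.
        diff = X[:, None, :] - means[None, :, :]              # shape (n, k, d)
        log_pdf = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2
        )
        log_post = np.log(weights) + log_pdf
        log_post -= log_post.max(axis=1, keepdims=True)       # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means and variances from the responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + var_floor

    return weights, means, variances, resp

def consensus(mean_vec):
    """Read off the most probable base at each position of a component mean."""
    return "".join(BASES[i] for i in mean_vec.reshape(-1, 4).argmax(axis=1))

# Toy data: two groups of similar 6-mers (made up for illustration only).
kmers = ["TATAAT", "TATAAT", "TATACT", "TATAAT",
         "GCGCGC", "GCGCGC", "GCGCGG", "GCGCGC"]
X = np.array([one_hot(k) for k in kmers])
weights, means, variances, resp = fit_gmm_em(X, n_components=2)
labels = resp.argmax(axis=1)
motifs = [consensus(m) for m in means]
```

In this toy setting each mixture component plays the role of a candidate motif: `resp` gives the soft assignment of every k-mer to a component, and `consensus` turns a component mean back into a readable sequence. A real motif finder would also need a way to extract candidate windows from long sequences and to separate motif components from background, neither of which is described in the abstract.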