Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models

Open Access

Author(s) - Jonas Maaskola, Nikolaus Rajewsky

Publication year - 2014

Publication title - Nucleic Acids Research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 9.008

H-Index - 537

eISSN - 1362-4954

pISSN - 0305-1048

DOI - 10.1093/nar/gku1083

Subject(s) - biology, discriminative model, computational biology, nucleic acid, hidden Markov model, genetics, artificial intelligence, computer science

We present a discriminative learning method, based on hidden Markov models, for discovering binding site patterns in nucleic acid sequences. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on the mutual information of condition and motif occurrence. We perform a systematic comparison of our method with numerous published motif-finding tools. In this comparison, our method achieves the highest motif discovery performance while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating the practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments, and can handle more complex data configurations in addition to binary contrasts.
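To make the objective mentioned in the abstract concrete, the following is a minimal sketch of the mutual information between condition (positive or negative set) and motif occurrence, computed from a 2x2 contingency table. It is not taken from Discrover: the function names, the toy sequences, and the use of exact substring matching in place of a hidden Markov model are all assumptions made for illustration.

```python
import math
from typing import Sequence


def contains_motif(seq: str, motif: str) -> bool:
    """Hypothetical stand-in for a motif model: exact substring match.

    The paper uses hidden Markov models; a plain string search is used
    here only to keep the example short.
    """
    return motif in seq


def mutual_information(
    positives: Sequence[str], negatives: Sequence[str], motif: str
) -> float:
    """Mutual information (in bits) of condition and motif occurrence."""
    pos_hits = sum(contains_motif(s, motif) for s in positives)
    neg_hits = sum(contains_motif(s, motif) for s in negatives)

    # Rows: condition (positive, negative).
    # Columns: motif present, motif absent.
    table = [
        [pos_hits, len(positives) - pos_hits],
        [neg_hits, len(negatives) - neg_hits],
    ]
    total = len(positives) + len(negatives)
    if total == 0:
        return 0.0

    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]

    mi = 0.0
    for i in range(2):
        for j in range(2):
            n = table[i][j]
            if n == 0:
                continue  # a zero cell contributes nothing to the sum
            # p(i, j) * log2( p(i, j) / (p(i) * p(j)) ), written with counts
            mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    positives = ["ACGTGCATG", "TTGCATGA"]
    negatives = ["ACGTACGT", "TTTTAAAA"]
    print(mutual_information(positives, negatives, "GCATG"))
```

A motif that occurs in every positive sequence and in no negative sequence of two equally sized sets scores 1 bit, the maximum for a binary contrast, while a motif that is equally frequent in both sets scores 0.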
