Text mining and protein annotations: the construction and use of protein description sentences.

Open Access

Author(s) - Martin Krallinger, Rainer Malik, Alfonso Valencia

Publication year - 2006

Publication title - Genome Informatics. International Conference on Genome Informatics

Language(s) - English

DOI - 10.11234/gi1990.17.2_121

Existing biological knowledge stored as structured database records has been extracted manually by database curators analyzing the scientific literature. Most of this information was derived from sentences that describe biologically relevant aspects of genes and gene products. We introduce the Protein description sentence (Prodisen) corpus, a resource for the automatic identification and construction of text-based protein and gene description records using information extraction and text classification techniques. We propose basic guidelines and criteria for constructing a text corpus of functional descriptions of genes and proteins, and we present the steps used to build the corpus and its features. We also explore some potential applications of the Prodisen corpus in biomedical text mining and present the results obtained.
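To make the text classification idea concrete, the following is a minimal sketch of a sentence classifier that separates protein description sentences from other sentences. It is not the authors' method: the multinomial Naive Bayes model, the class name `NaiveBayesSentenceClassifier`, the labels `description` and `other`, and the six training sentences are all assumptions invented for illustration and are not drawn from the Prodisen corpus.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


class NaiveBayesSentenceClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}          # label -> Counter of token counts
        self.label_counts = Counter()  # label -> number of training sentences
        self.vocabulary = set()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            self.label_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for token in tokenize(sentence):
                counts[token] += 1
                self.vocabulary.add(token)
        return self

    def log_scores(self, sentence):
        """Return the unnormalized log posterior for each label."""
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n / total)
            for token in tokenize(sentence):
                # Tokens never seen in training are skipped.
                if token in self.vocabulary:
                    score += math.log((counts[token] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Hypothetical toy training data, invented for this sketch.
TRAIN = [
    ("The protein binds DNA and regulates transcription.", "description"),
    ("This kinase phosphorylates its substrate in the nucleus.", "description"),
    ("The gene product is involved in cell cycle regulation.", "description"),
    ("Samples were centrifuged for ten minutes.", "other"),
    ("We thank the reviewers for helpful comments.", "other"),
    ("Figure 2 shows the experimental setup.", "other"),
]

classifier = NaiveBayesSentenceClassifier().fit(
    [sentence for sentence, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(classifier.predict("The protein regulates transcription in the nucleus."))
print(classifier.predict("We thank the authors for comments."))
```

A real system would train on a curated corpus and would likely use richer features than single tokens; the sketch only shows the shape of the task, where each sentence receives a label indicating whether it describes a gene or protein.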
