GeneNarrator: Mining the Literaturome for Relations Among Genes
Author(s) - Jing Ding, Daniel Berleant, Jun Xu, Kenton D. Juhlin, Eve Syrkin Wurtele, Andy Fulmer
Publication year - 2009
Publication title - Journal of Proteomics and Bioinformatics
Language(s) - English
Resource type - Journals
ISSN - 0974-276X
DOI - 10.4172/jpb.1000096
Subject(s) - computational biology, gene, biology, computer science, data science, data mining, genetics
The rapid development of microarray and other genomic technologies now enables biologists to monitor the expression of hundreds, even thousands, of genes in a single experiment. Interpreting the biological meaning of the expression patterns still relies largely on biologists' domain knowledge, as well as on information collected from the literature and various public databases. Yet individual experts' domain knowledge is insufficient for large data sets, and collecting and analyzing this information manually from the literature and/or public databases is tedious and time-consuming. Computer-aided functional analysis tools are therefore highly desirable. We describe the architecture of GeneNarrator, a text mining system for functional analysis of microarray data. This system's primary purpose is to test the feasibility of a more general system architecture based on a two-stage clustering strategy that is explained in detail. Given a list of genes, GeneNarrator collects abstracts about them from PubMed, then clusters the abstracts into functional topics in a first clustering stage. In the second clustering stage, the genes are clustered into groups based on similarities in their distributions of occurrence across topics. This novel two-stage architecture, the primary contribution of this project, has benefits not easily provided by one-stage clustering.
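The two-stage strategy described above can be illustrated with a minimal Python sketch. It is not GeneNarrator's implementation: the abstract does not specify the clustering algorithms, so a simple greedy "leader" clustering with cosine similarity is swapped in for both stages. The function names, the thresholds, and the toy "abstracts" are all hypothetical, and the PubMed retrieval step is omitted.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like maps."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def leader_cluster(vectors, threshold):
    """Greedy 'leader' clustering (a stand-in, not the paper's method).

    Each item joins the first cluster whose leader is at least `threshold`
    similar; otherwise it starts a new cluster. Returns one label per item.
    """
    leaders, labels = [], []
    for vec in vectors:
        for idx, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                labels.append(idx)
                break
        else:  # no sufficiently similar leader was found
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


def two_stage_cluster(gene_abstracts, topic_threshold=0.3, gene_threshold=0.8):
    """gene_abstracts maps a gene name to a list of abstract strings."""
    # Flatten to (gene, abstract) pairs so every abstract keeps its gene.
    pairs = [(g, text) for g, texts in gene_abstracts.items() for text in texts]

    # Stage 1: cluster abstracts into topics using bag-of-words vectors.
    bags = [Counter(text.lower().split()) for _, text in pairs]
    topics = leader_cluster(bags, topic_threshold)

    # Count, for each gene, how many of its abstracts fall into each topic.
    # Cosine similarity ignores scale, so raw counts behave like distributions.
    profiles = {g: Counter() for g in gene_abstracts}
    for (g, _), topic in zip(pairs, topics):
        profiles[g][topic] += 1

    # Stage 2: cluster genes by the similarity of their topic profiles.
    genes = list(gene_abstracts)
    gene_labels = leader_cluster([profiles[g] for g in genes], gene_threshold)
    return dict(zip(genes, gene_labels)), topics


# Made-up toy input: short keyword strings standing in for real abstracts.
toy = {
    "geneA": ["heat shock stress response protein",
              "stress response heat tolerance"],
    "geneB": ["heat stress response chaperone"],
    "geneC": ["cell cycle division mitosis",
              "mitosis cell division checkpoint"],
}
groups, topics = two_stage_cluster(toy)
print(groups)
```

In this toy run, geneA and geneB share a group because all of their abstracts land in the same topic, while geneC falls into a separate one. Genes are compared only through their topic profiles, never through raw text, which is the point of the second stage.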