Bringing Text Miners and Biologists Closer Together
Author(s) - Anália Lourenço, Sónia Carneiro, Rafael Carreira, Miguel Rocha, Isabel Rocha, E. C. Ferreira
Publication year - 2009
Publication title - Nature Precedings
Language(s) - English
Resource type - Journals
ISSN - 1756-0357
DOI - 10.1038/npre.2009.3188.1
Subject(s) - computer science, workbench, bridging (networking), software, biomedical text mining, data science, information extraction, world wide web, software engineering, information retrieval, artificial intelligence, data mining, visualization, text mining, programming language, computer network
The surge in Biomedical Text Mining (BioTM) research over the last few years has paved the way for finally bridging the gap between text miners and biologists. Beyond the development of enhanced entity recognisers and the construction of relationship extraction systems, now, more than ever, is the time to apply the available tools to real-world scenarios. Moreover, it is crucial to develop end-user tools that can assist biologists in their research activities. Such tools should emulate the conventional curation performed by biologists, drawing on the same knowledge bases and making the same assumptions that biologists usually make, while delivering automated capabilities. Of foremost importance are the search and selection of PubMed articles, the construction of dictionaries from the contents of available Molecular Biology repositories, the implementation of description environments for rule specification, the implementation of dictionary- and rule-based entity recognisers, the development of flexible and extensible relationship extraction systems, and the development of easy-to-use manual curation environments.
Our software, named @Note, aims to be both a framework and a workbench for BioTM, i.e., it has been conceived to deliver end-user applications while enabling collaboration with other BioTM groups. As a framework, it provides a reusable design for BioTM software systems and a set of pre-assembled software building blocks that programmers can use, extend and customise for their specific needs. As a workbench, it supports the development of BioTM applications by integrating Natural Language Processing and Data Mining tools and by supporting the major Information Retrieval and Information Extraction processes. Moreover, it encompasses a flexible and extensible manual curation environment that enables interaction with biologists, who can correct earlier annotations and enhance dictionary contents. We successfully applied @Note to the study of the stringent response in _Escherichia coli_, an important subject within the analysis of stress responses in bacteria. This joint effort allowed biologists to contribute to the enhancement of our manual curation environment, to identify new functionalities for the existing plug-ins, and to inform the specification of new plug-ins.
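To make the idea of combining dictionary- and rule-based entity recognition more concrete, the following is a minimal Python sketch. It is not @Note's implementation or API: the `DICTIONARY` entries, the single gene-name rule, and the `annotate` function are all hypothetical and chosen only for illustration. The sketch assumes that dictionary matches take precedence and that a rule match is kept only where it does not overlap an existing annotation.

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    text: str    # the matched surface form
    label: str   # entity class assigned to the mention
    source: str  # "dictionary" or "rule"


# Hypothetical dictionary. A real system would build this from the
# contents of Molecular Biology repositories.
DICTIONARY = {
    "Escherichia coli": "organism",
    "RelA": "protein",
    "SpoT": "protein",
    "ppGpp": "compound",
}

# Hypothetical rule: bacterial gene symbols written as three lowercase
# letters followed by one uppercase letter (for example relA, spoT).
RULES = [(re.compile(r"\b[a-z]{3}[A-Z]\b"), "gene")]


def _overlaps(start, end, annotations):
    """Return True if the span [start, end) overlaps any existing annotation."""
    return any(a.start < end and start < a.end for a in annotations)


def annotate(text):
    """Annotate text with dictionary matches first, then rule matches."""
    annotations = []

    # Dictionary pass: try longer terms first so they win over shorter ones.
    for term in sorted(DICTIONARY, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
        for m in pattern.finditer(text):
            if not _overlaps(m.start(), m.end(), annotations):
                annotations.append(
                    Annotation(m.start(), m.end(), m.group(),
                               DICTIONARY[term], "dictionary"))

    # Rule pass: keep a match only where nothing has been annotated yet.
    for pattern, label in RULES:
        for m in pattern.finditer(text):
            if not _overlaps(m.start(), m.end(), annotations):
                annotations.append(
                    Annotation(m.start(), m.end(), m.group(), label, "rule"))

    # Tuples sort by their first field, so this orders mentions by offset.
    return sorted(annotations)


if __name__ == "__main__":
    sentence = ("In Escherichia coli, RelA synthesises ppGpp; "
                "the relA and spoT genes are involved.")
    for ann in annotate(sentence):
        print(ann)
```

Giving the dictionary precedence over the rules is only one possible design choice. A curation environment of the kind described above would let biologists correct the resulting annotations and feed new terms back into the dictionary.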