Learning string similarity measures for gene/protein name dictionary look-up using logistic regression
Author(s) -
Yoshimasa Tsuruoka,
John McNaught,
Jun'ichi Tsujii,
Sophia Ananiadou
Publication year - 2007
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btm393
Subject(s) - similarity (geometry) , string (physics) , matching (statistics) , computer science , artificial intelligence , measure (data warehouse) , string searching algorithm , string metric , similarity measure , logistic regression , data mining , pattern recognition (psychology) , natural language processing , machine learning , pattern matching , mathematics , statistics , mathematical physics , image (mathematics)
One of the bottlenecks of biomedical data integration is term variation. Exact string matching often fails to associate a name with its biological concept (that is, its ID or accession number in the database) because of seemingly small differences between names. Soft string matching can find the relevant ID by considering the similarity between names. However, the accuracy of soft matching depends heavily on the similarity measure employed.
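
To make the contrast between exact and soft matching concrete, here is a minimal Python sketch of a dictionary look-up. It is not the learned measure the title refers to: it swaps in the generic character-based ratio from the standard library's `difflib.SequenceMatcher` as a stand-in similarity. The dictionary entries, the IDs, the function names, and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy dictionary: the names are examples and the IDs are made up.
DICTIONARY = {
    "interleukin 2": "ID:0001",
    "tumor necrosis factor alpha": "ID:0002",
    "nuclear factor kappa B": "ID:0003",
}


def similarity(a, b):
    """Generic stand-in similarity in [0, 1]; not a learned measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact_lookup(name):
    """Return the ID only if the name matches a dictionary entry exactly."""
    return DICTIONARY.get(name)


def soft_lookup(name, threshold=0.8):
    """Return (ID, score) for the most similar entry, or (None, score)
    when the best score falls below the (assumed) threshold."""
    best_name = max(DICTIONARY, key=lambda entry: similarity(name, entry))
    score = similarity(name, best_name)
    if score >= threshold:
        return DICTIONARY[best_name], score
    return None, score


if __name__ == "__main__":
    query = "Interleukin-2"
    print(exact_lookup(query))  # exact matching misses the variant spelling
    print(soft_lookup(query))   # soft matching recovers the closest entry
```

In this sketch the variant "Interleukin-2" is missed by the exact look-up but recovered by the soft look-up, while an unrelated name falls below the threshold. Replacing `similarity` with a different measure changes which variants are accepted, which is the sensitivity the abstract points to.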
