z-logo
Premium
Fast approximate string matching
Author(s) -
Owolabi O.,
McGregor D. R.
Publication year - 1988
Publication title -
software: practice and experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.4380180407
Subject(s) - string metric , approximate string matching , string searching algorithm , string (physics) , levenshtein distance , computer science , metric (unit) , matching (statistics) , set (abstract data type) , commentz walter algorithm , algorithm , measure (data warehouse) , similarity (geometry) , process (computing) , string kernel , theoretical computer science , pattern matching , mathematics , data mining , artificial intelligence , programming language , kernel method , operations management , statistics , image (mathematics) , polynomial kernel , support vector machine , economics , mathematical physics
Approximate string matching is an important operation in information systems because an input string is often an inexact match to the strings already stored. Commonly known accurate methods are computationally expensive as they compare the input string to every entry in the stored dictionary. This paper describes a two‐stage process. The first uses a very compact n ‐gram table to preselect sets of roughly similar strings. The second stage compares these with the input string using an accurate method to give an accurately matched set of strings. A new similarity measure based on the Levenshtein metric is defined for this comparison. The resulting method is both computationally fast and storage‐efficient.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here