
Efficient Algorithm for Auto Correction Using n-gram Indexing
Author(s) -
Mahesh Lalwani,
Nitesh Bagmar,
Saurin Parikh
Publication year - 2013
Publication title -
international journal of computer and communication technology
Language(s) - English
Resource type - Journals
eISSN - 2231-0371
pISSN - 0975-7449
DOI - 10.47893/ijcct.2013.1177
Subject(s) - search engine indexing , computer science , heuristics , string (physics) , string searching algorithm , approximate string matching , edit distance , hash function , filter (signal processing) , algorithm , locality sensitive hashing , focus (optics) , n gram , commentz walter algorithm , pattern matching , artificial intelligence , hash table , language model , programming language , mathematics , physics , optics , mathematical physics , computer vision , operating system
Auto correction functionality is very popular in search portals. Its principal purpose is to correct common spelling or typing errors, saving time for the user. However, when there are millions of strings in a dictionary, it takes considerable amount of time to find the nearest matching string. Various approaches have been proposed for efficiently implementing auto correction functionality. All of these approaches focus on using suitable data structure and few heuristics to solve the problems. Here, we propose a new idea which eliminates the need for calculating edit distance with each string in the dictionary. It uses the concept of Ngram based indexing and hashing to filter out irrelevant strings from dictionary. Experiments suggest that proposed algorithm provides both efficient and accurate results.