Analog Document Search Using CRNN and Keyphrase Extraction
Author(s) -
S Lokeshwar,
Vadiraja Rao M. K,
Sujay Kumar P. S,
Vishveshwara Guthal Gowda,
P. Hemavathi
Publication year - 2021
Publication title -
International Journal of Image, Graphics and Signal Processing
Language(s) - English
Resource type - Journals
eISSN - 2074-9082
pISSN - 2074-9074
DOI - 10.5815/ijigsp.2021.02.02
Subject(s) - computer science , key (lock) , information retrieval , rank (graph theory) , graph , optical character recognition , process (computing) , artificial intelligence , natural language processing , image (mathematics) , theoretical computer science , programming language , computer security , combinatorics , mathematics
Information consumption is shifting to digital media, not just for newspapers but for books as well. With advances in Optical Character Recognition (OCR), Style Transfer Mapping (STM), and efficient keyphrase extraction, printed documents can now be digitized into a form that can be read across multiple platforms and searched efficiently. This lets users find relevant documents without the tedious process of manual searching. We propose a system that uses a CRNN model to recognize English characters in a document with high accuracy. We then pair it with a hybrid keyphrase extraction technique, which uses Positional Rank as its graph-based ranking and re-ranks the keyphrases using the C-Value method. This process allows us to automatically digitize a printed document and summarize it as high-quality keyphrases, which can be used to efficiently search for and retrieve relevant printed documents.
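The C-Value re-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes candidate keyphrases have already been extracted from the OCR output and counted, and it uses a common variant of the C-Value weight, log2(length + 1), so that single-word phrases are not scored as zero. The function names (`contains`, `c_value_scores`, `rerank`) and the sample counts are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value_scores(phrase_counts):
    """Map each candidate phrase (a tuple of words) to a C-Value style score."""
    scores = {}
    for phrase, freq in phrase_counts.items():
        # Frequencies of longer candidates that contain this phrase.
        parents = [f for p, f in phrase_counts.items() if contains(p, phrase)]
        # Discount occurrences that are explained by the longer phrases.
        adjusted = freq - (sum(parents) / len(parents) if parents else 0.0)
        # Assumed variant: log2(len + 1) keeps one-word phrases non-zero.
        scores[phrase] = math.log2(len(phrase) + 1) * adjusted
    return scores


def rerank(phrase_counts, top_k=5):
    """Return the top_k phrases ordered by descending score."""
    scores = c_value_scores(phrase_counts)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical candidate counts, for illustration only.
counts = Counter({
    ("optical", "character", "recognition"): 4,
    ("character", "recognition"): 6,
    ("recognition",): 9,
    ("keyphrase", "extraction"): 3,
})

for phrase in rerank(counts):
    print(" ".join(phrase))
```

In this sketch a phrase that mostly appears inside longer candidates is pushed down, so "optical character recognition" ranks above the more frequent but nested "recognition". In a hybrid setup such as the one described above, the input candidates would come from the graph-based ranking stage rather than from raw counts alone.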