Word image based latent semantic indexing for conceptual querying in document image databases
Author(s) - Sameek Banerjee, Gaurav Harit, Santanu Chaudhury
Publication year - 2007
Publication title - Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)
Language(s) - English
DOI - 10.1109/icdar.2007.269
In this paper we present an application of latent semantic analysis (LSA) to the indexing and retrieval of document images containing text. The query is specified as a set of word images, and the documents that best match the query representation in the latent semantic space are retrieved. Extensive experiments on a large database show that LSA improves retrieval precision for document images, as it does for electronic text documents.
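The retrieval idea in the abstract can be illustrated with a minimal sketch of generic LSA. This is not the authors' implementation. It assumes that word images have already been grouped into discrete classes, so that each document is a vector of word-image class counts; the abstract does not describe how that grouping is done. The matrix values and the names `lsa_index`, `fold_in_query`, and `rank_documents` are hypothetical, chosen only for illustration.

```python
import numpy as np

def lsa_index(term_doc, k):
    """Rank-k truncated SVD of a (word-image class x document) count matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular triplets; documents become rows in latent space.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(query_vec, U_k, s_k):
    """Project a query count vector into the latent space (standard LSA fold-in)."""
    return (query_vec @ U_k) / s_k

def rank_documents(query_vec, U_k, s_k, docs_k):
    """Return document indices sorted by cosine similarity to the query."""
    q = fold_in_query(query_vec, U_k, s_k)
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (docs_k @ q) / norms
    return np.argsort(-sims), sims

# Hypothetical toy data: rows are word-image classes, columns are documents.
# Documents 0 and 1 share class 1; documents 2 and 3 share classes 3 and 4.
term_doc = np.array([
    [2, 0, 0, 0],   # class 0: only in document 0
    [1, 1, 0, 0],   # class 1: documents 0 and 1
    [0, 2, 0, 0],   # class 2: only in document 1
    [0, 0, 2, 1],   # class 3: documents 2 and 3
    [0, 0, 1, 2],   # class 4: documents 2 and 3
], dtype=float)

U_k, s_k, docs_k = lsa_index(term_doc, k=2)

# Query made of a single word image belonging to class 0.
query = np.array([1, 0, 0, 0, 0], dtype=float)
order, sims = rank_documents(query, U_k, s_k, docs_k)
print(order, np.round(sims, 3))
```

In this toy setup, document 1 never contains the queried class, yet it scores as high as document 0 because the two documents share a latent dimension through class 1. That is the kind of conceptual match that plain keyword (here, word-image) matching would miss.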