
A Detailed Study of Distributed Indexed Search Techniques using SOLR
Author(s) -
B Jagadeeswar
Publication year - 2020
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.f1369.089620
Subject(s) - computer science , search engine indexing , nosql , scalability , information retrieval , inverted index , joins , database , sql , table (database) , relational database management system , relational database , data mining , programming language
For any web application backed by an RDBMS, searching a table with millions of rows or executing a query that joins multiple tables can have a severe performance impact. In general, such backend operations make the website extremely slow. Document-based inverted indexing can be a useful solution in these cases. SOLR is a standalone enterprise search server with a REST-like API. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, NoSQL features, rich document (e.g., Word, PDF and more) parsing, geospatial search, and built-in security. Databases and SOLR have complementary strengths and weaknesses. SQL supports only very simple wildcard-based text search with some simple normalization, such as matching upper case to lower case. The problem is that these searches are full table scans. In SOLR, all searchable words are stored in an inverted index, which can be searched orders of magnitude faster. However, designing this framework is quite challenging. This paper discusses highly reliable, scalable and fault-tolerant techniques that can help in setting up distributed indexing, replication and load-balanced querying with a centralized configuration.
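The contrast between a full table scan and an inverted index lookup can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not SOLR's actual implementation: the function names (`tokenize`, `build_inverted_index`, `search_index`, `search_scan`) and the sample documents are hypothetical, and the tokenizer only lowercases and splits on non-alphanumeric characters.

```python
from collections import defaultdict


def tokenize(text):
    """Yield lowercase alphanumeric terms (a deliberately simple normalization)."""
    word = []
    for ch in text.lower():
        if ch.isalnum():
            word.append(ch)
        elif word:
            yield "".join(word)
            word = []
    if word:
        yield "".join(word)


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_index(index, query):
    """AND query: intersect the posting sets of the query terms."""
    terms = list(tokenize(query))
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def search_scan(docs, query):
    """Same AND query, but every document is re-read on each search (full scan)."""
    terms = list(tokenize(query))
    if not terms:
        return set()
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(term in set(tokenize(text)) for term in terms)
    }


# Hypothetical sample documents, for illustration only.
DOCS = {
    1: "Solr supports faceted search",
    2: "Full table scans are slow",
    3: "Faceted Search and hit highlighting",
}
INDEX = build_inverted_index(DOCS)
```

Both functions return the same matches, but `search_scan` touches every document on every query, while `search_index` only reads the posting sets of the query terms. The index is built once at indexing time, which is the trade-off the abstract alludes to.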