
A Keyword Based Educational and Non-Educational Website Recognition Tool
Publication year - 2019
Publication title - International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.j1077.08810s19
Subject(s) - computer science, the internet, filter (signal processing), artificial intelligence, sentence, context (archaeology), natural language processing, meaning (existential), stop words, support vector machine, world wide web, information retrieval, preprocessor, psychology, paleontology, psychotherapist, computer vision, biology
Today we depend on the internet for many daily activities: booking hotels and air tickets, finding places, travelling, cooking, education, banking, and so on. Finding a specific thing quickly requires filtering tools. E-learning is a new and rapidly growing medium in the modern education system, and it is based entirely on the internet. While surfing the internet, students may be distracted by offensive and irrelevant websites, and filters play a vital role in avoiding such distractions. This paper proposes a filter tool that performs web scraping of text data, data cleaning, natural language processing (NLP), and filtering of non-learning sites in real time. We collect text from paragraph, image, and video tags. The extracted text takes the form of sentences, which are part-of-speech (POS) tagged during NLP. Within NLP we use word sense disambiguation (WSD) to find the correct sense, in the context of the present sentence, of ambiguous words that may have multiple meanings. The tool creates a knowledge base of student-related sites using NLP and Support Vector Machine (SVM) classification, and we have created a keyword database of all learning sites. Lastly, the tool classifies sites into two categories, learning and non-learning, using the SVM.
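The scraping, cleaning, and keyword-matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the keyword set, the `TextScraper` class, and the 0.2 threshold are all hypothetical, and a simple keyword-ratio threshold stands in for the SVM classifier. POS tagging and word sense disambiguation are omitted. Only the Python standard library is used.

```python
from html.parser import HTMLParser
import re

# Hypothetical, tiny stand-ins for the paper's stop-word list and keyword database.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "for", "on", "with", "this"}
LEARNING_KEYWORDS = {"lecture", "course", "tutorial", "exam", "algebra", "syllabus", "physics"}


class TextScraper(HTMLParser):
    """Collects text inside <p> elements and alt/title attributes of <img> and <video>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
        elif tag in ("img", "video"):
            for name, value in attrs:
                if name in ("alt", "title") and value:
                    self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self.chunks.append(data)


def clean(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def keyword_score(tokens):
    """Fraction of cleaned tokens that appear in the learning keyword set."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in LEARNING_KEYWORDS)
    return hits / len(tokens)


def classify(html, threshold=0.2):
    """Label a page 'learning' or 'non-learning' (threshold is an assumed value)."""
    scraper = TextScraper()
    scraper.feed(html)
    tokens = clean(" ".join(scraper.chunks))
    return "learning" if keyword_score(tokens) >= threshold else "non-learning"
```

Calling `classify(page_html)` on the HTML of a page returns one of the two labels. In a fuller version, the keyword ratio would be replaced by feature vectors passed to a trained SVM, as the abstract describes.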