
Novel Dataset Generation for Indian Brinjal Plant Using Image Data Augmentation
Author(s) -
Balwant Gorad,
S. Kotrappa
Publication year - 2021
Publication title -
IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/1065/1/012041
Subject(s) - overfitting , computer science , artificial intelligence , deep learning , zoom , the internet , field (mathematics) , machine learning , data mining , scale invariant feature transform , quality (philosophy) , image (mathematics) , pattern recognition (psychology) , artificial neural network , mathematics , philosophy , epistemology , world wide web , petroleum engineering , pure mathematics , engineering , lens (geology)
Machine learning and deep learning have performed outstandingly on many computing tasks and have been applied to real-world problems such as disease prediction in plants. Unfortunately, many research areas, including disease prediction in agricultural crops, lack large, good-quality real-world datasets. One way around this is to use a dataset available on the internet, but doing so creates several issues. The major ones are that a dataset collected in one geographical location is used for a model deployed in another, and that a small dataset leads to model overfitting. The main purpose of this paper is to present different techniques for increasing the size of the Indian Brinjal dataset so that deep learning models can be improved. Data augmentation techniques are used to enlarge the small image dataset: rotation, channel shift, width shift, height shift, shear transform, brightness, scaling, uniform aspect ratio, zoom, horizontal flipping, and vertical flipping. Finally, a large, high-quality training dataset of 39,010 images is generated from 350 sample images taken in the real field, and 1,356 high-quality images are generated to validate the model, using the data augmentation techniques mentioned above.
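
To make the idea of augmentation concrete, the following is a minimal, dependency-free sketch of four of the listed transforms (horizontal flip, vertical flip, width shift, brightness) applied to a tiny grayscale image stored as nested lists. The abstract does not say which library or parameter ranges the authors used, so every function name, the `Image` type alias, the shift range of one pixel, and the brightness range of 0.8 to 1.2 are assumptions chosen for illustration only. A real pipeline would more likely operate on full-colour images with an image-processing library.

```python
import random
from typing import List

# Hypothetical representation: a grayscale image as rows of 0-255 pixel values.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def shift_width(img: Image, dx: int, fill: int = 0) -> Image:
    """Shift pixels dx columns right (left if negative), padding with fill."""
    width = len(img[0])
    shifted = []
    for row in img:
        if dx >= 0:
            shifted.append(([fill] * dx + row)[:width])
        else:
            shifted.append((row + [fill] * -dx)[-width:])
    return shifted


def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale pixel intensities by factor, clipping to the 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]


def augment(img: Image, copies: int, seed: int = 0) -> List[Image]:
    """Return `copies` randomly transformed variants of img (ranges are assumed)."""
    rng = random.Random(seed)
    variants = []
    for _ in range(copies):
        out = img
        if rng.random() < 0.5:
            out = flip_horizontal(out)
        if rng.random() < 0.5:
            out = flip_vertical(out)
        out = shift_width(out, rng.randint(-1, 1))
        out = adjust_brightness(out, rng.uniform(0.8, 1.2))
        variants.append(out)
    return variants
```

Calling `augment(img, copies=5)` on one field image yields five variants that keep the original dimensions; applying such a function to each source image is how a few hundred photographs can be expanded into a much larger training set. The remaining transforms in the list (rotation, shear, zoom, channel shift, and so on) follow the same pattern but need interpolation, which is why a dedicated imaging library is normally used.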