Open Access
A Hybrid Technique for Health Insurance Fraud Detection on Highly Imbalanced Dataset
Author(s) -
V Ilango
Publication year - 2019
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijrte.f1210.0886s19
Subject(s) - random forest, computer science, data mining, insurance fraud, task (project management), machine learning, artificial intelligence, engineering, finance, business, systems engineering
The health insurance industry produces a massive amount of heterogeneous data, and detecting fraud in these data is a challenging task. Highly imbalanced data poses a major challenge for insurance data analysis, and classifying such data is a critical issue for fraud detection methods: fraudulent records make up less than 10% of the whole data. In this study, we use highly imbalanced data and propose a hybrid method for addressing the class imbalance problem that combines SMOTE, cross-validation, and Random Forest. We applied various sampling techniques to Medicare data and then built a classification model. We observed that SMOTE combined with Random Forest and cross-validation produced excellent results. Because the model should identify all relevant (fraud) instances, it should have a high recall. SMOTE with Random Forest achieved an average recall of 86% and an overall accuracy of 90%, which can be considered good relative to existing models.
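The two ideas the abstract relies on, SMOTE oversampling and recall as the target metric, can be illustrated with a minimal standard-library sketch. This is not the author's implementation: the function names `smote`, `recall`, and `accuracy`, the toy two-feature points, and the parameter `k=3` are all assumptions made for illustration. In practice a library such as imbalanced-learn (`imblearn.over_sampling.SMOTE`) would typically be used together with a Random Forest classifier.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: build n_new synthetic minority points by
    interpolating between a minority sample and one of its k nearest
    minority neighbors (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of base, excluding base itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position on the segment base -> neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive (fraud) cases that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all cases that were predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data (assumed): 4 fraud points versus 36 legitimate ones.
fraud = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
n_legit = 36
synthetic_fraud = smote(fraud, n_new=n_legit - len(fraud))
balanced_fraud = fraud + synthetic_fraud  # now as many fraud as legit points

# Why recall matters: with 10% fraud, a model that never flags fraud
# still reaches 90% accuracy while catching no fraud at all.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
print(accuracy(y_true, y_pred), recall(y_true, y_pred))  # prints 0.9 0.0
```

When oversampling is combined with cross-validation, a commonly recommended design is to generate synthetic samples only from the training portion of each fold, so that the held-out fold contains real records only. The abstract does not say how the study ordered these steps.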
