Open Access
An Ensemble Architecture Based on Deep Learning Model for Click Fraud Detection in Pay-Per-Click Advertisement Campaign
Author(s) -
Amreen Batool,
Yung-Cheol Byun
Publication year - 2022
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2022.3211528
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
With the rapid development of online advertising, click fraud has become a serious issue for the internet market. Click fraud is a dishonest attempt to increase a website’s profit or deplete an advertiser’s budget by clicking on pay-per-click advertisements. This illegal act has long been a threat to industrial sectors. As a result, businesses hesitate to advertise their items on mobile apps and websites, as numerous groups attempt to take advantage of them. For businesses to advertise their services and products online safely, a robust mechanism for efficient click fraud detection is needed. To tackle this issue, an ensemble architecture of machine learning and deep learning is proposed to detect click fraud in online advertisement campaigns. In the proposed ensemble architecture, a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BiLSTM) are used to extract hidden features, while a Random Forest (RF) is used for classification. The main objective of the study is to develop a hybrid DL model that automatically extracts features from click data and passes them to an RF classifier, which assigns each click to one of two classes: fraudulent or non-fraudulent. Furthermore, a preprocessing module is developed that handles categorical attributes and imbalanced data to enhance the reliability and consistency of the click data. In addition, several evaluation criteria are used to evaluate and compare the performance of the proposed CNN-BiLSTM-RF with other ensemble and standalone models. The experimental results indicate that our ensemble architecture achieved an accuracy of 99.19 ± 0.08%, precision of 99.89 ± 0.03%, sensitivity of 98.50 ± 0.11%, F1-score of 99.19 ± 0.08%, and specificity of 99.89 ± 0.03%. Furthermore, our proposed architecture produced superior results compared to the other developed ensemble and conventional models.
Moreover, our proposed ensemble architecture can be used as a safeguard against click fraud in pay-per-click advertising, helping industries promote their products safely and reliably.
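
The abstract reports five evaluation criteria (accuracy, precision, sensitivity, F1-score, and specificity) without restating their definitions. The following minimal sketch shows how these quantities are conventionally computed from binary confusion-matrix counts, assuming that fraudulent clicks are the positive class (label 1). The names `ConfusionCounts`, `confusion_counts`, and `binary_metrics` are hypothetical and chosen for illustration only; this is not the authors' evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # fraudulent clicks predicted as fraudulent
    fp: int  # legitimate clicks predicted as fraudulent
    tn: int  # legitimate clicks predicted as legitimate
    fn: int  # fraudulent clicks predicted as legitimate


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionCounts:
    """Tally TP/FP/TN/FN; `positive` is assumed to mark fraudulent clicks."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions of the five criteria named in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    for name, value in binary_metrics(confusion_counts(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are computed separately for each class, so together they show how a detector behaves on fraudulent and legitimate clicks individually. This matters for click data, where the abstract notes that class imbalance has to be handled.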