Leveraging Convolutional Neural Network (CNN)-based Auto Encoders for Enhanced Anomaly Detection in High-Dimensional Datasets
Received: 8 August 2024 | Revised: 29 August 2024 and 20 September 2024 | Accepted: 22 September 2024 | Online: 5 October 2024
Corresponding author: Hamayun Khan
Abstract
This study presents an Auto-Encoder Convolutional Neural Network (AECNN) approach for anomaly detection in high-dimensional datasets. Unsupervised learning algorithms have a strong theoretical foundation and are widely used for this task, but several limitations significantly reduce their performance. This study proposes an algorithm to address these limitations. The proposed AECNN combines multiple convolutional layers with feature extraction, dimensionality reduction, and data preprocessing. Its performance was evaluated on a large real-world benchmark dataset using accuracy, precision, recall, and F1-score. The proposed CNN-based autoencoder distinguished anomalies with an AUC score of 0.83 and achieved strong accuracy, precision, recall, and F1-score.
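To make the evaluation described in the abstract concrete, the following is a minimal sketch of how reconstruction-error scoring and the listed metrics are commonly computed for an autoencoder-based detector. It is not the authors' implementation: the function names, the fixed threshold, and the toy data are assumptions for illustration only, and the reconstructions are taken as given rather than produced by a trained AECNN.

```python
from typing import List, Sequence, Tuple


def mse_scores(inputs: Sequence[Sequence[float]],
               reconstructions: Sequence[Sequence[float]]) -> List[float]:
    """Per-sample mean squared error between each input and its reconstruction."""
    return [
        sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        for x, x_hat in zip(inputs, reconstructions)
    ]


def flag_anomalies(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a sample 1 (anomaly) when its error exceeds the threshold, else 0."""
    return [1 if s > threshold else 0 for s in scores]


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1-score with anomalies (label 1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): the third sample is reconstructed poorly.
    x = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
    x_hat = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    labels = [0, 0, 1]

    errors = mse_scores(x, x_hat)
    predictions = flag_anomalies(errors, threshold=1.0)  # threshold is an assumption
    print(errors, predictions)
    print(precision_recall_f1(labels, predictions), roc_auc(labels, errors))
```

In practice the threshold is usually chosen from the error distribution on normal training data (for example, a high percentile) rather than fixed by hand, whereas AUC is threshold-free because it depends only on how the scores rank the samples.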
Keywords:
autoencoders, anomaly detection, high-dimensional data, machine learning, data analysis, model evaluation, Convolutional Neural Networks (CNNs), NSL-KDD, UNSW-NB15, MSE
License
Copyright (c) 2024 M. Aetsam Javed, Madiha Anjum, Hassan A. Ahmed, Arshad Ali, H. M. Shahzad, Hamayun Khan, Abdulaziz M. Alshahrani
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.