A Context-Enhanced Model for Fake News Detection
Received: 7 October 2024 | Revised: 12 November 2024 | Accepted: 16 November 2024 | Online: 22 December 2024
Corresponding author: Majdi Beseiso
Abstract
News published on social networks has a notable impact on people's perceptions of various topics. However, not all news available on social media is genuine, and some of it comes from unverified sources. The prevalence of fake news is a pressing concern that needs to be addressed effectively. This study presents an ensemble algorithm to improve fake news detection tools. A Long Short-Term Memory (LSTM) model and an ensemble of LSTM and Convolutional Neural Networks (CNN) were used. The proposed model used bidirectional LSTM layers and 2D convolutional layers with kernel sizes of 2, 3, and 4 to capture 2-gram, 3-gram, and 4-gram tokens. The LSTM and CNN-LSTM models achieved accuracies of 96.7% and 97.3%, respectively, on a fake news dataset, a significant improvement over the maximum accuracy of 94.88% reported in a previous study. Embedding layers yielded significant improvements when paired with extended word sequences and pre-trained embedding vectors. Diverse tokenization methods, with and without pre-trained embedding layers, were also considered. On the Liar dataset, the ensemble model achieved a 10.03% improvement in predictive accuracy, compared to the 6.08% improvement reported in a previous study using the same dataset.
Keywords:
convolutional neural networks, fake news, word2vec embedding, natural language processing, long short-term memory
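The abstract pairs convolution kernel sizes of 2, 3, and 4 with 2-gram, 3-gram, and 4-gram tokens. The following is a minimal pure-Python sketch of that correspondence, not the authors' implementation. It assumes each kernel spans the full embedding width, so that a kernel of height k covers exactly k consecutive tokens. The ReLU activation, the max-over-time pooling step, the sequence length, the embedding dimension, and the random values are all illustrative assumptions; the bidirectional LSTM layers are not shown.

```python
import random

def ngram_conv_features(embedded, kernel, k):
    """Slide a (k x dim) kernel along the token axis.

    Each output value summarizes one window of k consecutive tokens,
    i.e. one k-gram. ReLU is an assumed activation, not from the paper.
    """
    features = []
    for start in range(len(embedded) - k + 1):
        total = 0.0
        for i in range(k):
            for j, value in enumerate(embedded[start + i]):
                total += value * kernel[i][j]
        features.append(max(total, 0.0))
    return features

def max_over_time(features):
    """Keep the strongest k-gram response (assumed pooling step)."""
    return max(features)

random.seed(0)
seq_len, dim = 10, 8  # hypothetical sequence length and embedding size

# Stand-in for the output of an embedding layer: one vector per token.
embedded = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]

kernel_sizes = (2, 3, 4)  # kernel heights named in the abstract
kernels = {
    k: [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(k)]
    for k in kernel_sizes
}

# One feature map per kernel size; its length is seq_len - k + 1.
feature_maps = {k: ngram_conv_features(embedded, kernels[k], k) for k in kernel_sizes}

# One pooled value per kernel size, ready to be concatenated downstream.
pooled = [max_over_time(feature_maps[k]) for k in kernel_sizes]
```

With a 10-token input, the three feature maps contain 9, 8, and 7 values, one per 2-gram, 3-gram, and 4-gram window. A trained model would learn many such kernels per size rather than the single random kernel used here.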
License
Copyright (c) 2024 Majdi Beseiso, Saleh Al-Zahrani
This work is licensed under a Creative Commons Attribution 4.0 International License.