Event Detection and Classification in Tweets using Deep Learning

Malika Noui; Abdelaziz Lakhfif; Mohamed Amin Laouadi

doi:10.48084/etasr.9238

Authors

Malika Noui, Department of Computer Science, Faculty of Sciences, Setif 1 University, Ferhat Abbas, Algeria
Abdelaziz Lakhfif, Department of Computer Science, Faculty of Sciences, Setif 1 University, Ferhat Abbas, Algeria
Mohamed Amin Laouadi, Department of Computer Science, Faculty of Sciences, Setif 1 University, Ferhat Abbas, Algeria

Volume: 15 | Issue: 1 | Pages: 19977-19982 | February 2025 | https://doi.org/10.48084/etasr.9238

Received: 14 October 2024 | Revised: 10 December 2024 | Accepted: 14 December 2024 | Online: 24 December 2024

Corresponding author: Malika Noui

Abstract

Online social networks have become important sources of information and contextual data in many areas of life, including finance, elections, social events, health, and sports. Recently, the detection and classification of useful events reported in tweets has attracted considerable interest. However, due to the inherent challenges associated with the nature of the events to be detected or classified, traditional approaches have not yielded satisfactory results. Deep learning-based word embedding representations, such as Word2Vec, GloVe, FastText, and BERT, have proven effective in improving detection performance because they take semantic context into account. This study proposes a model that stacks an LSTM on top of BERT representations to detect and classify events in tweets. To this end, a dataset of about 310,000 event-related tweets was collected and categorized into 50 event types based on a selected set of representative keywords. Multiple experiments were carried out on the collected dataset to evaluate the performance of the proposed model. The proposed model attained an overall accuracy greater than 94.3% and an F1 score of more than 90%, achieving state-of-the-art results in the classification of most of the event categories.
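
The keyword-based categorization mentioned above can be illustrated with a minimal sketch. It is not the authors' labeling pipeline: the event types, the keyword sets, and the helper names `tokenize` and `label_tweet` are hypothetical, and the tie-breaking rule (first category with the most keyword hits) is an assumption made for illustration only.

```python
import re

# Hypothetical event types and keywords; the 50 categories and the
# keyword sets actually used in the study are not listed in this abstract.
EVENT_KEYWORDS = {
    "earthquake": {"earthquake", "aftershock", "magnitude"},
    "election": {"election", "ballot", "vote"},
    "football": {"goal", "penalty", "match"},
}


def tokenize(text):
    """Lowercase the tweet and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def label_tweet(text, event_keywords=EVENT_KEYWORDS):
    """Return the event type with the most keyword hits, or None if no keyword matches."""
    tokens = tokenize(text)
    best_label, best_hits = None, 0
    for label, keywords in event_keywords.items():
        hits = len(tokens & keywords)
        # Strictly greater: on a tie, the category listed first is kept.
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label
```

Tweets labeled this way could then serve as training examples for a classifier such as the BERT-plus-LSTM model described in the abstract; tweets that match no keyword receive `None` and would be left out.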

Keywords: useful event detection, social media data, deep learning, BERT, LSTM


