Deep Representation Learning for Effective Clustering of Short Persian Texts
Received: 16 March 2025 | Revised: 7 April 2025 | Accepted: 12 April 2025 | Online: 29 May 2025
Corresponding author: Mohammad-Reza Feizi-Derakhshi
Abstract
Short text clustering poses several challenges because of the limited contextual information available, especially for low-resource languages such as Persian. This study proposes a novel deep clustering architecture in which an RNN-based autoencoder learns a latent representation of the text while preserving its rich structural features. The architecture includes a second network, the Representation network, which maximizes the distance between clusters, minimizes cluster overlap, and thereby improves clustering in the latent space. A two-phase training approach is used: the model is first trained with the autoencoder reconstruction loss and then jointly optimized for improved cluster separation. Experiments with different embedding types were carried out, and the evaluation results showed that the proposed method outperformed previous approaches. The proposed model constitutes a meaningful advance in representation learning and training for the short-text domain.
Keywords:
short text clustering, RNN-based autoencoder, deep representation learning
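The two-phase scheme described in the abstract can be sketched in a few lines. This is a minimal illustrative stand-in, not the paper's implementation: a tied-weight *linear* autoencoder replaces the RNN-based autoencoder, and a k-means-style centroid-distance penalty stands in for the Representation network's cluster-separation objective. All data, dimensions, and hyperparameters below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: three Gaussian blobs playing the role of embedded short texts.
n_per, dim, latent_dim, n_clusters = 40, 20, 2, 3
centers = rng.normal(scale=2.0, size=(n_clusters, dim))
X = np.vstack([c + rng.normal(scale=0.3, size=(n_per, dim)) for c in centers])

# Tied-weight linear autoencoder: encoder z = xW, decoder x_hat = zW^T.
W = rng.normal(scale=0.1, size=(dim, latent_dim))

def recon_loss(X, W):
    """Mean squared reconstruction error."""
    Z = X @ W
    return float(np.mean((X - Z @ W.T) ** 2))

def recon_grad(X, W):
    """Gradient of the reconstruction loss w.r.t. the tied weights W."""
    E = X - X @ W @ W.T
    return -2.0 * (X.T @ E @ W + E.T @ X @ W) / X.size

lr = 0.02

# Phase 1: pre-train with the reconstruction loss only.
loss_before = recon_loss(X, W)
for _ in range(300):
    W -= lr * recon_grad(X, W)
loss_after_phase1 = recon_loss(X, W)

# Phase 2: jointly optimize reconstruction plus a cluster-separation term
# (squared distance of each latent point to its assigned centroid).
lam = 0.1
Z = X @ W
centroids = Z[rng.choice(len(Z), n_clusters, replace=False)]
for _ in range(200):
    Z = X @ W
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    target = centroids[assign]
    cluster_grad = 2.0 * X.T @ (Z - target) / Z.size
    W -= lr * (recon_grad(X, W) + lam * cluster_grad)
    # k-means-style centroid update on the current latent codes.
    for k in range(n_clusters):
        if np.any(assign == k):
            centroids[k] = Z[assign == k].mean(axis=0)
```

In the paper's architecture, the analogous roles are played by the RNN autoencoder (reconstruction) and the Representation network (separation); the joint phase here simply sums the two gradients with a weighting factor `lam`.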
References
J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, 1967, vol. 5, pp. 281–298.
D. Birant and A. Kut, "ST-DBSCAN: An algorithm for clustering spatial–temporal data," Data & Knowledge Engineering, vol. 60, no. 1, pp. 208–221, Jan. 2007.
C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
M. Steinbach, L. Ertöz, and V. Kumar, "The Challenges of Clustering High Dimensional Data," in New Directions in Statistical Physics: Econophysics, Bioinformatics, and Pattern Recognition, L. T. Wille, Ed. Springer, 2004, pp. 273–309.
J. Xie, R. Girshick, and A. Farhadi, "Unsupervised Deep Embedding for Clustering Analysis," in Proceedings of The 33rd International Conference on Machine Learning, Jun. 2016, pp. 478–487.
X. Guo, L. Gao, X. Liu, and J. Yin, "Improved deep embedded clustering with local structure preservation," in Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, May 2017, pp. 1753–1759.
B. Yang, X. Fu, N. D. Sidiropoulos, and M. Hong, "Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering," in Proceedings of the 34th International Conference on Machine Learning, Jul. 2017, pp. 3861–3870.
J. Yang, D. Parikh, and D. Batra, "Joint Unsupervised Learning of Deep Representations and Image Clusters," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 5147–5156.
J. Chang, L. Wang, G. Meng, S. Xiang, and C. Pan, "Deep Adaptive Image Clustering," in 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, Oct. 2017, pp. 5880–5888.
A. Hadifar, L. Sterckx, T. Demeester, and C. Develder, "A Self-Training Approach for Short Text Clustering," in Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), Florence, Italy, 2019, pp. 194–199.
K. Zhang, Z. Lian, J. Li, H. Li, and X. Hu, "Short Text Clustering with a Deep Multi-embedded Self-supervised Model," in Artificial Neural Networks and Machine Learning – ICANN 2021, 2021, pp. 150–161.
M. Hao, W. Wang, and F. Zhou, "Joint Representations of Texts and Labels with Compositional Loss for Short Text Classification," Journal of Web Engineering, vol. 20, no. 3, pp. 669–688, Feb. 2021.
H. Yin, X. Song, S. Yang, G. Huang, and J. Li, "Representation Learning for Short Text Clustering," in Web Information Systems Engineering – WISE 2021, 2021, pp. 321–335.
C. Wei, L. Zhu, and J. Shi, "Short Text Embedding Autoencoders With Attention-Based Neighborhood Preservation," IEEE Access, vol. 8, pp. 223156–223171, 2020.
P. Dahal, "Learning Embedding Space for Clustering From Deep Representations," in 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, Dec. 2018, pp. 3747–3755.
Y. Ren et al., "Deep Clustering: A Comprehensive Survey," IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 4, pp. 5858–5878, Apr. 2025.
S. Zhou et al., "A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions," ACM Computing Surveys, vol. 57, no. 3, Aug. 2024, Art. no. 69.
D. Chen, J. Lv, and Z. Yi, "Unsupervised Multi-Manifold Clustering by Learning Deep Representation," presented at the AAAI-17 Workshop on Crowdsourcing, Deep Learning, and Artificial Intelligence Agents, 2017.
F. Li, H. Qiao, and B. Zhang, "Discriminatively boosted image clustering with fully convolutional auto-encoders," Pattern Recognition, vol. 83, pp. 161–173, Nov. 2018.
M. Kumar, B. Packer, and D. Koller, "Self-Paced Learning for Latent Variable Models," in Advances in Neural Information Processing Systems, 2010, vol. 23.
Y. Ren, N. Wang, M. Li, and Z. Xu, "Deep density-based image clustering," Knowledge-Based Systems, vol. 197, Jun. 2020, Art. no. 105841.
X. Yang, C. Deng, F. Zheng, J. Yan, and W. Liu, "Deep Spectral Clustering Using Dual Autoencoder Network," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, Jun. 2019, pp. 4061–4070.
S. Affeldt, L. Labiod, and M. Nadif, "Spectral clustering via ensemble deep autoencoder learning (SC-EDAE)," Pattern Recognition, vol. 108, Dec. 2020, Art. no. 107522.
S. Hosseini and Z. A. Varzaneh, "Deep text clustering using stacked AutoEncoder," Multimedia Tools and Applications, vol. 81, no. 8, pp. 10861–10881, Mar. 2022.
M. Farahani, M. Gharachorloo, M. Farahani, and M. Manthouri, "ParsBERT: Transformer-based Model for Persian Language Understanding," Neural Processing Letters, vol. 53, no. 6, pp. 3831–3847, Dec. 2021.
E. Zafarani-Moattar, M. R. Kangavari, and A. M. Rahmani, "Neural Network Meaningful Learning Theory and its Application for Deep Text Clustering," IEEE Access, vol. 12, pp. 42411–42422, 2024.
M. Molaei and D. Mohamadpur, "Distributed Online Pre-Processing Framework for Big Data Sentiment Analytics," Journal of AI and Data Mining, vol. 10, no. 2, pp. 197–205, Apr. 2022.
J. Pennington, R. Socher, and C. D. Manning, "GloVe: Global Vectors for Word Representation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532–1543.
P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching Word Vectors with Subword Information," Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, Jun. 2017.
P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, Apr. 2008, pp. 1096–1103.
M. Abadi et al., "TensorFlow: A System for Large-Scale Machine Learning," presented at the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283. [Online]. Available: https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi.
M. Ranjbar-Khadivi, M.-R. Feizi-Derakhshi, A. Forouzandeh, P. Gholami, A.-R. Feizi-Derakhshi, and E. Zafarani-Moattar, "Sep_TD_Tel01," Mendeley Data, Jan. 6, 2022.
P. Gholami-Dastgerdi, M. R. Feizi-Derakhshi, and P. Salehpour, "SSKG: Subject stream knowledge graph, a new approach for event detection from text," Ain Shams Engineering Journal, vol. 15, no. 12, Dec. 2024, Art. no. 103040.
M. Ranjbar-Khadivi, S. Akbarpour, M. R. Feizi-Derakhshi, and B. Anari, "A Human Word Association Based Model for Topic Detection in Social Networks," Annals of Data Science, Jul. 2024.
A. Forouzandeh, M. R. Feizi-Derakhshi, and P. Gholami-Dastgerdi, "Persian Named Entity Recognition by Gray Wolf Optimization Algorithm," Scientific Programming, vol. 2022, no. 1, 2022, Art. no. 6368709.
X. Liu et al., "Emotion classification for short texts: an improved multi-label method," Humanities and Social Sciences Communications, vol. 10, no. 1, pp. 1–9, Jun. 2023.
X. Liu et al., "Developing Multi-Labelled Corpus of Twitter Short Texts: A Semi-Automatic Method," Systems, vol. 11, no. 8, Aug. 2023, Art. no. 390.
X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Jun. 2011, pp. 315–323.
I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in Proceedings of the 30th International Conference on Machine Learning, May 2013, pp. 1139–1147.
D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization." arXiv, Jan. 30, 2017.
H. W. Kuhn, "The Hungarian method for the assignment problem," Naval Research Logistics Quarterly, vol. 2, no. 1–2, pp. 83–97, 1955.
A. Alqahtani, H. Alhakami, T. Alsubait, and A. Baz, "A Survey of Text Matching Techniques," Engineering, Technology & Applied Science Research, vol. 11, no. 1, pp. 6656–6661, Feb. 2021.
S. Rezaei et al., "An experimental study of sentiment classification using deep-based models with various word embedding techniques," Journal of Experimental & Theoretical Artificial Intelligence.
License
Copyright (c) 2025 Mahdi Molaei, Mohammad Reza Feizi-Derakhshi, Ali-Akbar Rasooly, Cina Motamed

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.