Classification of Requirements from Software Requirement Specification Documents
TAOS: A Novel TextAttack Oversampling Approach for Fine-Grained Classification of Software Requirements
Corresponding author: Boulbaba Ben Ammar
Abstract
Accurate classification of software requirements is a crucial task in Software Engineering (SE): it helps prioritize development efforts and supports the overall quality of the system across both Functional Requirements (FRs) and Non-Functional Requirements (NFRs). While most requirements classification research has so far focused on binary classification, fine-grained multi-class classification still faces extreme class imbalance in requirements datasets. To mitigate this limitation, we present TextAttack Oversampling (TAOS), a novel method that uses a Natural Language Processing (NLP)-based text augmentation technique to address this imbalance, thus reducing dependence on expensive expert labeling. An empirical assessment on a 12-class requirements dataset indicates that TAOS considerably outperforms standard classification techniques. Our approach achieves a 24% relative gain in F1-score, from 0.75 to 0.93, and improves performance on minority requirement classes that were previously undetectable. This study demonstrates the effectiveness of context-aware text augmentation in improving multi-class requirements classification, thus providing a proven method to enhance the reliability and usefulness of automated requirements analysis and management tools for software engineers.
Keywords:
software requirements specification, requirements engineering, text classification, Natural Language Processing (NLP), Bidirectional Encoder Representations from Transformers (BERT), data augmentation, class imbalance
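The oversampling idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which augmenter or settings TAOS uses, so `toy_augment`, the `SYNONYMS` table, the `oversample` helper, and the example labels `F` and `SE` are all hypothetical stand-ins. The sketch only shows the general pattern of generating augmented copies of minority-class requirements until every class matches the size of the largest one.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]  # (requirement text, class label)

# Hypothetical synonym table; a real pipeline would use a context-aware
# augmenter (for example one from the TextAttack library) instead.
SYNONYMS: Dict[str, List[str]] = {
    "system": ["application", "product"],
    "shall": ["must"],
    "user": ["operator"],
    "display": ["show"],
}

def toy_augment(text: str, rng: random.Random) -> str:
    """Swap one known word for a synonym; return text unchanged if none match."""
    words = text.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in SYNONYMS]
    if not candidates:
        return text
    i = rng.choice(candidates)
    words[i] = rng.choice(SYNONYMS[words[i].lower()])
    return " ".join(words)

def oversample(
    samples: List[Sample],
    augment: Callable[[str, random.Random], str],
    seed: int = 0,
) -> List[Sample]:
    """Add augmented copies of minority-class texts until all classes are equal in size."""
    rng = random.Random(seed)
    by_label: Dict[str, List[str]] = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = list(samples)  # originals are kept as-is
    for label, texts in by_label.items():
        for k in range(target - len(texts)):
            source = texts[k % len(texts)]  # cycle through the class's originals
            balanced.append((augment(source, rng), label))
    return balanced

# Illustrative data only: three functional and one security requirement.
data: List[Sample] = [
    ("The system shall display the order history", "F"),
    ("The system shall export reports as PDF", "F"),
    ("The user shall be able to reset a password", "F"),
    ("The system shall encrypt all stored passwords", "SE"),
]
balanced = oversample(data, toy_augment)
counts = Counter(label for _, label in balanced)
print(counts)
```

In practice, augmentation of this kind would be applied to the training split only, so that augmented variants of a requirement never leak into the evaluation data. Any real augmenter can be plugged in by wrapping it to match the `augment(text, rng)` signature assumed here.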
License
Copyright (c) 2025 Boulbaba Ben Ammar, Noura F. Almatrafi

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
