A New Approach for Optimizing the Extraction of Association Rules
Received: 28 January 2023 | Revised: 19 February 2023 | Accepted: 23 February 2023 | Online: 8 March 2023
Corresponding author: Bilal Bouaita
Abstract
Association rule methods are among the most widely used approaches for Knowledge Discovery in Databases (KDD): they discover and extract hidden, meaningful relationships between attributes or items in large datasets and express them as rules. Algorithms that extract these rules require considerable time and large amounts of memory. This paper presents an algorithm that decomposes this complex problem into subproblems and processes items by category according to their support. Very frequent items and fairly frequent items are studied together. To evaluate its performance, the proposed algorithm was compared with Eclat and LCMFreq on two real transactional databases. The experimental results showed that the proposed algorithm was faster in execution time and efficient in memory consumption.
Keywords:
KDD, association rules, frequent itemset, data mining
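The abstract refers to item support and to grouping items into categories by support. As a rough illustration only, the sketch below computes the relative support of each item in a toy transaction set, splits the items into two categories, and computes the confidence of one rule. It is not the algorithm proposed in the paper: the function names, the toy data, and the two thresholds (min_support and high_support) are assumptions made for this example.

```python
from collections import Counter


def item_supports(transactions):
    """Relative support of each single item: fraction of transactions containing it."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c / n for item, c in counts.items()}


def categorize_items(supports, min_support, high_support):
    """Split items into two categories by support.

    The category boundaries are hypothetical; the paper's actual
    thresholds are not given in the abstract.
    """
    very = {i for i, s in supports.items() if s >= high_support}
    fairly = {i for i, s in supports.items() if min_support <= s < high_support}
    return very, fairly


def itemset_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def rule_confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = itemset_count(set(antecedent) | set(consequent), transactions)
    ante = itemset_count(antecedent, transactions)
    return both / ante if ante else 0.0


# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter", "eggs"},
]

supports = item_supports(transactions)
very, fairly = categorize_items(supports, min_support=0.3, high_support=0.7)
confidence = rule_confidence({"bread"}, {"milk"}, transactions)
```

With these assumed thresholds, bread and milk (each in 4 of 5 transactions) fall into the higher category, and butter and eggs into the lower one. The rule bread -> milk holds in 3 of the 4 transactions that contain bread.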
License
Copyright (c) 2023 Bilal Bouaita, Abdesselem Beghriche, Akram Kout, Abdelouahab Moussaoui
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.