A New Approach for Optimizing the Extraction of Association Rules

Bilal Bouaita; Abdesselem Beghriche; Akram Kout; Abdelouahab Moussaoui

doi:10.48084/etasr.5722

Authors

Bilal Bouaita, Ferhat Abbas Setif 1 University, Algeria
Abdesselem Beghriche, Ferhat Abbas Setif 1 University, Algeria
Akram Kout, MISC Laboratory, Ferhat Abbas Setif 1 University, Algeria
Abdelouahab Moussaoui, Ferhat Abbas Setif 1 University, Algeria

Volume: 13 | Issue: 2 | Pages: 10496-10500 | April 2023 | https://doi.org/10.48084/etasr.5722

Received: 28 January 2023 | Revised: 19 February 2023 | Accepted: 23 February 2023 | Online: 8 March 2023

Corresponding author: Bilal Bouaita

Abstract

Association rule mining is among the most widely used approaches to Knowledge Discovery in Databases (KDD), as it discovers hidden, meaningful relationships between attributes or items in large datasets and expresses them as rules. Algorithms that extract these rules require considerable time and memory. This paper presents an algorithm that decomposes this complex problem into subproblems and processes items by category according to their support. Very frequent items and fairly frequent items are studied together. To evaluate its performance, the proposed algorithm was compared with Eclat and LCMFreq on two real transactional databases. The experimental results showed that the proposed algorithm was faster in execution time and efficient in memory consumption.
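The idea of grouping items by support can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the thresholds, the number of categories, or the mining procedure, so the function names and the two cut-offs (`min_support`, `high_support`) are assumptions chosen only for illustration. Here, support is the fraction of transactions that contain an item.

```python
from collections import Counter


def item_supports(transactions):
    """Return each item's relative support (fraction of transactions containing it)."""
    # set(t) ensures an item is counted at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: count / n for item, count in counts.items()}


def categorize_items(supports, min_support, high_support):
    """Split items into three support categories.

    Both thresholds are hypothetical; the abstract does not specify them.
    """
    very_frequent = sorted(i for i, s in supports.items() if s >= high_support)
    fairly_frequent = sorted(
        i for i, s in supports.items() if min_support <= s < high_support
    )
    infrequent = sorted(i for i, s in supports.items() if s < min_support)
    return very_frequent, fairly_frequent, infrequent


# Toy transactional database (made up for this example).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"a", "b", "d"},
    {"b", "e"},
]

supports = item_supports(transactions)
very, fairly, rare = categorize_items(supports, min_support=0.4, high_support=0.8)

# The abstract states that very frequent and fairly frequent items are studied
# together, so this sketch merges them into one working set for later mining.
studied_together = very + fairly
```

On this toy data, items `a` and `b` fall into the very frequent category, `c` and `d` into the fairly frequent one, and `e` is set aside as infrequent. Each category can then be treated as a smaller subproblem than the full item set.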

Keywords:

KDD, association rules, frequent itemset, data mining
