A Hybrid Machine Learning Model for Market Clustering
Received: 13 October 2024 | Revised: 31 October 2024 | Accepted: 3 November 2024 | Online: 18 November 2024
Corresponding author: Rendra Gustriansyah
Abstract
Market clustering is increasingly important for companies seeking to understand consumer shopping behavior in complex data. This study develops a hybrid model that integrates Principal Component Analysis (PCA) and k-medoids to improve market clustering based on consumer shopping patterns. The method comprises data preprocessing, PCA for dimensionality reduction, and clustering with k-medoids. Cluster quality is evaluated with several validity indices. The results show that the hybrid model produces higher-quality clusters than k-medoids alone, as measured by the Calinski-Harabasz Index (CHI), the Silhouette Width (SW), and the Davies-Bouldin (DB) index. These findings underline the value of hybrid methods in marketing: a better understanding of consumer behavior dynamics allows companies to adjust their marketing strategies more effectively. The study provides a strong foundation for further development of clustering analysis across industry sectors and highlights the potential of innovative techniques to address dynamic market challenges.
Keywords:
market clustering, principal component analysis, k-medoids, dimensionality reduction, consumer behavior
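The pipeline outlined in the abstract (preprocessing, PCA, k-medoids, then a validity index) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the number of retained components, the distance measure, the k-medoids variant, or the dataset, so the choices below (standardization, two components, Euclidean distance, a simple alternating update rather than a full PAM swap search, synthetic data) are assumptions, and the function names `pca_reduce`, `k_medoids`, and `silhouette_width` are hypothetical. Only the Silhouette Width is computed here; CHI and the DB index would be evaluated on the same labels.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Standardize the columns, then project onto the top principal components."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = Xc / std
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def pairwise_distances(X):
    """Euclidean distance matrix (assumed metric)."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)


def k_medoids(X, k, n_iter=100, seed=0):
    """Alternating k-medoids: assign points, then re-pick each cluster's medoid."""
    rng = np.random.default_rng(seed)
    D = pairwise_distances(X)
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break  # converged
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    return labels, medoids


def silhouette_width(X, labels):
    """Mean silhouette over all points (requires at least two clusters)."""
    D = pairwise_distances(X)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters score 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


# Synthetic stand-in for shopping-pattern features (three groups, six variables).
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(c, 0.5, size=(30, 6)) for c in (0.0, 4.0, 8.0)])

# Single method: k-medoids on the standardized full feature space.
full = pca_reduce(data, data.shape[1])
labels_single, _ = k_medoids(full, k=3)

# Hybrid method: k-medoids on the first two principal components.
reduced = pca_reduce(data, 2)
labels_hybrid, _ = k_medoids(reduced, k=3)

print("SW, k-medoids only:", silhouette_width(full, labels_single))
print("SW, PCA + k-medoids:", silhouette_width(reduced, labels_hybrid))
```

Each index is computed in the space where the clustering was performed, which is one possible way to set up the comparison; the abstract does not say which convention the study uses. On real data the number of retained components would normally be chosen from the explained variance rather than fixed at two.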
License
Copyright (c) 2024 Rendra Gustriansyah, Juhaini Alie, Nazori Suhandi
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.