Clustering Commuter Behavior based on Automated Fare Collection (AFC)
Received: 3 September 2024 | Revised: 12 November 2024 and 18 November 2024 | Accepted: 21 November 2024 | Online: 2 February 2025
Corresponding author: Dwijoko Purbohadi
Abstract
This paper examines the application of Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to Automated Fare Collection (AFC) transaction data from train travelers in Jakarta, Bogor, Depok, Tangerang, and Bekasi (Jabodetabek), Indonesia. To enhance the clustering process, Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction and the DenseClus library are employed. Different combinations of hyperparameters are tested to identify the configuration that produces dense, clearly separated clusters. The results demonstrate that applying HDBSCAN to UMAP-reduced data effectively discerns distinct trip patterns and highlights notable differences in travel distance, time, and length among clusters. The UMAP intersection method was notably effective at preserving the local structure of the data, yielding distinct and meaningful clusters. In addition, categorical data were transformed into numerical form using hashing techniques, which addresses the difficulty posed by a high number of categories and keeps data processing efficient. The results offer insights into the application of density-based clustering to complex transportation data, with implications for improving route planning and capacity management for Jabodetabek commuters.
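The abstract does not say which hashing technique was used, so the following is only a minimal sketch of the general idea (the "hashing trick"), assuming station names as the high-cardinality categorical field. The function names, the bucket count, and the sample trip record are hypothetical and are not taken from the paper.

```python
import hashlib


def hash_category(value: str, n_buckets: int = 32) -> int:
    """Map a category label to a stable bucket index in [0, n_buckets)."""
    # md5 is used only as a deterministic hash, not for security.
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hash_encode(values, n_buckets: int = 32):
    """Encode a list of category labels as a fixed-width count vector."""
    vec = [0] * n_buckets
    for v in values:
        vec[hash_category(v, n_buckets)] += 1
    return vec


# Hypothetical AFC trip record: tap-in and tap-out station names.
trip = ["origin=Bogor", "destination=Jakarta Kota"]
encoded = hash_encode(trip, n_buckets=16)
```

The vector width stays fixed however many distinct stations appear, which is what makes this kind of encoding practical for fields with many categories. The trade-off is that two different labels can collide in the same bucket, so the bucket count would need to be chosen with the number of categories in mind.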
Keywords:
automated fare collection, public transportation, clustering algorithms, UMAP embedding, HDBSCAN
License
Copyright (c) 2024 Dwijoko Purbohadi, Laila Marifatul Azizah, Lilis Kurniasari, Novi Diah Wulandari, Nurna Pratiwi, Puji Hastuti

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.