Clustering Commuter Behavior based on Automated Fare Collection (AFC)

Authors

  • Dwijoko Purbohadi Department of Information Technology, Muhammadiyah University of Yogyakarta, Indonesia https://orcid.org/0000-0001-8009-9001
  • Laila Marifatul Azizah Department of Information Technology, Muhammadiyah University of Yogyakarta, Indonesia https://orcid.org/0000-0002-8308-1330
  • Lilis Kurniasari Department of Electrical Engineering, Nahdlatul Ulama University of Yogyakarta, Indonesia https://orcid.org/0000-0001-7703-9104
  • Novi Diah Wulandari Department of Management, Nahdlatul Ulama University of Yogyakarta, Indonesia
  • Nurna Pratiwi Department of Management, Nahdlatul Ulama University of Yogyakarta, Indonesia
  • Puji Hastuti Department of Information Technology, State Polytechnic of Jember, Indonesia
Volume: 15 | Issue: 1 | Pages: 19831-19837 | February 2025 | https://doi.org/10.48084/etasr.8899

Abstract

This paper examines the application of the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) method to cluster Automated Fare Collection (AFC) transaction data from train travelers in Jakarta, Bogor, Depok, Tangerang, and Bekasi (Jabodetabek), Indonesia. To enhance the clustering process, Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction and the DenseClus library are employed. Different combinations of hyperparameters are tested to identify the configuration that produces dense, clearly separated clusters. The results demonstrate that applying HDBSCAN to UMAP-reduced data effectively discerns distinct trip patterns and highlights notable differences in travel distance, time, and length among the clusters. The UMAP intersection method proved particularly effective at preserving the local structure of the data, which led to distinct and meaningful clusters. In addition, categorical data were transformed into numerical form using hashing techniques, which addresses the difficulties posed by a high number of categories and keeps data processing efficient. The results offer insights into the application of density-based clustering to complex transportation data, with implications for improving route planning and capacity management for Jabodetabek commuters.
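The abstract does not state which hashing scheme was used for the categorical fields. As an illustration only, the following minimal sketch shows one common option, the signed hashing trick, using only the Python standard library. The field names, the station names, the bucket count of 16, and the function name `hash_features` are all assumptions made for this example and are not taken from the paper.

```python
import hashlib


def hash_features(record: dict[str, str], n_buckets: int = 16) -> list[float]:
    """Encode categorical fields as a fixed-length numeric vector.

    Each "field=value" token is hashed to one of n_buckets positions, so the
    output length stays constant no matter how many distinct categories exist.
    """
    vec = [0.0] * n_buckets
    for field, value in record.items():
        token = f"{field}={value}"
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # The first four bytes of the digest choose the bucket.
        index = int.from_bytes(digest[:4], "big") % n_buckets
        # A separate byte chooses the sign, so colliding tokens tend to cancel
        # rather than accumulate.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    return vec


# Hypothetical AFC record: tap-in and tap-out stations (illustrative values).
trip = {"origin": "Bogor", "destination": "Jakarta Kota"}
features = hash_features(trip)
```

The resulting vector has the same length for every record, which is what allows a downstream dimensionality reduction and clustering step to accept station-like fields with many distinct values. A larger bucket count reduces collisions at the cost of a wider feature matrix.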

Keywords:

automated fare collection, public transportation, clustering algorithms, UMAP embedding, HDBSCAN

How to Cite

[1]
Purbohadi, D., Marifatul Azizah, L., Kurniasari, L., Diah Wulandari, N., Pratiwi, N. and Hastuti, P. 2025. Clustering Commuter Behavior based on Automated Fare Collection (AFC). Engineering, Technology & Applied Science Research. 15, 1 (Feb. 2025), 19831–19837. DOI:https://doi.org/10.48084/etasr.8899.
