Digitizing Karachi's Decades-Old Cadastral Maps: Leveraging Unsupervised Machine Learning and GEOBIA for Digitization
Received: 18 March 2024 | Revised: 7 April 2024 | Accepted: 14 April 2024 | Online: 1 August 2024
Corresponding author: Muhammad Waqas Ahmed
Abstract
In urban planning, understanding land-use change is paramount for ensuring sustainable urban ecosystems. Monitoring, analyzing, and quantifying land-use change is crucial to making statistical inferences and predicting the economic, environmental, and societal impacts of urban expansion. Recent technologies have enabled robust monitoring, recording, and documenting of spatio-temporal trends. When historical data remain nondigital, integrating modern technologies with traditional paper-based town maps becomes invaluable for digitization. Despite significant efforts in this field, the potential of Geographic Object-Based Image Analysis (GEOBIA) for digitizing paper-based cadastral maps remains largely unexplored. This study introduces an innovative approach that uses two unsupervised learning algorithms, K-means and Gaussian Mixture Models (GMM), in conjunction with GEOBIA techniques to accurately extract land parcels from decades-old cadastral maps of Karachi, Pakistan. Initially, the maps were georeferenced using ArcGIS software, and the unsupervised machine-learning algorithms were applied to the preprocessed scanned images. Both clustering algorithms were evaluated on three key performance metrics: precision, recall, and F1 score. The experimental results indicated that both algorithms performed well, with GMM slightly outperforming K-means on all three metrics. GMM achieved a precision and recall of 0.87 and an F1 score of 0.86, while K-means achieved 0.82 precision, 0.78 recall, and a 0.78 F1 score. Finally, unwanted features were removed by applying a geometric criterion based on feature size and shape. This methodology effectively distinguishes between adjoining land parcels and ensures precise extraction of cadastral boundaries and land parcels, providing a reliable foundation for urban research and modeling.
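The abstract does not specify the geometric criterion used to remove unwanted features, so the following is only a minimal sketch of how a size-and-shape filter of this kind could look. It assumes candidate features are simple polygons given as vertex lists, measures size as polygon area (shoelace formula), and measures shape as Polsby-Popper compactness. The function names, the compactness measure, and both thresholds are hypothetical choices for illustration, not values or methods taken from the study.

```python
import math

def polygon_area(points):
    """Shoelace formula: absolute area of a simple polygon given as (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of edge lengths, closing the ring back to the first vertex."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def compactness(points):
    """Polsby-Popper score 4*pi*A / P^2: 1.0 for a circle, near 0 for thin slivers."""
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)

# Hypothetical thresholds in map units; the study's actual values are not given.
MIN_AREA = 50.0
MIN_COMPACTNESS = 0.3

def keep_parcel(points, min_area=MIN_AREA, min_compactness=MIN_COMPACTNESS):
    """Keep a feature only if it is large enough and compact enough."""
    return polygon_area(points) >= min_area and compactness(points) >= min_compactness

# Illustrative candidates: a plausible parcel, a small speck, and a thin line fragment.
candidates = {
    "parcel": [(0, 0), (20, 0), (20, 10), (0, 10)],
    "speck": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "line_fragment": [(0, 0), (100, 0), (100, 1), (0, 1)],
}

kept = [name for name, pts in candidates.items() if keep_parcel(pts)]
print(kept)
```

In this sketch the speck fails the area test and the line fragment fails the compactness test, which mirrors the idea of discarding text remnants and boundary-line residue while retaining parcel-shaped polygons. In practice the thresholds would need tuning against the map scale and scan resolution.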
Keywords:
Feature Extraction, Digital Cadastre, Historical Maps, Geographical Information Systems
License
Copyright (c) 2024 Muhammad Waqas Ahmed, Muhammad Ahmed, Asif Ahmed Shaikh
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.