Prediction of Agricultural Commodity Prices using Big Data Framework
Received: 4 October 2023 | Revised: 30 October 2023 | Accepted: 4 November 2023 | Online: 8 February 2024
Corresponding author: Humaira Rana
Abstract
The agriculture sector plays a crucial role in the economy of Pakistan, contributing significantly to the Gross Domestic Product (GDP) and the employment rate. However, the sector faces challenges such as climate change, water scarcity, and low productivity, which directly affect agricultural commodity prices. Accurate forecasting of commodity prices is essential for farmers, traders, and policymakers to make informed decisions and improve economic outcomes. This paper explores the use of a big data framework for agricultural commodity price forecasting in Pakistan, using a historical dataset of commodity prices in various Pakistani cities from 2007 to 2022, with Apache Spark used to preprocess and clean the data. Using historical spinach prices in Vehari City, three models, Auto-Regressive Integrated Moving Average (ARIMA), Random Forest, and Long Short-Term Memory (LSTM), were applied to forecast price trends, and their performance was compared using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the squared correlation coefficient (R2). LSTM outperformed ARIMA and Random Forest, with the highest R2 value of 0.8 and the lowest MAE of 125.29. Such predictions can help farmers plan crop cultivation effectively and help traders make well-informed decisions.
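The four evaluation metrics named above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `forecast_metrics` and the sample prices are hypothetical, and R2 is computed here as the squared Pearson correlation between actual and predicted values, following the abstract's wording. Some libraries (for example scikit-learn's `r2_score`) instead compute the coefficient of determination, 1 - SS_res/SS_tot, which can give a different value.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and squared Pearson correlation (R2).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # Squared Pearson correlation between actual and predicted values.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        r2 = float("nan")  # correlation is undefined for a constant series
    else:
        r2 = (cov * cov) / (var_a * var_p)

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up prices, purely illustrative (not data from the study).
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 150.0, 155.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

When comparing models, lower MAE, MSE, and RMSE and a higher R2 indicate a better fit. Because MAE, MSE, and RMSE depend on the price scale, they are only comparable across models evaluated on the same series.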
Keywords:
PySpark, agricultural commodity, price forecasting, big data analytics, Apache Spark framework
License
Copyright (c) 2023 Humaira Rana, Muhammad Umer Farooq, Abdul Karim Kazi, Mirza Adnan Baig, Muhammad Ali Akhtar
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.