Machine Learning Model for Predicting Rice Crop Yield: A Case Study in Hadejia and Auyo, Nigeria
DOI: https://doi.org/10.56919/usci.2541.024
Keywords: Gradient Boosting, FAO, MSE, RMSE, R²
Abstract
Accurate crop yield prediction is essential for addressing food security challenges, particularly in regions facing climatic variability and resource constraints. This study proposes a machine learning-based framework for rice yield prediction in Hadejia and Auyo, Jigawa State, Nigeria, that integrates soil properties, irrigation methods, water usage, fertilization practices, pest infestation data, and local weather variables. Four ensemble learning algorithms (Random Forest, Gradient Boosting, XGBoost, and LightGBM) were trained and evaluated using both a traditional 80/20 hold-out split and k-fold cross-validation to ensure robust performance assessment. Among these models, Random Forest achieved the highest predictive accuracy, with an R² of 0.9529 and an RMSE of 1.1118, demonstrating its effectiveness in capturing complex, non-linear interactions among agronomic factors. The proposed approach underscores the value of localized data and offers farmers, policymakers, and other stakeholders a scalable decision-support tool for optimizing resource allocation, mitigating risks, and enhancing overall agricultural productivity. By illustrating how comprehensive feature integration and ensemble-based machine learning can significantly improve yield forecasts, this research provides a practical roadmap for precision agriculture initiatives in Jigawa State and other regions with similar agroecological conditions.
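The evaluation protocol named in the abstract (an 80/20 hold-out split, k-fold cross-validation, and the MSE, RMSE, and R² metrics) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the seed, the choice of k = 5, and the toy yield values are all assumptions made for illustration, and the study's own data and models are not reproduced here.

```python
import math
import random


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def holdout_split(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def kfold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for each of k shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


if __name__ == "__main__":
    # Toy numbers invented for illustration only (not data from the study).
    observed = [2.0, 4.0, 6.0, 8.0]
    predicted = [2.5, 3.5, 6.5, 7.5]
    print("RMSE:", rmse(observed, predicted))
    print("R2:", r2(observed, predicted))

    train_idx, test_idx = holdout_split(100)
    print("hold-out sizes:", len(train_idx), len(test_idx))

    for fold, (tr, te) in enumerate(kfold_indices(10, k=5)):
        print("fold", fold, "train size", len(tr), "test size", len(te))
```

In this sketch each fold serves once as the test set while the remaining folds form the training set, so every sample is scored exactly once. Averaging the per-fold RMSE and R² then gives an estimate that is less sensitive to any single split than a lone 80/20 partition.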
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.