Improving Agricultural Commodity Trading through Data Imputation Methods for Price Prediction Accuracy
Authors: Pattharaporn Thongnim, Sueppong Mueanchamnong
Journal: International Journal of Information Engineering and Electronic Business @ijieeb
Published in: Issue 2, Vol. 18, 2026.
Open access
Agricultural price prediction in developing regions faces significant challenges from missing data in Internet of Things (IoT)-based environmental monitoring systems, particularly in tropical fruit cultivation where sensors frequently experience connectivity and operational failures. This study evaluates the impact of missing data imputation methods on agricultural price prediction model performance using environmental and market data from a commercial durian orchard in Chanthaburi Province, Thailand (2023-2024). Three imputation strategies—Linear Interpolation, Prophet, and Kalman Filter—were systematically compared across four machine learning algorithms (Regression Trees, Random Forest, XGBoost, and Artificial Neural Networks) using 10-fold cross-validation. The dataset comprised 182 observations with 28.02% missing environmental data and 68.13% missing price data, representing realistic constraints in developing agricultural economies. Results demonstrated that XGBoost consistently achieved superior performance across all imputation methods, with Kalman Filter combined with XGBoost showing the best testing performance (R² = 0.9767, MSE = 0.0013, MAE = 0.0287, MAPE = 1.49%). However, these results require careful interpretation given the limited sample size, high missingness, and potential temporal data leakage from random train-test splitting. Time series visualization revealed distinct characteristics: Linear Interpolation provided computational efficiency but oversimplified data complexity, Prophet captured seasonal patterns but introduced excessive noise, while Kalman Filter offered balanced performance preserving both smoothness and natural variability. Practical price prediction analysis showed substantial variations up to 35 Thai Baht per kilogram between imputation methods. 
The findings provide methodological evidence for imputation strategy selection in agricultural IoT systems with missing data, though validation with larger multi-site datasets is essential before operational deployment.
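As a rough illustration of two of the imputation strategies named above, the following sketch fills gaps in a series by linear interpolation and by a forward Kalman filter, and computes MAPE as reported in the results. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the state-space model, so a local-level (random walk plus noise) model is assumed here, and the function names, the variance parameters `process_var` and `obs_var`, and the sample values are all hypothetical.

```python
from typing import List, Optional, Sequence


def linear_interpolate(series: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps with straight lines between the nearest observed values.

    Gaps at the start or end are filled with the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    # Edge gaps: copy the first / last observed value outward.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]
    # Interior gaps: interpolate between each pair of observed neighbours.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def kalman_local_level(
    series: Sequence[Optional[float]],
    process_var: float = 1.0,  # assumed variance of the random-walk level
    obs_var: float = 1.0,      # assumed variance of the sensor noise
) -> List[float]:
    """Impute None gaps with the one-step prediction of a forward Kalman filter.

    Assumes a local-level model: the hidden level follows a random walk and
    each reading is the level plus noise. Observed values are kept unchanged.
    """
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    level, var = observed[0], obs_var  # initial state guess and uncertainty
    filled = []
    for obs in series:
        var += process_var  # predict: uncertainty grows at every time step
        if obs is None:
            filled.append(level)  # no reading: use the predicted level
        else:
            gain = var / (var + obs_var)  # Kalman gain
            level += gain * (obs - level)  # update the level estimate
            var *= 1.0 - gain              # uncertainty shrinks after update
            filled.append(obs)
    return filled


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    readings = [30.0, None, None, 33.0, 32.0, None]  # hypothetical sensor values
    print(linear_interpolate(readings))
    print(kalman_local_level(readings))
```

With a local-level model the forward filter fills a gap with a constant predicted level, whereas interpolation draws a straight line to the next reading. A smoother or a richer state model would behave differently, which is one reason the choice of imputation method can shift downstream price predictions.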
Keywords: Missing Data, Imputation, Machine Learning, Internet of Things, Kalman Filter, Prophet Imputation, Moving Average
Short URL: https://sciup.org/15020244
IDR: 15020244 | DOI: 10.5815/ijieeb.2026.02.02